Risk excess measures induced by hemimetrics
 Olivier P. Faugeras^{1} and
 Ludger Rüschendorf^{2}
https://doi.org/10.1186/s4154601800320
© The Author(s) 2018
Received: 26 September 2017
Accepted: 7 May 2018
Published: 5 June 2018
Abstract
The main aim of this paper is to introduce the notion of risk excess measure, to analyze its properties, and to describe some basic construction methods. To compare the risk excess of one distribution Q w.r.t. a given risk distribution P, we apply the concept of hemimetrics on the space of probability measures. This view of risk comparison has a natural basis in the extension of orderings and hemimetrics on the underlying space to the level of probability measures. Basic examples of this kind of extension are induced by mass transportation and by function class induced orderings. Our view towards measuring risk excess complements the usual method of comparing the risks of Q and P by the values ρ(Q), ρ(P) of a risk measure ρ. We argue that the difference ρ(Q)−ρ(P) neglects relevant aspects of the risk excess which are adequately described by the new notion of risk excess measure. We derive various concrete classes of risk excess measures and discuss corresponding ordering and measure extension properties.
Keywords
AMS Subject Classification
1 Introduction
1.1 Motivation
is the induced comparison of risks (where x_{+}= max(x,0) denotes the positive part of x).
1.2 Outline
We discuss several classes of risk excess measures D_{+}(Q,P) and consider the question of when these are given as order extensions of hemidistances d_{+} on the underlying space E. Several relevant hemidistances are induced by mass transportation and thus admit a natural interpretation. One particular extension is given by a version of the Kantorovich–Rubinstein theorem for hemidistances. The paper develops basic tools and notions for measuring the one-sided risk excess of a risk distribution Q compared to P.
The paper is organized as follows: in Section 2, we introduce the notion of hemimetrics, which are basic for obtaining a quantitative description of one-sided distance in a preordered space (E,≤). The risk excess measure D_{+}(Q,P) of Q w.r.t. P is then introduced as a one-sided hemimetric on the space of probability measures \(\mathcal {M}^{1}(E)\). The ordering ≼ on \(\mathcal {M}^{1}(E)\) is chosen to be consistent with the preorder ≤ on E and to describe positive risk excess, i.e., Q≼P if Q has no positive risk excess w.r.t. P. We discuss several examples to describe the meaning of this notion and the interplay of order and distance.
In Section 3, we study several classes of interesting risk excess comparison measures and corresponding extension properties of the preorderings on the underlying space. A general class of risk comparison measures is introduced by considering worst-case comparison over suitable classes of increasing functions. This is analogous to the worst-case representation of convex and coherent risk measures. There are several classes of examples.
In Section 4, we describe risk excess measures D_{+}(X,Y) on the space of random variables. The class of compound risk excess measures is obtained for those measures which depend only on the joint law of the random elements (X,Y). Mass transportation gives a natural way to obtain minimal extensions of compound risk excess measures to risk excess measures in the space of distributions, i.e., which depend only on the marginal laws of X and Y. Dual representations of these risk excess measures are obtained by a version of the Kantorovich–Rubinstein theorem for hemimetrics. Several examples illustrate these constructions.
In Section 5, we introduce the concept of weak risk excess measure, which is a risk excess measure without the weak identity property. Similarly to Section 4, a mass transportation formulation gives a way to obtain weak risk excess measures as the maximal extension of compound risk excess measures. We also give a dual representation of this risk excess measure and introduce several examples of weak excess risk measures constructed from mass transportation problems.
Finally, in Section 6, we consider dependence restrictions on the class of risk pairs (X,Y) and consider maximal and minimal excess risks with these restrictions. These maximal and minimal excess risks do not define risk excess measures, but give relevant and well-motivated bounds. For one- and two-sided restrictions, we obtain explicit formulas for the bounds.
2 Hemimetrics and measuring risk excess
2.1 Hemimetrics
As a motivation for measuring the risk excess of distributions, one could argue that, from the structural and phenomenological point of view, the concept of risk combines aspects of the metric structure (a risk measure evaluates some “size” or “norm” on the space of distributions) and of the order structure (there is an underlying preorder structure on the space of distributions which allows one to say when one risk is larger than another). Such a “quantitative measure of the order” is encapsulated in the notion of hemimetric, see Goubault-Larrecq (2013), Chap. 6, p. 203. (The terminology is not completely standard: the notion of hemimetric is also known as a pseudo-quasimetric in the topology literature, while Nachbin (1965), p. 61, calls it a semimetric.) We use the following definition:
Definition 1

positivity: d_{+}(x,y)≥0;

weak identity: x=y⇒d_{+}(x,y)=0;

triangle inequality: d_{+}(x,z)≤d_{+}(x,y)+d_{+}(y,z).
The main difference from the notion of a metric is the omission of the symmetry condition and the assumption of only the weak identity property. To establish a connection with a preorder ≤ on E, we introduce the notion of a one-sided hemimetric.
Definition 2

x≤y⇔d_{+}(x,y)=0.
For two comparable elements, the one-sided hemimetric of a smaller element x to a larger element y is zero.
Remark 1

If E is a set and d_{+} a hemimetric on E, one can endow E with a preorder structure by setting$$ x\le y \Leftrightarrow d_{+}(x,y)=0. $$(5)
Then, by construction of ≤, we obtain that d_{+} is a one-sided hemimetric on E.

Heminorms and hemimetrics:
When E has a vector space structure, a metric d can be induced in a natural way by a norm ρ, as d(x,y):=ρ(x−y). Similarly, a heminorm ρ_{+} on E (i.e., a subadditive, positively homogeneous, nonnegative functional \(\rho _{+}:E\to \overline {\mathbb {R}}\) satisfying the weak separation condition x=0_{ E }⇒ρ_{+}(x)=0) defines a hemimetric d_{+} by setting $$ d_{+}(x,y):=\rho_{+}(x-y). $$(6) In addition, if E has a preorder ≤ and ρ_{+} is a heminorm which has the property that $$ x\le 0_{E} \Leftrightarrow \rho_{+}(x)=0, $$(7) then d_{+} in (6) defines a one-sided hemimetric.
More generally, if (E,≤,ρ) is a lattice-ordered normed vector space, one can construct a one-sided hemimetric compatible with ≤ by setting $$ d_{+}(x,y):=\rho((x-y)\vee 0_{E}), $$ where ∨ is the least upper bound operation.
To any hemimetric d_{+} on E, one can associate its dual hemimetric d_{−}, obtained by interchanging the arguments of d_{+}, $$ d_{-}(x,y):=d_{+}(y,x). $$(8)
When d_{+} is a one-sided hemimetric associated with the order ≤ on E, d_{−} is a one-sided hemimetric associated with the corresponding dual order ≥ on E.
A hemimetric d_{+} induces a distance d by symmetrization $$d^{\infty}(x,y):=\max(d_{+}(x,y),d_{-}(x,y)),$$ or by taking a positive linear combination, say $$d^{1}(x,y):=\alpha d_{+}(x,y)+\beta d_{-}(x,y), \quad \alpha,\beta >0.$$ More generally, a hemimetric allows defining a “one-sided” topology by setting the open balls as $$ B^{+}(x,r):=\{y\in E:\ d_{+}(x,y)<r\}. $$(9)
The concept of a hemimetric is implicit in several notions encountered in analysis, probability, and statistics. For example, recall that a real-valued function f on a metric space (E,d) is upper semicontinuous at x_{0} iff $$ \forall \epsilon>0, \exists \delta>0, d(x,x_{0})\le\delta\Rightarrow d_{+}^{b}(f(x),f(x_{0}))\le \epsilon, $$ where \(d_{+}^{b}(x,y):=\rho _{+}(x-y)=\max (x-y,0)\) is the usual basic one-sided hemimetric on \((\mathbb {R},\le,|\cdot|)\) (see Example 3 and (13) below).
2.2 Risk excess measures
After the discussion of hemimetrics, we are now in a position to introduce the main object of this paper, which is a measure of the risk excess of a distribution Q w.r.t. P. To that aim, we assume that a preorder ≼ is defined on the set \(\mathcal {M}^{1}(E)\) of probability measures on a measurable space \((E,\mathcal {E})\): P≼Q describes that Q has more risk than P.
Definition 3
(Risk excess measure) A risk excess measure D_{+} is defined as a one-sided hemimetric on the preordered space \(\left (\mathcal {M}^{1}(E),{\preceq }\right)\) (or on a subset \(\mathcal {M}\subset \mathcal {M}^{1}(E)\)). D_{+}(Q,P) is called the risk excess of Q w.r.t. P.
We illustrate this concept with the following examples. A general class of risk excess measures will be presented in a systematic way in Section 3.
Example 1
(Stochastic ordering) On \( E=\mathbb {R}^{d}\), we consider the componentwise order ≤, which is closely connected with the stochastic order ≼_{ st }: for a measurable set B⊂E, define B^{ ↑ }={y∈E; ∃x∈B s.t. y≥x} and say that B is an increasing set if B=B^{ ↑ }. Denote by \(\mathcal {I}(E)\) the set of measurable increasing sets of E.
By the well-known Strassen theorem (see Strassen (1965) and, e.g., Rüschendorf (2013), Theorem 1.18, p. 22), this is equivalent to the existence of random vectors X∼Q, Y∼P s.t. X≤Y a.s.
In other words, the distribution Q is considered more safe than P if one can construct representations X of Q and Y of P s.t. all coordinates of X are lower than those of Y. Q has a positive risk excess w.r.t. P if some of the components of any representation X of Q exceed the corresponding components of any representation Y of P. Of course, this gives a very strict notion of no risk excess.
Example 2
where A^{ ε }:={x∈E:∃a∈A,d_{+}(a,x)<ε}=∪_{x∈A}B^{+}(x,ε). Then, \(D_{+}^{LP}\) is a one-sided risk excess measure and \(D_{+}^{LP}(Q,P)=0\) iff Q(A)≤P(A) for all \(A\in \mathcal {E}\).
One can replace A^{ ε } by A^{ε]}:={x∈E:∃a∈A,d_{+}(a,x)≤ε}, and the open sets by the closed sets in the definition (11), see Dudley (1968), Dudley (1976), Sect. 8, and Dudley (2002), Chap. 11.3. For the one-sidedness, if Q(A)≤P(A) for all \(A\in \mathcal {E}\), then, for every ε>0, Q(A)≤P(A)≤P(A^{ ε })+ε, since A⊂A^{ ε }. Hence, \(D_{+}^{LP}(Q,P)\le \epsilon \). Letting ε↓0 yields \(D_{+}^{LP}(Q,P)=0\). Conversely, if \(D_{+}^{LP}(Q,P)=0\), there exists a sequence ε_{ n }↓0 s.t. for all closed sets A, \(Q(A)\le P(A^{\epsilon _{n}})+\epsilon _{n}\). Since \(A^{\epsilon _{n}}\downarrow \overline {A}=A\), this yields Q(A)≤P(A) for all closed sets A. Hence, Q(A)≤P(A) also for all \(A\in \mathcal {E}\).
2.3 Examples of hemimetrics
Hemimetrics are suitable tools to measure one-sided distances. We illustrate the meaning of this notion and the interplay of order and distance via the following example, which will be used constantly throughout the paper.
Example 3

Discrete one-sided hemimetric:
Let (E,≤) be a preordered space. Then $$ d_{+}^{\le}(x, y)=\left\{\begin{array}{ll} 0&\text{if } x\le y\\ 1&\text{else} \end{array}\right. $$(12) defines a one-sided hemimetric on (E,≤), which we call the discrete one-sided hemimetric on (E,≤).

l^{ p } hemimetric:
On \(E=\mathbb {R}^{1}\), one can decompose the absolute value into its positive and negative parts |x|=x^{+}+x^{−}=ρ_{+}(x)+ρ_{+}(−x), viz., into two heminorms satisfying (7). As a consequence of (6), the metric $$|x-y|=d_{+}(x,y)+d_{+}(y,x)=d_{+}(x,y)+d_{-}(x,y)$$ is decomposed as a sum of two one-sided hemimetrics (d_{+},d_{−}) associated with the dual orders (≤,≥). The basic one-sided hemimetric $$ d_{+}^{b}(x,y):=(x-y)_{+} $$(13) describes in a quantitative way the ordering relationship ≤. Compared to the discrete hemimetric (12), it also contains information on the magnitude of the one-sided departure of two elements.
Similarly, on \((E,\le)=(\mathbb {R}^{d},\le)\) supplied with the componentwise (product) order $$\mathbf{x}\le \mathbf{y}\Leftrightarrow x_{i}\le y_{i}, \quad 1\le i\le d,$$ the l^{ p } heminorms, defined as $$\begin{array}{@{}rcl@{}} l^{p}_{+}(\mathbf{x})&:=&\left(\sum_{i=1}^{d} \left(x_{i}^{+}\right)^{p}\right)^{1/p}, \quad 1\le p<\infty, \\ l_{+}^{\infty}(\mathbf{x})&:=&\max_{1\le i\le d} \left\{x_{i}^{+}\right\} \end{array} $$(14) induce the one-sided l^{ p } hemimetrics $$d_{+}^{p}(\mathbf{x}, \mathbf{y}):=l^{p}_{+}(\mathbf{x}-\mathbf{y}), \quad 1\le p\le \infty.$$
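The hemimetrics of Example 3 are easy to experiment with numerically. The following Python sketch is an illustration only and is not part of the paper: the function names and the test points are hypothetical. It implements the discrete hemimetric (12) for the componentwise order and the one-sided l^{ p } hemimetric built from (14), and checks the triangle inequality and the one-sidedness property on a few points of \(\mathbb {R}^{2}\).

```python
import itertools

def discrete_hemimetric(x, y):
    """Discrete one-sided hemimetric (12) for the componentwise order on tuples."""
    return 0.0 if all(a <= b for a, b in zip(x, y)) else 1.0

def lp_hemimetric(x, y, p=1.0):
    """One-sided l^p hemimetric: the l^p heminorm (14) applied to x - y."""
    pos = [max(a - b, 0.0) for a, b in zip(x, y)]  # positive parts (x_i - y_i)^+
    if p == float("inf"):
        return max(pos)
    return sum(t ** p for t in pos) ** (1.0 / p)

# Hypothetical test points in R^2.
points = [(0.0, 0.0), (1.0, 2.0), (2.0, 1.0), (3.0, 3.0)]

for d in (discrete_hemimetric, lambda x, y: lp_hemimetric(x, y, 2.0)):
    for x, y, z in itertools.product(points, repeat=3):
        # triangle inequality, up to floating-point tolerance
        assert d(x, z) <= d(x, y) + d(y, z) + 1e-12
    for x, y in itertools.product(points, repeat=2):
        # one-sidedness: zero exactly when x <= y componentwise
        assert (d(x, y) == 0.0) == all(a <= b for a, b in zip(x, y))
```

The lack of symmetry is visible directly: with p=1, the hemimetric from (3,0) to (1,5) is 2, while the one from (1,5) to (3,0) is 5.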
Several of the hemimetrics have a direct interpretation and extensions as risk measures for probability distributions. We give two examples:
Example 4

τ-quantiles:
On the real line \(E=\mathbb {R}^{1}\), the heminorm $$ \rho_{\tau}(x):=\tau x^{+}+(1-\tau)x^{-}=\tau x^{+}+(1-\tau)(-x)^{+},\quad 0<\tau <1 $$(15) induces, by Remark 1 and (6), a hemimetric $$ d_{\tau}(x,y):=\rho_{\tau}(x-y). $$(16) It is well known that this hemimetric can be used to define τ-quantiles q_{ τ }(Y) (viz., the Value at Risk) of a random variable Y as a minimizer of E[ρ_{ τ }(Y−y)], i.e., $$\begin{array}{@{}rcl@{}} q_{\tau}(Y)&:=&F^{-1}_{Y}(\tau)=\arg\inf_{y} E\left[\rho_{\tau}(Y-y)\right] \end{array} $$(17) $$\begin{array}{@{}rcl@{}} &=&\arg\inf_{y} E[d_{\tau}(Y,y)]={VaR}_{\tau}(Y), \end{array} $$(18) see Koenker (2005), p. 5. Note, however, that the order induced by d_{ τ } reduces to the trivial order =, as d_{ τ }(x,y)=0 iff x=y.

Halfspace depth, departure in direction u:
A multivariate generalization of the preceding example can be defined as follows. On \(E=\mathbb {R}^{d}\), we define for any unit vector u an ordering (the length in the direction u), by $$ \mathbf{x}\le_{\mathbf{u}} \mathbf{y} \Leftrightarrow \mathbf{u}^{T} (\mathbf{y}-\mathbf{x})\ge 0, $$(19) where x^{ T } denotes the transpose of x. With this ordering, $$ d_{+}^{\mathbf{u}}(\mathbf{x},\mathbf{y})=\left\{\begin{array}{ll} 1&\text{if } \mathbf{u}^{T} (\mathbf{y}-\mathbf{x})> 0\\ 0&\text{else} \end{array}\right. $$(20) defines, as in (12), a one-sided hemimetric. It is one if the length of y in direction u is greater than that of x, and is zero otherwise.
This one-sided hemimetric has, as basic application, the definition of the halfspace depth function, which describes the degree of outlyingness of a point \(\mathbf {x}\in \mathbb {R}^{d}\) w.r.t. a probability measure P on \(\mathbb {R}^{d}\). It is defined as $$\begin{array}{@{}rcl@{}} D_{+}(x,P)&:=&\inf_{\mathbf{u}\in S_{d-1}}\int d_{+}^{\mathbf{u}}(x,y)dP(y)\\ &=&\inf_{\mathbf{u}\in S_{d-1}}\int 1_{\{\mathbf{u}^{T}(\mathbf{y}-\mathbf{x})> 0\}}dP(y), \end{array} $$(21) where S_{d−1} is the unit sphere of \(\mathbb {R}^{d}\). Several modifications of this definition are useful to describe a one-sided degree of outlyingness (or risk) or quantitative versions of it. Two relevant examples are $$ D_{+}^{1}(x,P):=\inf_{\mathbf{u}\in S^{+}_{d-1}}\int1_{\left\{\mathbf{u}^{T}(\mathbf{y}-\mathbf{x})> 0\right\}}dP(y), $$(22) or $$ D_{+}^{2}(x,P):=\inf_{\mathbf{u}\in S^{+}_{d-1}}\int\left(\mathbf{u}^{T}(\mathbf{y}-\mathbf{x})\right)^{+}dP(y), $$ where \(S_{d-1}^{+}=S_{d-1}\cap \mathbb {R}^{d,+}\) is the part of the unit sphere in the positive cone x≥0. We mention that a very general approach to multivariate quantiles can be found in Faugeras and Rüschendorf (2017).
Finally, we briefly mention some examples of one-sided hemimetrics which may appear in related contexts.
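For the τ-quantile item of Example 4, the characterization (17) of a quantile as a minimizer of E[ρ_{ τ }(Y−y)] can be tried out on an empirical distribution. The Python sketch below is an illustration under stated assumptions, not the authors' procedure: the function names and the sample are hypothetical, and the search is restricted to the sample points, which is sufficient because the empirical objective is convex and piecewise linear with kinks only at the observations.

```python
def check_loss(x, tau):
    """Heminorm rho_tau(x) = tau * x^+ + (1 - tau) * (-x)^+ from (15)."""
    return tau * max(x, 0.0) + (1.0 - tau) * max(-x, 0.0)

def empirical_quantile(sample, tau):
    """Minimise the empirical mean of rho_tau(Y - y) over y in the sample.

    If several sample points attain the minimum, the smallest one is returned.
    """
    def risk(y):
        return sum(check_loss(v - y, tau) for v in sample) / len(sample)
    return min(sorted(sample), key=risk)

sample = [5.0, 1.0, 4.0, 2.0, 3.0]        # hypothetical observations of Y
median = empirical_quantile(sample, 0.5)  # minimiser for tau = 0.5
upper = empirical_quantile(sample, 0.9)   # minimiser for tau = 0.9
```

On this sample, the minimizer for τ=0.5 is the sample median 3, and for τ=0.9 it is the largest observation 5, in agreement with the generalized inverse of the empirical distribution function.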
Example 5

Schur order ≤_{ S } on \(\mathbb {R}^{d}\):
The majorization, or Schur order ≤_{ S }, is useful to compare vectors \(\mathbf {x,y}\in \mathbb {R}^{d}\) with identical sums w.r.t. their degree of dispersion, see e.g., Marshall et al. (2011). In a natural way, this ordering extends to an ordering on \(\mathcal {M}^{1}(\mathbb {R}^{d})\), comparing the relative degree of dispersion of two measures. Let \(\mathbf {x, y}\in \mathbb {R}^{d}\) and let Γ(d) be the set of permutations of {1,…,d}. The Schur ordering x≤_{ S }y on \(\mathbb {R}^{d}\) is defined by $$\begin{array}{@{}rcl@{}} \sum_{k=l}^{d} x_{\gamma(k)}&\le& \sum_{k=l}^{d} y_{\beta(k)}, \quad l=2,\ldots, d,\\ \sum_{k=1}^{d} x_{\gamma(k)}&=& \sum_{k=1}^{d} y_{\beta(k)} \end{array} $$(23) where γ,β∈Γ(d) are the decreasing rearrangements of x and y: $$ x_{\gamma(1)}\ge x_{\gamma(2)}\ge \ldots\ge x_{\gamma(d)}, \quad y_{\beta(1)}\ge y_{\beta(2)}\ge \ldots\ge y_{\beta(d)}. $$ ≤_{ S } is a preorder: x≤_{ S }y and y≤_{ S }x only imply that the components of each vector are equal, but not necessarily in the same order. Geometrically, x≤_{ S }y if and only if x is in the convex hull of all vectors obtained by permuting the coordinates of y. When x,y stand for a pair of discrete probability measures on the same set of d points, the norming condition in (23) is satisfied, as the sum is normalized to one. Say that x and y are Schur-comparable if \( \sum _{i=1}^{d} x_{i}=\sum _{i=1}^{d} y_{i}\). The degree of dispersion is measured by the following one-sided hemimetric: for Schur-comparable elements x,y, define $$d_{+}(\mathbf{x}, \mathbf{y}):=\sup_{l=2,\ldots, d}\left(\sum_{k=l}^{d} [x_{\gamma(k)}-y_{\beta(k)}]\right)_{+}.$$ One has, for Schur-comparable elements: $$\mathbf{x}\le_{S} \mathbf{y} \text{ iff } d_{+}(\mathbf{x},\mathbf{y})=0.$$ Specialized to discrete probability measures, this gives a one-sided hemimetric measuring the degree of dispersion or “variance”.
One-sided Hausdorff hemimetric on closed subsets:
Let (E,d) be a metric space. Set $$ d_{+}(A,B):=\sup_{y\in A}\,\inf_{x\in B}\ d(x,y). $$(24) Then, for closed sets A,B, it holds that d_{+}(A,B)=0⇔A⊂B, and d_{+} is a one-sided hemimetric on \((\mathcal {C}(E),\subset)\), where \(\mathcal {C}(E)\) denotes the set of closed subsets of E.
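For finite (hence closed) nonempty subsets of the real line, the one-sided Hausdorff hemimetric (24) reduces to a maximum of minima and can be computed directly. The following Python sketch is illustrative only; the function name, the default metric and the sets are hypothetical choices.

```python
def one_sided_hausdorff(A, B, d=lambda x, y: abs(x - y)):
    """d_+(A, B) = sup over y in A of inf over x in B of d(x, y), as in (24).

    A and B are assumed to be finite and nonempty.
    """
    return max(min(d(x, y) for x in B) for y in A)

A = [0.0, 1.0]
B = [0.0, 1.0, 4.0]
print(one_sided_hausdorff(A, B))  # 0.0, since A is a subset of B
print(one_sided_hausdorff(B, A))  # 3.0, the distance from the point 4.0 to A
```

The two calls illustrate the equivalence d_{+}(A,B)=0⇔A⊂B: the value vanishes in the direction of the inclusion and is positive in the other direction.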
3 Risk excess measures induced by function classes
3.1 Motivation and definition
Then, \(D^{\mathcal {F}}_{+}\) is a risk excess measure on \(\left (\mathcal {M}^{\mathcal {F}},\preceq _{\mathcal {F}}\right)\).
Definition 4
(\(\mathcal {F}\)-induced risk excess measure) The risk excess measure \(D^{\mathcal {F}}_{+}\) on \(\left (\mathcal {M}^{\mathcal {F}},\preceq _{\mathcal {F}}\right)\) defined in (26) is called the \(\mathcal {F}\)-induced risk excess measure.
Example 6
Example 1 can be regarded as an \(\mathcal {F}\)-induced risk excess measure, by considering \(\mathcal {F}=\{1_{B}:B\in \mathcal {I}(E)\}\).
Remark 2
3.2 Extension and restrictions of orders and hemimetrics
For risk excess measures, an important aspect is to have a kind of consistency w.r.t. some ordering ≤ on E, i.e., \(\mathcal {F}\) consists of increasing functions w.r.t. ≤. In this respect, the following order extension properties are useful.
Proposition 1

If ≼ is a preorder on \(\mathcal {M}^{1}(E)\), then, the relation ≤_{ r }, defined, for x,y∈E, by$$ x\le_{r} y \Leftrightarrow \delta_{x}\preceq\delta_{y}, $$(28)
defines a preorder on E. ≤_{ r } is called the restriction of the preorder ≼ on \(\mathcal {M}^{1}(E)\).

Conversely, if ≤ is a preorder on E, then the stochastic order ≼_{ st } defines a partial order on \(\mathcal {M}^{1}(E)\), such that its restriction ≤_{ r } is identical to ≤.
Proof

The proof follows by direct verification.

By definition, we have$$\begin{array}{@{}rcl@{}} x\le_{r} y &\Leftrightarrow&\delta_{x}\preceq_{st}\delta_{y} \Leftrightarrow 1_{B}(x)\le 1_{B}(y), \forall B\in\mathcal{I}(E)\\ &\Leftrightarrow& [x\in B\Rightarrow y\in B, \forall B\in\mathcal{I}(E)]. \end{array} $$(29)
Remark 3
For a closed partial order ≤ on a Polish space E, the result follows directly from Strassen theorem (see Example 1).
Proposition 2

If D_{+} is a risk excess measure on \(\left (\mathcal {M}^{1}(E),\preceq \right)\), then $$d_{+}^{r}(x,y):=D_{+}(\delta_{x},\delta_{y}) $$ defines a one-sided hemimetric on (E,≤_{ r }), called the restriction of D_{+} on E.

If \(d_{+}^{\le }\) is the discrete hemimetric on (E,≤) of (12), then \(D_{+}^{st}\) is an extension of \(d_{+}^{\le }\) into a risk excess measure on \(\left (\mathcal {M}^{1}(E),\preceq _{st}\right)\) such that the restriction \(d_{+}^{r}\) of \(D_{+}^{st}\) is equal to \(d_{+}^{\le }\).
Proof

The proof follows by direct verification and Proposition 1.

The restriction of \(D_{+}^{st}\) to E reads $$ d_{+}^{r}(x,y):=D_{+}^{st}(\delta_{x},\delta_{y})=\sup\{\left(1_{B}(x)-1_{B}(y)\right)_{+}; B\in\mathcal{I}(E)\}, $$ which is {0,1}-valued and a one-sided hemimetric on E by Proposition 2 part 1. By Proposition 1 part 2, $$ d_{+}^{r}(x,y)=0\Leftrightarrow x\le_{r} y \Leftrightarrow x\le y. $$ Therefore, \(d_{+}^{r}(x,y)=1_{x\nleq y}=d^{\le }_{+}(x,y).\)
Remark 4
The construction of the previous proposition, based on the \(D_{+}^{st}\) of Example 1, which encodes the order ≤ into ≼_{ st }, is consistent w.r.t. the order ≤, in the sense that the restriction of \(D_{+}^{st}\) is the discrete one-sided hemimetric \(d_{+}^{r}=d^{\le }_{+}\), which encodes the original order ≤. However, for a one-sided hemimetric d_{+} on (E,≤) different from the discrete one, the extension \(D_{+}^{st}\) is in general inconsistent w.r.t. the hemimetric d_{+}, in the sense that the restriction of the risk excess measure \(D_{+}^{st}\) is not the original d_{+} but is again the discrete one-sided hemimetric \(d_{+}^{\le }\). This is illustrated in the following diagram:
The question of consistently extending/restricting a one-sided hemimetric d_{+} into a risk excess measure D_{+}, according to the diagram,
will be treated by mass transportation in Section 4.
It is interesting to observe that, in general, there may exist many extensions of a one-sided hemimetric on E to a risk excess measure on \(\mathcal {M}^{1}(E)\), as seen in the following example. We will discuss some general extensions in Section 4.
Example 7
As a consequence, both risk excess measures \(D_{+}^{{uo}}\) and \(D_{+}^{{st}}\) of Example 1 induce the same componentwise ordering ≤ on \(E=\mathbb {R}^{d}\) and also induce the same restriction as hemimetric on E. \(D_{+}^{uo}\) and \(D_{+}^{st}\) are both extensions of the same discrete one-sided hemimetric \(d_{+}^{\le }\) on E from Example 3 (a), as is illustrated in the diagram below:
Example 8
4 Risk excess measures for random variables and minimal extension by mass transportation
4.1 Compound risk excess measures
So far we have considered risk excess measures as onesided hemimetrics on the space of probability distributions, i.e., as a mapping \(D_{+}:\mathcal {M}\times \mathcal {M}\mapsto [0,\infty ]\), for \(\mathcal {M}\subset \mathcal {M}^{1}(E)\), acting on a pair (Q,P) of probability measures on E. Like for risk measures \(\rho :\mathfrak {X}\mapsto \mathbb {R}\) defined on a space of random variables \(\mathfrak {X}\subset \mathfrak {L}^{0}_{E}=\mathfrak {L}^{0}_{E}(\Omega,\mathcal {A},\mu):=\{X:\Omega \to E\}\) (see e.g., Föllmer and Schied (2002)), it is natural to define risk excess measures \(D_{+}:\mathfrak {X}\times \mathfrak {X}\mapsto \mathbb {R}\), also on a space \(\mathfrak {X}\) of random variables.
This allows one to consider the risk of a random element X∈E as a relative property: there is a joint modeling of the vector \((X,Y)\in \mathfrak {X}^{2}\), defined on a common probability space \((\Omega,\mathcal { A}, \mu)\), so that the risk of X:Ω↦E can be considered in relation to the random element Y:Ω↦E, regarded as a benchmark. In the context of insurance and financial mathematics, Y can stand for the value of an alternative portfolio, of a hedge, of a market indicator, or the wealth of an insurer. For example, an insurer, facing the prospect of losing a claim amount X, may wish to evaluate its perceived risk with respect to its reserve capital Y: the “risk” X does not have the same potential consequences whether Y is small or large compared to X. In the same vein, because of the fluctuating and (usually) inflating nature of fiat money in the post-1973, petrodollar-based, current monetary system, one may be interested in evaluating the value of a financial asset X w.r.t. the price of a commodity Y considered as a standard, like gold or oil, whose supply is limited in essence.
For \(\mathfrak {X}\subset \mathfrak {L}^{0}_{E}=\mathfrak {L}^{0}_{E}(\Omega,\mathcal {A},\mu)\) a set of random variables on \((\Omega,\mathcal {A},\mu)\) with values in (E,≤), we consider the pointwise ordering on \(\mathfrak {X}\) induced by ≤. We identify random elements in \(\mathfrak {L}^{0}_{E}\) which are identical a.s.; similarly, X≤Y means that X≤Y μ-a.s.
Definition 5
(Risk excess measure on \(\mathfrak {X}\)) For \(\mathfrak {X}\subset \mathfrak {L}^{0}_{E}\), a risk excess measure D_{+} on \(\mathfrak {X}\) is a one-sided hemimetric on \(\mathfrak {X}\).
Definition 6
(Compound risk excess measure on \(\mathfrak {X}\)) A risk excess measure \(D_{+}^{c}\) on \(\mathfrak {X}\) is called a compound risk excess measure on \(\mathfrak {X}\) if \(D_{+}^{c}(X,Y)\) depends only on the joint distribution μ^{(X,Y)} of (X,Y).
Example 9

An example of a risk excess measure on \(\mathfrak {X}\) which is not compound is $$ D_{+}(X,Y):=\sup_{\omega\in \Omega}(X(\omega)-Y(\omega))_{+}. $$ However, since random elements in \(\mathfrak {L}^{0}_{E}\) which are identical μ-a.s. are identified, it is natural to consider only compound risk excess measures, e.g., the essential supremum version $$ D_{+}(X,Y):=\text{esssup}_{\mu}(X-Y)_{+} $$ instead.

On \((\Omega,\mathcal {A},\mu)\), let \(A_{0}\in \mathcal {A}\), with 0<μ(A_{0})<1, be a class of scenarios considered as “low risk”, while its complement A_{1}:=Ω∖A_{0} is considered as “high risk”. Then, for some safety coefficient α>1, $$ D_{+}(X,Y):=\text{esssup}_{\mu,A_{0}}(X-Y)_{+}+{\alpha}\, \text{esssup}_{\mu,A_{1}}(X-Y)_{+}, $$ with \(\text {esssup}_{\mu,A}(X-Y)_{+}:=\inf \{c\in \mathbb {R};\mu (\{(X-Y)_{+}\ge c\}\cap A)=0\}\), or $$ D_{+}(X,Y):=\int_{A_{0}}(X-Y)_{+}d\mu+\alpha\int_{A_{1}}(X-Y)_{+}d\mu, $$ define noncompound risk excess measures, which value the risk excess (X−Y)_{+} α times more for the high-risk scenarios than for the low-risk ones.
Remark 5

The notation \(D_{+}^{c}\) in Definition 6 stresses that \(D_{+}^{c}\) depends on the joint distribution μ^{(X,Y)} and not solely on the marginals μ^{ X },μ^{ Y } of (X,Y), as is the case in Definition 3. See also Zolotarev (1997) and Rachev (1991) for the similar notion of compound probability metric. For risk measures ρ(X) on \(\mathfrak X\), there is the analogous notion of law-invariant risk measures, which depend only on the law μ^{ X } of the random variable.

There are two main reasons why compound risk excess measures on \(\mathfrak {X}\) are of particular importance. Firstly, they allow one to define extensions as risk excess measures \(D_{+}:\mathcal {M}\times \mathcal {M}\to [0,\infty ]\) on subclasses \(\mathcal {M}\subset \mathcal {M}^{1}(E)\) defined by the induced set of distributions of elements of \(\mathfrak {X}\) (see Section 4.3). Secondly, the fact that they depend only on the joint distribution μ^{(X,Y)} makes it possible to estimate the risk excess D_{+}(X,Y) statistically by its empirical analog. This property is most relevant for the application of risk excess measures.

Like in the case of probability metrics, it is also possible to describe compound risk excess measures formally on the subclass \(\mathcal {M}^{(2)}\) of bivariate laws μ^{(X,Y)} for \(X,Y\in \mathfrak {X}\). For details in the case of probability metrics, see Rachev (1991).
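The statistical estimation mentioned in Remark 5 can be sketched in Python for a compound measure of the form \(D_{+}^{c}(X,Y)=E\,d_{+}(X,Y)\), taking for d_{+} the basic one-sided hemimetric (13). This is a minimal illustration under these assumptions, not an estimator proposed in the paper: the function name and the data are hypothetical. The second call re-pairs the same observations to show that the value depends on the joint law and not only on the marginals.

```python
def empirical_risk_excess(xs, ys, d_plus=lambda x, y: max(x - y, 0.0)):
    """Empirical analog of E[d_+(X, Y)] from paired observations (x_i, y_i)."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("need two non-empty samples of equal length")
    return sum(d_plus(x, y) for x, y in zip(xs, ys)) / len(xs)

xs = [2.0, 7.0, 1.0, 9.0]            # hypothetical observations of X
ys = [1.0, 8.0, 0.0, 10.0]           # hypothetical paired observations of Y
ys_repaired = [10.0, 0.0, 8.0, 1.0]  # same values of Y, paired differently

excess = empirical_risk_excess(xs, ys)                    # (1 + 0 + 1 + 0) / 4
excess_repaired = empirical_risk_excess(xs, ys_repaired)  # (0 + 7 + 0 + 8) / 4
```

Both samples of Y have the same empirical marginal distribution, yet the estimated risk excess changes from 0.5 to 3.75, which is exactly the dependence on μ^{(X,Y)} that distinguishes compound measures.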
4.2 Construction of a compound risk excess measure from a one-sided hemimetric d_{+} on E
Note that (31) depends only on the joint distribution of (X,Y): it is indeed a compound risk excess measure defined on a space \(\mathfrak {X}\) of random variables.
Indeed, one has:
Proposition 3
For any measurable one-sided hemimetric d_{+} on (E,≤), (31) defines a finite one-sided compound risk excess measure on \(\mathfrak {X}\).
Proof
Remark 6
4.3 Minimal extension of a compound risk excess measure
A compound risk excess measure \(D_{+}^{c}\), depending on the joint distribution μ^{(X,Y)}, can be turned by mass transportation into a risk excess measure on \(\mathcal {M}^{1}(E)\), i.e., depending only on the pair of marginals μ^{ X },μ^{ Y }, where \(\mathcal {M}^{1}(E)\) is supplied with the stochastic ordering ≼_{ st } consistent with the underlying order ≤ on \(\mathfrak {X}\).
Definition 7
The fact that \(D_{+}^{inf}\) is indeed a one-sided risk excess measure on the space of probability measures is given in the following proposition:
Proposition 4

If (E,≤) is a Polish space with a closed partial order, and if \(D_{+}^{c}\) is weakly lower semicontinuous, in the sense that $$ (X_{n},Y_{n})\stackrel{d}{\to}(X,Y)\Rightarrow D_{+}^{c}(X,Y)\le \liminf D_{+}^{c}(X_{n},Y_{n}), $$(33)
then \(D^{inf}_{+}\) is a one-sided risk excess measure on \((\mathcal {M}^{1}(E),\preceq _{st})\), where ≼_{ st } is the stochastic order.

If \(D_{+}^{c}(X,Y)=E\,d_{+}(X,Y)\), as in (31), for d_{+} a lower semicontinuous one-sided hemimetric on (E,≤), then \(D^{inf}_{+}\) is a one-sided risk excess measure on \(\left (\mathcal {M}^{1}(E),\preceq _{st}\right)\).
Proof

(A1) is obvious. (A2) follows from the fact that \(D_{+}^{c}\) satisfies (A2): for X∼Q, \(0 \le D^{inf}_{+}(Q,Q)\le D^{c}_{+}(X,X)=0\). Regarding (A3): for \((\Omega,\mathcal {A},\mu)\) a nonatomic probability space and E a Polish space, any bivariate measure \(\alpha \in \mathcal {M}^{1}(E^{2})\) can be obtained as the image measure of μ by some measurable mapping, see e.g., Berkes and Philipp (1979). Therefore, for all ε>0, there exist random variables (X,Y_{1})∼α=α_{ QP }, where \(\alpha \in \mathcal {M}^{1}\left (E^{2}\right)\) has marginals Q,P, and there exist random variables (Y_{2},Z)∼β=β_{ PR } with marginals P,R s.t. $$ D_{+}^{inf}(Q,P)+\frac{\epsilon}{2}\ge D_{+}^{c}(X,Y_{1}),\quad \text{and }D_{+}^{inf}(P,R)+\frac{\epsilon}{2}\ge D_{+}^{c}(Y_{2},Z). $$ By the gluing lemma, see e.g., Villani (2003), p. 208, there exists a trivariate measure γ=γ_{ QPR } s.t. its projection on the first two marginals is α and its projection on the last two marginals is β. In addition, γ can be obtained as the image measure of μ for some measurable mapping. In other words, there exists a joint construction of a random vector \((\tilde X,\tilde Y,\tilde Z)\) on the probability space \((\Omega,\mathcal {A},\mu)\) s.t. \(\mu ^{\tilde X,\tilde Y,\tilde Z}=\gamma \) and $$ D_{+}^{inf}(Q,P)+\frac{\epsilon}{2}\ge D_{+}^{c}\left(\mu^{\tilde X,\tilde Y}\right),\quad \text{and }D_{+}^{inf}(P,R)+\frac{\epsilon}{2}\ge D_{+}^{c}\left(\mu^{\tilde Y,\tilde Z}\right). $$(34) By (A3) for the compound risk excess \(D_{+}^{c}\), $$ D_{+}^{c}\left(\mu^{\tilde X\tilde Z}\right)\le D_{+}^{c}\left(\mu^{\tilde X\tilde Y}\right)+D_{+}^{c}\left(\mu^{\tilde Y\tilde Z}\right) $$ which gives with (34), $$ D_{+}^{inf}(Q,R)\le D_{+}^{c}\left(\mu^{\tilde X\tilde Z}\right)\le D_{+}^{inf}(Q,P)+D_{+}^{inf}(P,R)+ \epsilon. $$ Letting ε↓0 gives (A3) for \(D_{+}^{inf}\).
For the one-sidedness property (A4), if \(D^{inf}_{+}(Q,P)=0\), then there exists a sequence (X_{ n },Y_{ n }) of random variables on \((\Omega,\mathcal {A}, \mu)\), all with fixed marginals Q,P, s.t. \(D_{+}^{c}(X_{n}, Y_{n})\to 0\). Since \(\mathcal {M}^{1}(Q,P)\), the set of probability measures on E×E with marginals Q,P, is weakly compact in \(\mathcal {M}^{1}\left (E^{2}\right)\), one can extract a subsequence n^{′} s.t. \((X_{n'},Y_{n'})\stackrel {d}{\to }(X,Y)\) for some (X,Y) with marginals Q,P. By the assumption on \(D_{+}^{c}\), $$ D_{+}^{c}(X,Y) \le \liminf D_{+}^{c}(X_{n},Y_{n})=0 $$ which entails X≤Y, μ-a.s. by (A4’). The latter is equivalent to Q≼_{ st }P by Strassen's theorem (see Theorem 1.18 in Rüschendorf (2013)). The converse is obvious.

If \((X_{n},Y_{n})\stackrel {d}{\to }(X,Y)\), then, by Skorohod’s representation theorem, there exist \((\tilde X_{n},\tilde Y_{n})\stackrel {a.s.}{\to }(\tilde X,\tilde Y)\), with \((\tilde X_{n},\tilde Y_{n})\stackrel {d}{=}(X_{n}, Y_{n})\), \((\tilde X,\tilde Y)\stackrel {d}{=}(X, Y)\). Therefore, lower semicontinuity of d_{+} and Fatou’s lemma entail $$\begin{array}{@{}rcl@{}} D_{+}^{c}(X, Y)&=&E\,d_{+}(\tilde X,\tilde Y)\le E[\liminf d_{+}(\tilde X_{n},\tilde Y_{n})]\\ &\le& \liminf E\,d_{+}(\tilde X_{n},\tilde Y_{n}) =\liminf D_{+}^{c}(X_{n},Y_{n}), \end{array} $$
i.e., (33) is satisfied.
4.4 Dual representations of minimal extensions
For a compound excess risk measure \(D_{+}^{c}\) of the kind in (31), the minimal extension \(D_{+}^{inf}\) on \(\mathcal {M}^{1}(E)\) of \(D_{+}^{c}\) by mass transportation, as in (32), admits a representation as an \(\mathcal {F}\)-induced risk excess measure, as in (26), which is given by the following Kantorovich–Rubinstein-type theorem for hemimetrics:
Theorem 1
In other words, \(D^{inf}_{+}\) is identical to an \(\mathcal {F}\)-induced risk excess measure \(D_{+}^{\mathcal {F}}\) of (26), with \(\mathcal {F}=Lip^{1}_{b}\), the class of bounded Lipschitz functions w.r.t. d_{+}.
Proof

Step one: One has the easy inequality,$$ D_{+}^{Lip^{1}\cap L^{1}}(Q,P)\le D_{+}^{inf}(Q,P). $$(37)Indeed, for all f∈Lip^{1}(d_{+})∩L^{1} and \(\pi \in \mathcal {M}(Q,P)\),$$\begin{array}{@{}rcl@{}} \left(\int f(x)Q(dx)-\int f(y)P(dy)\right)_{+}&=&\left(\int (f(x)-f(y))\pi(dx,dy)\right)_{+}\\ &\le& \int d_{+}(x,y) \pi(dx,dy). \end{array} $$
Taking the inf on the right and the sup on the left entails the stated inequality (37).
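The inequality of Step one can be checked numerically on a toy example. The following Python sketch is only illustrative: it assumes \(E=\mathbb{R}\) with the basic hemimetric d_{+}(x,y)=(x−y)_{+}, stores discrete laws and couplings as dictionaries, and all helper names (`d_plus`, `is_lip1`, `dual_value`, `transport_cost`) as well as the example data are hypothetical choices, not part of the paper.

```python
from itertools import product


def d_plus(x, y):
    """Basic hemimetric on the real line: positive part of x - y."""
    return max(x - y, 0.0)


def is_lip1(f, points):
    """Check f(x) - f(y) <= d_plus(x, y) for every pair of a finite grid."""
    return all(f(x) - f(y) <= d_plus(x, y) + 1e-12
               for x, y in product(points, repeat=2))


def dual_value(f, Q, P):
    """(int f dQ - int f dP)_+ for discrete laws given as {atom: mass}."""
    diff = (sum(f(x) * q for x, q in Q.items())
            - sum(f(y) * p for y, p in P.items()))
    return max(diff, 0.0)


def transport_cost(pi):
    """int d_plus d(pi) for a coupling given as {(x, y): mass}."""
    return sum(d_plus(x, y) * m for (x, y), m in pi.items())


# Illustrative data (an assumption of this sketch).
Q = {1.0: 0.5, 3.0: 0.5}
P = {0.0: 0.5, 2.0: 0.5}
GRID = [0.0, 1.0, 2.0, 3.0]

# Two admissible couplings of (Q, P): independent and comonotone.
PI_INDEP = {(x, y): q * p for x, q in Q.items() for y, p in P.items()}
PI_COMON = {(1.0, 0.0): 0.5, (3.0, 2.0): 0.5}


def f_cap(x):
    """An increasing function that is 1-Lipschitz w.r.t. d_plus."""
    return min(x, 2.5)
```

In this example the identity function gives a dual value equal to the cost of the comonotone coupling, so the bound (37) is attained here; a decreasing function such as x↦−x fails the d_{+}-Lipschitz test and is therefore not admissible.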

Step two: Kantorovich’s duality, \(D^{inf}_{+}(Q,P)= S(Q,P)=\sup _{\Phi _{d_{+}}}J(f,g)\).
Since d_{+}≥0 is l.s.c., this follows from Rachev and Rüschendorf (1998) in Theorem 2.3.1 (b) or Villani (2003) in Theorem 1.3.

Step three: in view of the first two steps, it remains to show that$$D_{+}^{Lip^{1}\cap L_{1}(Q)}(Q,P)\ge D_{+}^{inf}(Q,P),\vspace*{2pt}$$i.e., that$$\sup_{f\in Lip^{1}\cap L_{1}(Q)} \left(\int fd(Q-P)\right)_{+}\ge \sup_{\Phi_{d_{+}}}J(f,g).$$
Assume that d_{+} is bounded.
For f continuous bounded, define the d_{+}-convex conjugate of f by$$ f^{*}(y):=\inf_{x\in E}\{d_{+}(x,y)-f(x)\}. $$One obviously has f(x)+f^{∗}(y)≤d_{+}(x,y), for all x,y∈E. Therefore, if x↦d_{+}(x,y) is bounded l.s.c. and f∈C_{ b }, then f^{∗} is well defined and bounded. Moreover, by the triangle inequality, one also has$$d_{+}(x,y)-f(x)\le d_{+}(x,y')+d_{+}(y',y)-f(x).$$Taking the infimum on x on both sides yields$$ f^{*}(y)-f^{*}(y')\le d_{+}(y',y)=d_{-}(y,y'), $$where d_{−} is the opposite dual hemimetric defined in (8): f^{∗} is d_{−}-Lipschitz. Note that if f(x)+g(y)≤d_{+}(x,y) for all x,y, then f^{∗}(y)≥g(y).
Define the double conjugate by$$\begin{array}{@{}rcl@{}} f^{**}(x)&:=&\inf_{y\in E}\{d_{+}(x,y)-f^{*}(y)\}. \end{array} $$One has f^{∗∗}(x)≥f(x): by definition,$$\begin{array}{@{}rcl@{}} f^{**}(x)&=&\inf_{y\in E}\sup_{x'}\left\{d_{+}(x,y)-d_{+}(x',y)+f(x')\right\}\\ &\ge& f(x), \end{array} $$by taking x^{′}=x in the last equation.
Moreover, f^{∗∗} is this time d_{+}-Lipschitz: the triangle inequality d_{+}(x,y)−f^{∗}(y)≤d_{+}(x,x^{′})+d_{+}(x^{′},y)−f^{∗}(y) yields, by taking the infimum on y, f^{∗∗}(x)−f^{∗∗}(x^{′})≤d_{+}(x,x^{′}).
We obtain: \(f^{**}(x)=\inf_{y}\{d_{+}(x,y)-f^{*}(y)\}\le -f^{*}(x)\) by taking y=x. On the other hand, since f^{∗} is 1-Lipschitz w.r.t. d_{−}, one has$$-f^{*}(x)\le d_{+}(x,y)-f^{*}(y),$$which yields −f^{∗}(x)≤f^{∗∗}(x). Hence, f^{∗∗}=−f^{∗}. Denoting ϕ:=−f^{∗}, and since f^{∗} is d_{−}-Lipschitz, ϕ is d_{+}-Lipschitz (and bounded, thus integrable). In view of all of the above, \((f,g)\in \Phi _{d_{+}}\cap \mathcal {C}_{b}^{2}\) implies \((f^{**},f^{*})\in \Phi _{d_{+}}\) and J(f,g)≤J(f^{∗∗},f^{∗})=J(ϕ,−ϕ). Hence,$$ \sup_{\Phi_{d_{+}}\cap \mathcal{C}_{b}^{2}} J(f,g)\le \sup_{\phi\in Lip^{1}\cap L_{1}(Q)} J(\phi,-\phi)\le \sup_{\phi\in Lip^{1}\cap L_{1}(Q)} \left(\int \phi d(Q-P)\right)_{+}, $$(38)which had to be proved.
Combining (37) with (38) yields the desired result for the case of a bounded hemimetric d_{+}.
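The conjugation identities used in Step three can be reproduced on a finite grid, where the infima become minima. The sketch below is a hedged illustration, not the authors' construction: it assumes \(E\) is a finite subset of \(\mathbb{R}\) with d_{+}(x,y)=(x−y)_{+}, represents functions as dictionaries, and uses hypothetical names (`conjugate`, `double_conjugate`) and an arbitrary test function.

```python
import math


def d_plus(x, y):
    """Basic hemimetric on the real line: positive part of x - y."""
    return max(x - y, 0.0)


def conjugate(f, grid):
    """f*(y) = min_x { d_plus(x, y) - f(x) } on a finite grid (f is a dict)."""
    return {y: min(d_plus(x, y) - f[x] for x in grid) for y in grid}


def double_conjugate(f, grid):
    """f**(x) = min_y { d_plus(x, y) - f*(y) } on the same grid."""
    f_star = conjugate(f, grid)
    return {x: min(d_plus(x, y) - f_star[y] for y in grid) for x in grid}


# Illustrative data (an assumption of this sketch): a bounded, non-monotone f.
GRID = [0.0, 0.5, 1.0, 1.5, 2.0]
F = {x: math.sin(3.0 * x) for x in GRID}
F_STAR = conjugate(F, GRID)
F_2STAR = double_conjugate(F, GRID)
```

On this grid one can check that f(x)+f^{∗}(y)≤d_{+}(x,y), that f^{∗∗}≥f, that f^{∗∗}=−f^{∗}, and that f^{∗∗} is d_{+}-Lipschitz, which are exactly the properties used to pass from pairs (f,g) to pairs (ϕ,−ϕ).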

Step 4: One can remove the assumption that d_{+} is bounded. For d_{+} a general l.s.c. hemimetric, one can reason as in Villani (2003) in Theorem 1.3, step 3 with \(d^{n}_{+}=d_{+}/(1+n^{-1}d_{+})\), so that \(0\le d_{+}^{n}\le d_{+}\) and \(d^{n}_{+}\uparrow d_{+}\) pointwise.
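The truncation used in Step 4 is easy to inspect numerically. The following sketch assumes the basic hemimetric d_{+}(x,y)=(x−y)_{+} on \(\mathbb{R}\); the function name `d_plus_n` and the grid are hypothetical. It illustrates that \(d_{+}^{n}\) is bounded by n, dominated by d_{+}, nondecreasing in n, and still satisfies the triangle inequality (because t↦t/(1+t/n) is increasing, concave and vanishes at 0, hence subadditive).

```python
def d_plus(x, y):
    """Basic hemimetric on the real line: positive part of x - y."""
    return max(x - y, 0.0)


def d_plus_n(x, y, n):
    """Bounded approximation d_plus / (1 + d_plus / n) of the hemimetric."""
    d = d_plus(x, y)
    return d / (1.0 + d / n)


# Illustrative grid (an assumption of this sketch).
GRID = [-3.0, -1.0, 0.0, 0.5, 2.0, 10.0]
```

For instance, `d_plus_n(10.0, 0.0, 1)` stays below 1, while `d_plus_n(10.0, 0.0, n)` approaches `d_plus(10.0, 0.0)` as `n` grows.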
Remark 7
The dual formulation of Theorem 1 gives another proof of the second part of Proposition 4, since the set of increasing bounded Lipschitz functions generates the stochastic order (see the argument in Example 8).
4.5 Examples of minimal risk excess measures
The following propositions give explicit representations of the minimal risk excess measure for several hemimetrics. We first consider the discrete hemimetric \(d_{+}^{\le }\):
Proposition 5

Let \(E=\mathbb {R}^{d}\) be supplied with the (closed) componentwise order ≤. The discrete hemimetric \(d_{+}^{\le }\) of (12) generates, via Proposition 3, the compound risk excess measure$$ D_{+}^{c}(X, Y)=\mu(X\nleq Y). $$(39)This induces, as minimal extension by mass transportation on \(\mathcal {M}^{1}(\mathbb {R}^{d})\), the stochastic ordering one-sided risk excess measure of (10):$$ D_{+}^{inf}(Q,P)=D_{+}^{st}(Q,P). $$(40)

A dual representation of (40) is given by$$ D_{+}^{inf}(Q,P)=\sup_{f\uparrow, 0\le f\le 1}\left(\int f d(Q-P)\right)_{+}. $$(41)
Proof

Since ≤ is a closed order, \(C:=\{(x,y)\in E\times E, x\nleq y\}\) is an open set and \(d_{+}^{\le }(x,y)=1_{C}(x,y)\) is a {0,1}-valued l.s.c. function. By Kellerer (1984) and Rüschendorf (1986) in Lemma 1 (see also Villani (2003) in Theorem 1.27),$$ D_{+}^{inf}(Q,P)=\sup\left\{Q(A)-P\left(A^{C}\right), A\subset E, A \text{ closed}\right\}, $$where A^{ C }:={y∈E,∃x∈A,(x,y)∉C}={y∈E,∃x∈A,x≤y}=A^{ ↑ }. Since A⊂A^{ ↑ },$$\begin{array}{@{}rcl@{}} D_{+}^{inf}(Q,P)&=&\sup\left\{Q(A)-P\left(A^{\uparrow}\right), A\subset E, A \text{ closed}\right\}\\ &=&\sup\left\{(Q(A)-P(A))_{+}, A\in \mathcal{I}(E), A \text{ closed}\right\}=D_{+}^{st}(Q,P).\vspace*{2pt} \end{array} $$

By Kantorovich–Rubinstein Theorem 1,$$\begin{array}{@{}rcl@{}} D_{+}^{inf}(Q,P)&=&\sup_{f\in Lip^{1}(\mathbb{R}^{d},d_{+})}\left(\int f d(Q-P)\right)_{+}\\ &=&\sup_{f\uparrow, 0\le f\le 1}\left(\int f d(Q-P)\right)_{+}. \end{array} $$(42)
Note that one can restrict to the set of increasing functions such that 0≤f≤1 by shifting the function by a constant.
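For d=1 and finitely supported laws, both sides of (40) can be computed by hand. The sketch below is an illustration under assumptions of our own: closed increasing sets of \(\mathbb{R}\) are taken to be the half-lines [t,∞), laws are dictionaries with exact `Fraction` masses, and the coupling side is brute-forced over permutations, which is exhaustive only for two uniform laws on the same number of atoms (the extreme couplings are then permutations, by the Birkhoff–von Neumann theorem). The function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def uniform_law(points):
    """Uniform law on a list of points, as {atom: mass} with exact masses."""
    law = {}
    for p in points:
        law[p] = law.get(p, Fraction(0)) + Fraction(1, len(points))
    return law


def d_st(Q, P):
    """sup_t (Q[t, inf) - P[t, inf))_+ ; thresholds at atoms are sufficient."""
    best = Fraction(0)
    for t in sorted(set(Q) | set(P)):
        q_tail = sum((m for x, m in Q.items() if x >= t), Fraction(0))
        p_tail = sum((m for y, m in P.items() if y >= t), Fraction(0))
        best = max(best, q_tail - p_tail)
    return best


def min_violation_probability(xs, ys):
    """min over permutation couplings of the probability that X > Y."""
    fewest = min(sum(1 for x, y in zip(xs, perm) if x > y)
                 for perm in permutations(ys))
    return Fraction(fewest, len(xs))
```

With `xs = [1, 2, 3]` and `ys = [0, 2, 5]`, both `d_st(uniform_law(xs), uniform_law(ys))` and `min_violation_probability(xs, ys)` equal 1/3: the atom 3 can be matched with 5, one of the atoms 1, 2 with 2, and the remaining atom necessarily exceeds its partner 0.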
on \(\mathfrak {X}\). The corresponding minimal risk excess is given in the following result:
Proposition 6

The minimal extension of (43) to a risk excess measure on \(\mathcal {M}^{1}(\mathbb {R})\) by mass transportation is given by$$\begin{array}{@{}rcl@{}} D_{+}^{inf}(Q,P)&=&\inf_{X\sim Q, Y\sim P}E(X-Y)_{+}\\ &=&\sup_{f\in Lip^{1}, f\uparrow} \left(\int fd(Q-P)\right)_{+}=D_{+}^{Lip^{1,\uparrow}}(Q,P), \end{array} $$
where Lip^{1,↑} is the class of increasing, 1-Lipschitz functions (w.r.t. |·|).
The ordering induced by \(D_{+}^{inf}\) on \(\mathcal {M}^{1}(\mathbb {R})\) is the stochastic order ≼_{ st }.

One has the following explicit representation:$$ D_{+}^{inf}(Q,P)=E\left(F^{-1}(U)-G^{-1}(U)\right)_{+}, $$(44)
where F,G are the distribution functions of Q,P, and U∼U_{[0,1]} is uniformly distributed on [0,1].
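For two empirical laws with the same number of equally weighted atoms, the quantile representation (44) reduces to pairing order statistics. The sketch below is a minimal illustration under that assumption; the function names are hypothetical, and the brute-force minimum over permutations is included only as a sanity check for very small samples.

```python
from itertools import permutations


def excess_comonotone(xs, ys):
    """E(F^{-1}(U) - G^{-1}(U))_+ for two equal-size, equally weighted samples."""
    n = len(xs)
    return sum(max(x - y, 0) for x, y in zip(sorted(xs), sorted(ys))) / n


def excess_bruteforce_min(xs, ys):
    """min over permutation couplings of the mean of (x - y)_+ (small n only)."""
    n = len(xs)
    return min(sum(max(x - y, 0) for x, y in zip(xs, perm))
               for perm in permutations(ys)) / n
```

For example, with `xs = [4, 1, 7, 3]` and `ys = [2, 6, 0, 5]`, the sorted pairs are (1,0), (3,2), (4,5), (7,6), so the average positive excess is 3/4, and no other pairing of the atoms does better.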
Proof

With the assumption on \(\mathfrak {X}\), Kantorovich–Rubinstein Theorem 1 specializes to$$\begin{array}{@{}rcl@{}} D_{+}^{inf}(Q,P) &=&\sup_{f\in Lip^{1}\left(\mathbb{R}, d_{+}^{b}\right)} \left(\int fd(Q-P)\right)_{+}. \end{array} $$(45)
Note that \(f\in Lip^{1}\left (\mathbb {R},d_{+}^{b}\right)\) is equivalent to f(y)−f(x)≤(y−x)_{+}, i.e., f increasing and 1-Lipschitz w.r.t. the absolute value |·|.
The fact that the order induced by \(D_{+}^{inf}\) on \(\mathcal {M}^{1}(\mathbb {R})\) is the stochastic order ≼_{ st } follows from Proposition 4. Alternatively, a direct proof is as follows: let n≥1 be a positive integer, X∼Q, Y∼P. By Markov’s inequality,$$P(X-Y\ge n^{-1})\le P\left((X-Y)_{+}\ge n^{-1}\right)\le nE[(X-Y)_{+}].$$Taking the infimum over X∼Q,Y∼P yields that \(D_{+}^{inf}(Q,P)=0\) implies that X−Y<n^{−1} with probability one. Letting n→∞ yields X≤Y a.s. Hence,$$D_{+}^{inf}(Q,P)=0 \text{ iff there exists }X\sim Q, Y\sim P \text{ s.t. } X\le Y \text{ a.s.}$$and the latter is equivalent to Q≼_{ st }P, by Strassen’s theorem.
f(x)=x_{+} is convex, hence f(x−y) is submodular (or quasi-antitone in the terminology of Cambanis et al. (1976), or supernegative or 2-negative in the terminology of Tchen (1980)). This implies (44) by results of Cambanis et al. (1976) in Theorem 2, or Tchen (1980) in Corollary 2.3 (see also Rüschendorf (2013)).
Remark 8
where V∼U_{(0,1)} is independent of (X,Y), see Rüschendorf (2009). The expected shortfalls are then defined as ES_{ α }(X):=E[X | U_{1}≥α], respectively as ES_{ α }(Y):=E[Y | U_{2}≥α].
where U_{1}, U_{2} are as in (46). We obtain the following result:
Proposition 7

The minimal extension of (47) to a risk excess measure on \(\mathcal {M}^{1}(\mathbb {R})\) by mass transportation has the representation$$\begin{array}{@{}rcl@{}} D_{+}^{\alpha,inf}(Q,P)&:=&\inf_{X\sim Q, Y\sim P} {ED}_{+}^{\alpha,c}(X,Y)\\ &=&E\left[\left(F^{-1}(U)-G^{-1}(U)\right)_{+}1_{U\ge \alpha}\right], \end{array} $$(48)
where U∼U_{[0,1]} is uniformly distributed on [0,1].

The ordering ≼_{ α } induced by \(D_{+}^{\alpha,inf}\) is given by$$Q\preceq_{\alpha} P \Leftrightarrow F^{-1}(u)\le G^{-1}(u) \quad \forall u\ge \alpha,$$which corresponds to the classical stochastic order restricted to the upper tail.
Proof

Denote by F_{ α } the law of \(X_{\alpha }:=X1_{U_{1}\ge \alpha }=X1_{F(X,V)\ge \alpha }\) and by G_{ α } the law of \(Y_{\alpha }:=Y1_{U_{2}\ge \alpha }=Y1_{G(Y,V)\ge \alpha }\). Then,$$ D_{+}^{\alpha,inf}(Q,P)=\inf_{X_{\alpha}\sim F_{\alpha}, Y_{\alpha}\sim G_{\alpha}} E(X_{\alpha}-Y_{\alpha})_{+}. $$Since \(X_{\alpha }=F^{-1}(U_{1})1_{U_{1}\ge \alpha }\) with U_{1}∼U_{[0,1]}, F_{ α } is the image of the Lebesgue measure on [0,1] induced by the transformation u↦F^{−1}(u)1_{u≥α}. Similarly, G_{ α } is the image of the Lebesgue measure on [0,1] induced by the transformation u↦G^{−1}(u)1_{u≥α}. Therefore, for U∼U_{(0,1)}, the comonotone pair of random variables \(\tilde X_{\alpha }=F^{-1}(U)1_{U\ge \alpha }\) and \(\tilde Y_{\alpha }=G^{-1}(U)1_{U\ge \alpha }\) is admissible for (F_{ α },G_{ α }). By submodularity, as in Proposition 6,$$ E(X_{\alpha}-Y_{\alpha})_{+}\ge E\left[\left(F^{-1}(U)-G^{-1}(U)\right)_{+}1_{U\ge \alpha}\right], $$which implies the result.

Follows from (48).
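The tail representation (48) can be evaluated in the same way as (44) for two equally weighted samples of the same size, by weighting each pair of order statistics with the length of its quantile cell inside [α,1]. The sketch below is illustrative only; the name `tail_excess` and the sample values are assumptions of this example.

```python
def tail_excess(xs, ys, alpha):
    """E[(F^{-1}(U) - G^{-1}(U))_+ 1{U >= alpha}] for equal-size samples.

    The i-th order statistics are paired on the quantile cell ((i-1)/n, i/n],
    and only the part of that cell lying in [alpha, 1] contributes.
    """
    n = len(xs)
    total = 0.0
    for i, (x, y) in enumerate(zip(sorted(xs), sorted(ys)), start=1):
        weight = max(0.0, i / n - max(alpha, (i - 1) / n))
        total += max(x - y, 0) * weight
    return total
```

With `xs = [5, 1, 2]` and `ys = [0, 3, 6]`, the value at `alpha = 0` is positive (the smallest atom of Q exceeds the smallest atom of P), whereas the value at `alpha = 0.5` is zero: the two laws are ordered in the upper tail, in the sense of ≼_{ α }, but not in the full stochastic order.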
Remark 9
5 Weak risk excess measures
5.1 Motivation and definition
Obviously, \(D_{+}^{inf}(Q,P)\le D_{+}^{sup}(Q,P)\).
These considerations motivate the introduction of a weakened notion of risk excess measure, without axiom (A2) and with axiom (A4) restricted to a strict order ≺, i.e., a transitive and irreflexive relation. Therefore, we propose the following definitions:
Definition 8
(Weak risk excess measure) Let ≺ be a strict order on \(\mathcal {M}^{1}(E)\). A one-sided weak risk excess measure \(D_{+}^{w}\) on \(\left (\mathcal {M}^{1}(E),\prec \right)\) is a mapping \(D_{+}^{w}:\mathcal {M}^{1}(E)\times \mathcal {M}^{1}(E)\to \overline {\mathbb {R}}\) which satisfies axioms (A1), (A3), and (A4).
Definition 9
(Maximal extension) Let \(D^{c}_{+}\) be a compound excess risk measure. The maximal extension \(D^{sup}_{+}\) on \(\mathcal {M}^{1}(E)\) of \(D_{+}^{c}\) by mass transportation is given by (49).
Remark 10

The concept of one-sided weak risk excess measure is an asymmetric analog of the concept of moment function in the theory of probability metrics, see Rachev (1991) in Chap. 3.3, or Rachev et al. (2013) in Chapters 3.4 and 8.2. In addition, the adjunction of axiom (A4) makes it compatible with a notion of order. Obviously, a one-sided risk excess measure for a preorder ≼ is a one-sided weak risk excess measure for the strict order ≺ defined by$$P\prec Q \Leftrightarrow P\preceq Q \quad \text{and } P\neq Q.$$

The relation between the minimal \(D_{+}^{inf}\) and maximal \(D_{+}^{sup}\) extensions obtained from a compound risk excess measure \(D_{+}^{c}\) is given in the following improved triangle inequality:$$ D_{+}^{sup}(Q,R)\le D_{+}^{inf}(Q,P)+D_{+}^{sup}(P,R), $$where P,Q,R are three probability measures on E, see Rachev et al. (2013) in Theorem 3.4.1.
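The improved triangle inequality can be illustrated on small empirical laws. The sketch below assumes \(E=\mathbb{R}\), d_{+}(x,y)=(x−y)_{+}, and three uniform laws on the same number of atoms; under this assumption the extreme couplings are permutations (Birkhoff–von Neumann), so an exhaustive search over permutations returns the exact minimal and maximal extensions. The function name `extreme_extensions` and the data are hypothetical.

```python
from itertools import permutations


def d_plus(x, y):
    """Basic hemimetric on the real line: positive part of x - y."""
    return max(x - y, 0)


def extreme_extensions(xs, ys):
    """(D_inf, D_sup) of E d_plus(X, Y) over couplings of two uniform n-point laws."""
    n = len(xs)
    costs = [sum(d_plus(x, y) for x, y in zip(xs, perm))
             for perm in permutations(ys)]
    return min(costs) / n, max(costs) / n


# Illustrative samples (an assumption of this sketch).
Q_ATOMS, P_ATOMS, R_ATOMS = [1, 2, 3], [0, 2, 5], [1, 1, 4]
```

Here `extreme_extensions(Q_ATOMS, R_ATOMS)[1]` equals 1, while the right-hand side is 1/3 + 5/3 = 2, in line with the inequality above.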
where supp(·) denotes the support of a distribution. The analog of Proposition 4 for the maximal extension, which shows that \(D_{+}^{sup}\) is indeed a one-sided weak risk excess measure, is given in the following proposition:
Proposition 8
\(D_{+}^{sup}\) obtained in (49) from a compound excess risk measure \(D^{c}_{+}(X,Y)=E\,d_{+}(X,Y)\) of the form (31) is a one-sided weak risk excess measure on \((\mathcal {M}^{1}(E),\prec _{sup})\).
Proof
(A1) and (A3) are trivially satisfied. For (A4), if \(D_{+}^{sup}(Q,P)=0\), then for all X∼Q,Y∼P, \(E\,d_{+}(X,Y)=0\). Markov’s inequality entails that for all ε>0, d_{+}(X,Y)≤ε a.s. Hence, d_{+}(X,Y)=0 a.s., i.e., X≤Y a.s. for all X∼Q,Y∼P. This can only hold if the support of Q is completely to the left of the support of P. The converse direction is trivial: if Q≺_{ sup }P, then for all couplings X∼Q, Y∼P, X≤Y a.s., and thus \(\sup_{X\sim Q,Y\sim P}E\,d_{+}(X,Y)=0\). □
5.2 Dual representation of maximal one-sided weak risk excess measure
A dual representation of the maximal one-sided weak risk excess measure \(D_{+}^{sup}\) associated with the compound risk excess measure \(D^{c}_{+}(X,Y)=E\,d_{+}(X,Y)\) of the form in (31) is given in the following theorem:
Theorem 2

if d_{+} is upper or lower semicontinuous, then duality holds:$$D_{+}^{sup}(Q,P)=\inf_{\Psi_{d_{+}}}\left\{ \int f dQ+\int gdP \right\},$$where$$\begin{array}{@{}rcl@{}} \Psi_{d_{+}}:=&\{&(f,g)\in Lip^{1}(d_{+})\times Lip^{1}(d_{-}), f(x)\ge 0, g(y)\ge 0,\\&& f(x)+g(y)\ge d_{+}(x,y), (x,y)\in E^{2} \}. \end{array} $$

if d_{+} is upper semicontinuous, then the supremum is attained for some probability measure.
Proof

Since a lower or upper semicontinuous function is a supremum or infimum of continuous functions, d_{+} is a Baire function. Hence, the duality Theorem 2.3.8 (a) in Rachev and Rüschendorf (1998) applies, since d_{+}≥0 is obviously majorized from below (i.e., belongs to \(\mathcal P_{m}(S)\) in the notation of Theorem 2.3.8 in Rachev and Rüschendorf (1998)). Therefore, Theorem 2.3.8 (a) entails$$ \sup\left\{\int d_{+}(x,y)\mu(dx,dy)\right\}=\inf\left\{\int fdQ+\int gdP \right\}, $$(51)where the infimum on the right side is taken in$$\Psi_{1}:=\{f\in L_{1}(Q), g\in L_{1}(P), d_{+}(x,y)\le f(x)+g(y), (x,y)\in E^{2}\}.$$Let γ_{1},γ_{2} be two real-valued constants s.t. γ_{1}+γ_{2}=0 and set for (f,g)∈Ψ_{1}, \((\tilde f:=f-\gamma _{1},\tilde g:=g-\gamma _{2})\). Then, \((\tilde f, \tilde g)\in \Psi _{1}\) and \(J(f,g)=\int fdQ+\int gdP\) remains invariant when one replaces (f,g) by \((\tilde f, \tilde g)\), i.e., \(J(f,g)=J(\tilde f,\tilde g)\). Therefore, if f takes some negative values, then, setting \(\gamma _{1}=\inf _{x} f(x)\) entails \(\tilde f\ge 0\) and the infimum in (51) can be restricted to$$ \Psi_{2}:=\{f\in L_{1}(Q), g\in L_{1}(P), f(x)\ge 0, d_{+}(x,y)\le f(x)+g(y), (x,y)\in E^{2}\}. $$By symmetry, the infimum in (51) can further be restricted to$$ \Psi_{3}:=\{f\in L_{1}(Q), g\in L_{1}(P), f(x)\ge 0, g(y)\ge 0, d_{+}(x,y)\le f(x)+g(y), (x,y)\in E^{2}\}. $$Assume d_{+} is upper bounded. For (f,g)∈Ψ_{3}, set \(f_{*}(y):=\sup _{x}(d_{+}(x,y)-f(x))\) and \(f_{**}(x):=\sup _{y}(d_{+}(x,y)-f_{*}(y))\). Then, (f_{∗∗},f_{∗})∈Ψ_{1}, g≥f_{∗}, f≥f_{∗∗}. Hence, J(f,g)≥J(f_{∗∗},f_{∗}). Moreover, by the triangle inequality,$$\begin{array}{@{}rcl@{}} d_{+}(x,y)-f_{*}(y)&\le& d_{+}(x,x')+d_{+}(x',y)-f_{*}(y) \end{array} $$and taking the supremum in y yields$$\begin{array}{@{}rcl@{}} f_{**}(x)-f_{**}(x')&\le& d_{+}(x,x'). \end{array} $$
Hence, f_{∗∗}∈Lip^{1}(d_{+}), whereas a similar calculation shows that f_{∗}∈Lip^{1}(d_{−}). Therefore, the infimum in (51) can further be restricted to \(\Psi _{d_{+}}\), as claimed.
The general case, for d_{+} unbounded, proceeds by approximation, as in Theorem 1.

Follows from Theorem 2.3.10 in Rachev and Rüschendorf (1998).
5.3 Examples of maximal extensions
Proposition 9

Let \(D_{+}^{\le,sup}\) be the one-sided weak risk excess measure on \((\mathcal {M}^{1}(\mathbb {R}),\prec _{sup})\) obtained by maximal extension of the discrete compound risk measure \(D_{+}^{c}\) in (39). \(D_{+}^{\le,sup}\) has the representation:$$ D_{+}^{\le,sup}(Q,P)=1-\sup_{x\in\mathbb{R}^{d}}(F(x)-G(x)), $$(52)
where F,G are the c.d.f.s of Q,P, respectively.

The restriction of \(D_{+}^{\le,\sup }\) on E, obtained by setting \(d^<_{+}(x,y):= D_{+}^{\le,\sup }(\delta _{x},\delta _{y})\), defines a weak one-sided hemimetric compatible with the strict order <, i.e.,$$ d^<_{+}(x,y)=1_{x\ge y}, $$with \(d^<_{+}\) satisfying axioms (A1), (A3), and (A4) for the strict order < associated with ≤.
Proof

Note that by Strassen’s theorem (see, e.g., Rachev and Rüschendorf (1998) in Theorems 3.5.1 and 3.5.5 or Rüschendorf (1991) in Theorems 4 and 5),$$\begin{array}{@{}rcl@{}} D_{+}^{\le,sup}(Q,P)&=&\sup_{X\sim Q, Y\sim P}\mu(X\nleq Y)=1-\inf_{X\sim Q, Y\sim P} \mu(X\le Y)\\ &=&1-\sup(Q(B_{1})+P(B_{2})-1), \end{array} $$where the supremum is over all pairs of subsets B_{1},B_{2}⊂E s.t. B_{1}×B_{2}⊂B:={(x,y);x≤y}. But for B_{1}×B_{2}⊂B, it follows that \(B_{1}^{\downarrow }\times B_{2}^{\uparrow }\subset B\), where \(B_{1}^{\downarrow }=\{x\in \mathbb {R}^{d}:\exists \bar {x}\in B_{1} \text { s.t. } x\le \bar {x}\}\) and \(B_{2}^{\uparrow }=\{y\in \mathbb {R}^{d}:\exists \bar {y}\in B_{2} \text { s.t. } y\ge \bar {y}\}\) are the decreasing, resp. increasing, completions of B_{1},B_{2}. Then, it is easy to see that one can enlarge \(B_{1}^{\downarrow }, B_{2}^{\uparrow }\) to intervals of the form (−∞,x], [x,∞). As a result, the maximal extension is given by$$\begin{array}{@{}rcl@{}} D_{+}^{\le,sup}(Q,P)&=&2-\sup_{x\in\mathbb{R}^{d}}\{F(x)+\overline G(x)\}\\ &=&1-\sup_{x\in\mathbb{R}^{d}}\{F(x)-G(x)\}, \end{array} $$
where \(\overline {G}(x)=P([x,\infty))\).

Formula (52) yields$$D_{+}^{\le,sup}(\delta_{x},\delta_{y})=1-\sup_{z\in\mathbb{R}^{d}}\{1_{z\ge x}-1_{z\ge y}\}=1_{x\ge y}. $$
Remark 11
Next, we investigate the maximal one-sided weak risk excess extension for the basic hemimetric (13): on \(E=\mathbb {R}\), for X∼F,Y∼G, let \(D_{+}^{c}(X,Y)=E(X-Y)_{+}\) be the average risk excess as in (43). The maximal risk excess extension by mass transportation is given by the following proposition.
Proposition 10
where F,G are the c.d.f.s of Q,P, respectively.
Proof
The argument for the maximal risk excess extension is similar to that of the minimal risk excess extension. □
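As a hedged numerical companion to this argument, the sketch below assumes, as suggested by the submodularity of (x−y)_{+} and by the results of Cambanis et al. (1976) and Tchen (1980) cited above, that the maximal extension is attained at the countermonotone coupling, i.e., by pairing F^{−1}(U) with G^{−1}(1−U). This is an assumption of the sketch rather than a restatement of Proposition 10. The function names are hypothetical, samples are equally weighted and of equal size, and the brute-force maximum is only meant for very small samples.

```python
from itertools import permutations


def excess_countermonotone(xs, ys):
    """Mean of (x_(i) - y_(n+1-i))_+ : smallest x paired with largest y."""
    n = len(xs)
    pairs = zip(sorted(xs), sorted(ys, reverse=True))
    return sum(max(x - y, 0) for x, y in pairs) / n


def excess_bruteforce_max(xs, ys):
    """max over permutation couplings of the mean of (x - y)_+ (small n only)."""
    n = len(xs)
    return max(sum(max(x - y, 0) for x, y in zip(xs, perm))
               for perm in permutations(ys)) / n
```

With `xs = [4, 1, 7, 3]` and `ys = [2, 6, 0, 5]`, the countermonotone pairs are (1,6), (3,5), (4,2), (7,0), giving an average positive excess of 9/4, which coincides with the exhaustive maximum.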
In the previous propositions, the order induced by the maximal extension is very strong. For insurance applications, in particular for comparing tail risk, it is of interest to restrict the comparisons to the upper tails of the distributions, see Proposition 7 in Section 4. Finally, we give the result for the tail excess compound risk measure \(D^{c,\alpha }_{+}(X,Y)\) in (47), which induces a more interesting order:
Proposition 11

Let 0<α<1, then the maximal extension \(D_{+}^{\alpha,sup}\) is given by$$ D_{+}^{\alpha,sup}(Q,P)=(1-\alpha)D^{sup}_{+}(Q^{\alpha},P^{\alpha}), $$(54)
where Q^{ α },P^{ α } are the conditional distributions of Q,P on their upper α-quantile intervals [q_{ α }(Q),∞),[q_{ α }(P),∞).

Correspondingly, a suitable consistent ordering ≺_{ α } on \(\mathcal {M}^{1}(\mathbb {R})\) is given by$$ Q\prec_{\alpha} P \Leftrightarrow G^{-1}(u)\le F^{-1}(1-u+\alpha), \text{ for all } \alpha\le u\le 1, $$where F,G are the c.d.f.s of Q,P. For the maximal extension, the random variables are chosen countermonotonic in the upper part of the distribution.
Proof
Similar to the proof of Proposition 10. □
6 Extensions with dependence constraints
6.1 Setup
In Sections 4 and 5, we considered risk excess measures D_{+}(Q,P) obtained as minimal and maximal extensions by mass transportation of a compound risk excess measure, i.e., over the class of all dependence structures of (Q,P). In this section, we consider a relevant modification of this method by restricting the class of possible dependence structures. This setup allows one to take into consideration some known side information on the dependence structure of (Q,P), like various bounds on positive or negative dependence, see e.g., Rüschendorf (2013) in Chapter 5.
We consider the setup \(E=\mathbb {R}\) with hemimetric d_{+} and the compound excess risk measure \(D_{+}^{c}(X,Y)=E\,d_{+}(X,Y)\) of the kind (6), where \(X,Y\in \mathfrak {X}\) have marginals Q,P. If C=C_{X,Y} is a copula of (X,Y), we also write E_{ C }d_{+}(X,Y) to stress the dependence on C, and we denote by \(\mathcal {C}\) the set of all bivariate copula functions. Let \(\mathcal {D}\subset \mathcal {C}\) denote a subclass of copulas which describe the information on the dependence structure. Then, it is natural to consider the worst- and best-case extension of \(D_{+}^{c}\) over \(\mathcal {D}\).
Definition 10

The minimal extension with dependence restriction \(\mathcal {D}\) of \(D_{+}^{c}\) is defined as$$ D_{+}^{\mathcal{D},inf}(Q,P):=\inf \{E_{C} d_{+}(X,Y), X\sim Q, Y\sim P, C\in\mathcal{D} \}. $$(55)

Similarly, the maximal extension with dependence restriction \(\mathcal {D}\) is defined as$$ D_{+}^{\mathcal{D},sup}(Q,P):=\sup \{E_{C} d_{+}(X,Y),X\sim Q, Y\sim P, C\in\mathcal{D} \}. $$(56)
In the case without dependence restriction, i.e., when \(\mathcal {D}=\mathcal {C}\), we get the minimal and maximal extensions \(D_{+}^{inf}\), \(D_{+}^{sup}\) of (32) and (49) considered in Sections 4 and 5.
Remark 12
By the previous discussion of Section 4 (see Proposition 4), it is clear that \(D_{+}^{\mathcal {D},inf}\) is a risk excess measure on \(\left (\mathcal {M}^{1}(E),\preceq _{st}\right)\) only in case that \(\mathcal {D}\) contains the upper Fréchet bound M, defined by M(u,v)= min(u,v),0≤u,v≤1. So typically the restricted extensions will not satisfy the properties (A2) and (A4) of a one-sided risk excess measure on \(\left (\mathcal {M}^{1}(E),\preceq _{st}\right)\).
Despite that, the extensions (55) and (56) have a natural motivation as best-, resp., worst-case excess risk taking into account the dependence restrictions. On the level of random variables, the class of pairs (X,Y) with \(C_{XY}\in \mathcal {D}\) and X≤Y may be empty even if Q≼_{ st }P. Therefore, the unrestricted extensions \(D_{+}^{inf}\), resp., \(D_{+}^{sup}\), would underestimate, resp., overestimate the real risk excess. This is a strong indication for the relevance of the notion of minimal, resp., maximal risk excess with dependence restriction \(\mathcal {D}\).
6.2 Explicit results for extensions with positive and negative dependence restriction
the class of all copulas which are smaller than C_{0}, resp., bigger than C_{1}, in the lower orthant ordering ≼_{ lo } (equivalently in the upper orthant ordering ≼_{ uo }). (57) describes a negative dependence restriction, (58) a positive dependence restriction: for the case C_{0}=C_{1}=Π, the independence copula Π(u,v)=uv, 0≤u,v≤1, these restrictions correspond to negatively quadrant dependent (NQD), resp., positively quadrant dependent (PQD), random variables, as defined by Lehmann (1966), see Nelsen (2006) in p. 186.
Then, for d_{+}(x,y)=(x−y)_{+}, we obtain the following explicit result.
Proposition 12

For \(\mathcal {D}=\mathcal {D}_{\le }(C_{0})\), we obtain the explicit formula for the minimal risk excess extension$$ D_{+}^{\mathcal{D},inf}(Q,P)=E_{C_{0}}\left(X^{0}-Y^{0}\right)_{+}, $$(59)
where X^{0}∼Q,Y^{0}∼P and \(C_{X^{0},Y^{0}}=C_{0}\).

For \(\mathcal {D}=\mathcal {D}_{\ge }(C_{1})\), we obtain the explicit formula for the maximal risk excess extension$$ D_{+}^{\mathcal{D},sup}(Q,P)=E_{C_{1}}\left(X^{1}-Y^{1}\right)_{+}, $$(60)
where X^{1}∼Q,Y^{1}∼P and \(C_{X^{1},Y^{1}}=C_{1}\).
Proof

For (X,Y) with X∼Q,Y∼P and C_{X,Y}=C≤C_{0}, it follows from the submodularity argument, as in the proof of Proposition 6, that$$E(X-Y)_{+}\ge E(X^{0}-Y^{0})_{+},$$since f(x−y)=(x−y)_{+} is submodular and (X,Y)≤_{ sm }(X^{0},Y^{0}), with ≤_{ sm } the supermodular ordering. Taking the infimum yields the result.

The argument is similar.
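To make (59) and (60) concrete, consider the case C_{0}=C_{1}=Π mentioned above. The sketch below is illustrative only: it works with equally weighted samples of equal size, uses hypothetical function names, and assumes that the remaining endpoints of the two ranges are the unrestricted extremes (countermonotone coupling for the NQD class, comonotone coupling for the PQD class), since these Fréchet bounds belong to the respective classes.

```python
def excess_comonotone(xs, ys):
    """E(X - Y)_+ under the upper Frechet bound M (sorted with sorted)."""
    n = len(xs)
    return sum(max(x - y, 0) for x, y in zip(sorted(xs), sorted(ys))) / n


def excess_countermonotone(xs, ys):
    """E(X - Y)_+ under the lower Frechet bound W (sorted with reverse-sorted)."""
    n = len(xs)
    pairs = zip(sorted(xs), sorted(ys, reverse=True))
    return sum(max(x - y, 0) for x, y in pairs) / n


def excess_independent(xs, ys):
    """E(X - Y)_+ under the independence copula: average over all pairs."""
    return sum(max(x - y, 0) for x in xs for y in ys) / (len(xs) * len(ys))


def nqd_range(xs, ys):
    """(best case, worst case) of the risk excess over copulas C <= independence."""
    return excess_independent(xs, ys), excess_countermonotone(xs, ys)


def pqd_range(xs, ys):
    """(best case, worst case) of the risk excess over copulas C >= independence."""
    return excess_comonotone(xs, ys), excess_independent(xs, ys)
```

For `xs = [1, 2, 3]` and `ys = [0, 2, 5]`, the unrestricted range is [1/3, 1]; negative quadrant dependence narrows it to [7/9, 1] and positive quadrant dependence to [1/3, 7/9].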
Remark 13

Taking for \(\mathcal {D}\) the two-sided dependence information$$ \mathcal{D}=\mathcal{D}(C_{0},C_{1})=\{C\in \mathcal{C}; C_{1}\le C\le C_{0}\}, $$we obtain for \(D_{+}^{\mathcal {D},inf}\) the same formula as in (59) and for \(D_{+}^{\mathcal {D},sup}\) the same formula as in (60). Thus, this information simultaneously shrinks the upper and the lower bound for the risk excess.

The concept of minimal, resp., maximal risk excess can also be introduced for the general case (E,≤) and general compound risk excess measures \(D_{+}^{c}\). In this case, \(\mathcal {D}\) denotes a class of dependence structures of random elements X,Y∈E. Even if \(D_{+}^{inf}\) and \(D_{+}^{sup}\) do not satisfy on the level of distributions the risk excess measure axioms (A2) and (A4), they describe the relevant bounds for the risk excess with dependence information \(\mathcal {D}\).
7 Conclusion
We proposed a quantitative one-sided comparison of probabilistic risks via the concept of risk excess measures, obtained as order extensions of hemimetrics on the underlying space E. As in the case of risk measures, the choice of a suitable hemimetric and corresponding excess risk measure for a particular application will depend on the problem considered and the notion of order one wants to quantify. For reliability, insurance mathematics, finance, epidemiology, etc., different notions of orders and distances are related to the problem at hand. In this regard, the examples proposed, together with their explicit formulas, are helpful. Together with the extension/restriction properties of Section 3, and the dual representations of Sections 4 and 5, they can serve as a guide for the interpretation of the excess risk measure and its coherence w.r.t. order and distance on the ambient space E.
Declarations
Competing interests
The authors declare that they have no competing interests.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Authors’ Affiliations
References
 Artzner, P, Delbaen, F, Eber, JM, Heath, D: Coherent measures of risk. Math. Finance. 9(3), 203–228 (1999). https://doi.org/10.1111/14679965.00068.
 Berkes, I, Philipp, W: Approximation theorems for independent and weakly dependent random vectors. Ann. Probab. 7(1), 29–54 (1979).
 Burgert, C, Rüschendorf, L: Consistent risk measures for portfolio vectors. Insurance Math. Econom. 38(2), 289–297 (2006). https://doi.org/10.1016/j.insmatheco.2005.08.008.
 Cambanis, S, Simons, G, Stout, W: Inequalities for E k(X,Y) when the marginals are fixed. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete. 36(4), 285–294 (1976). https://doi.org/10.1007/BF00532695.
 Capéraà, P, Van Cutsem, B: Méthodes et Modèles en Statistique Non Paramétrique. Les Presses de l’Université Laval, Sainte-Foy, QC; Dunod, Paris (1988). Exposé fondamental. [Basic exposition], With a foreword by Capéraà, Van Cutsem and Alain Baille.
 Delbaen, F: Coherent risk measures on general probability spaces. In: Advances in Finance and Stochastics, pp. 1–37. Springer, Berlin (2002).
 Dudley, RM: Distances of probability measures and random variables. Ann. Math. Statist. 39, 1563–1572 (1968). https://doi.org/10.1007/9781441958211_4.
 Dudley, RM: Probabilities and Metrics. Matematisk Institut, Aarhus Universitet, Aarhus (1976). Convergence of laws on metric spaces, with a view to statistical testing, Lecture Notes Series, No. 45.
 Dudley, RM: Real Analysis and Probability. Cambridge Studies in Advanced Mathematics, Vol. 74. Cambridge University Press, Cambridge (2002). https://doi.org/10.1017/CBO9780511755347. Revised reprint of the 1989 original.
 Faugeras, OP, Rüschendorf, L: Markov morphisms: a combined copula and mass transportation approach to multivariate quantiles. Math. Applicanda. 45, 3–45 (2017).
 Föllmer, H, Schied, A: Stochastic Finance. De Gruyter Studies in Mathematics, Vol. 27. Walter de Gruyter & Co., Berlin (2002). https://doi.org/10.1515/9783110198065. An introduction in discrete time.
 Goubault-Larrecq, J: Non-Hausdorff Topology and Domain Theory. New Mathematical Monographs, Vol. 22. Cambridge University Press, Cambridge (2013). https://doi.org/10.1017/CBO9781139524438. [On the cover: Selected topics in point-set topology].
 Jouini, E, Meddeb, M, Touzi, N: Vector-valued coherent risk measures. Finance Stoch. 8(4), 531–552 (2004). https://doi.org/10.1007/s0078000401276.
 Kellerer, HG: Duality theorems for marginal problems. Z. Wahrsch. Verw. Gebiete. 67(4), 399–432 (1984). https://doi.org/10.1007/BF00532047.
 Koenker, R: Quantile Regression. Econometric Society Monographs, Vol. 38. Cambridge University Press, Cambridge (2005). https://doi.org/10.1017/CBO9780511754098.
 Lehmann, EL: Some concepts of dependence. Ann. Math. Statist. 37, 1137–1153 (1966). https://doi.org/10.1214/aoms/1177699260.
 Marshall, AW, Olkin, I, Arnold, BC: Inequalities: Theory of Majorization and Its Applications. 2nd edn. Springer Series in Statistics. Springer (2011). https://doi.org/10.1007/9780387682761.
 Müller, A: Integral probability metrics and their generating classes of functions. Adv. in Appl. Probab. 29(2), 429–443 (1997).
 Nachbin, L: Topology and Order. Translated from the Portuguese by Lulu Bechtolsheim. Van Nostrand Mathematical Studies, No. 4. D. Van Nostrand Co., Inc., Princeton, N.J.-Toronto, Ont.-London (1965).
 Nelsen, RB: An Introduction to Copulas. 2nd edn. Springer Series in Statistics. Springer, New York (2006).
 Rachev, ST, Rüschendorf, L: Approximation of sums by compound Poisson distributions with respect to stop-loss distances. Adv. in Appl. Probab. 22(2), 350–374 (1990).
 Rachev, ST: Probability Metrics and the Stability of Stochastic Models. Wiley Series in Probability and Mathematical Statistics: Applied Probability and Statistics. John Wiley & Sons, Ltd., Chichester (1991).
 Rachev, ST, Rüschendorf, L: Mass Transportation Problems. Vol. I. Probability and its Applications (New York), Vol. 1. Springer-Verlag, New York (1998). Theory.
 Rachev, ST, Klebanov, LB, Stoyanov, SV, Fabozzi, FJ: The Methods of Distances in the Theory of Probability and Statistics. Springer (2013). https://doi.org/10.1007/9781461448693.
 Rosenberger, J, Gasko, M: Understanding robust and exploratory data analysis. Wiley Classics Library. Wiley-Interscience, New York (2000). Chap. Comparing Location Estimators: Trimmed Means, Medians, and Trimean. Revised and updated reprint of the 1983 original.
 Rüschendorf, L.: Monotonicity and unbiasedness of tests via a.s. constructions. Statistics. 17(2), 221–230 (1986). https://doi.org/10.1080/02331888608801931.
 Rüschendorf, L: Fréchet bounds and their applications. In: Dall’Aglio, G, Kotz, S, Salinetti, G (eds.)Advances in Probability Distributions with Given Marginals: Beyond the Copulas, pp. 151–187. Springer, Dordrecht (1991). https://doi.org/10.1007/9789401134668.
 Rüschendorf, L.: On the distributional transform, Sklar’s theorem, and the empirical copula process. J. Statist. Plann. Inference. 139(11), 3921–3927 (2009). https://doi.org/10.1016/j.jspi.2009.05.030.
 Rüschendorf, L.: Mathematical Risk Analysis. Springer Series in Operations Research and Financial Engineering. Springer (2013). https://doi.org/10.1007/9783642335907. Dependence, risk bounds, optimal allocations and portfolios.
 Sriperumbudur, BK, Fukumizu, K, Gretton, A, Schölkopf, B, Lanckriet, GRG: On the empirical estimation of integral probability metrics. Electron. J. Stat. 6, 1550–1599 (2012).
 Strassen, V: The existence of probability measures with given marginals. Ann. Math. Statist. 36, 423–439 (1965). https://doi.org/10.1214/aoms/1177700153.
 Tchen, AH: Inequalities for distributions with given marginals. Ann. Probab. 8(4), 814–827 (1980).
 Villani, C: Topics in Optimal Transportation. Graduate Studies in Mathematics, Vol. 58. American Mathematical Society (2003). https://doi.org/10.1007/b12016.
 Zolotarev, VM: Modern Theory of Summation of Random Variables. Modern Probability and Statistics. VSP, Utrecht (1997). https://doi.org/10.1515/9783110936537.