Risk excess measures induced by hemimetrics
Probability, Uncertainty and Quantitative Risk volume 3, Article number: 6 (2018)
Abstract
The main aim of this paper is to introduce the notion of risk excess measure, to analyze its properties, and to describe some basic construction methods. To compare the risk excess of one distribution Q w.r.t. a given risk distribution P, we apply the concept of hemimetrics on the space of probability measures. This view of risk comparison has a natural basis in the extension of orderings and hemimetrics on the underlying space to the level of probability measures. Basic examples of this kind of extension are induced by mass transportation and by orderings induced by function classes. Our view of measuring risk excess complements the usual method of comparing the risks of Q and P through the values ρ(Q), ρ(P) of a risk measure ρ. We argue that the difference ρ(Q)−ρ(P) neglects relevant aspects of the risk excess which are adequately described by the new notion of risk excess measure. We derive various concrete classes of risk excess measures and discuss corresponding ordering and measure extension properties.
Introduction
Motivation
The evaluation and comparison of risks are basic tasks of risk analysis. For the evaluation of risks, the notion of risk measures (in particular of coherent and convex risk measures) has been introduced in an axiomatic way for real risks in Artzner et al. (1999), Delbaen (2002), Föllmer and Schied (2002) and has been extended to vector risks in Jouini et al. (2004), Burgert and Rüschendorf (2006), and many others. This notion leads to the comparison of two risks X,Y (resp., distributions Q,P) by ρ(X)−ρ(Y) (resp., ρ(Q)−ρ(P)). If the main interest is to compare a risk X to a benchmark risk Y w.r.t. a common risk measure ρ, then the one-sided distance
$$ (\rho(X)-\rho(Y))_{+}, \tag{1} $$
respectively,
$$ (\rho(Q)-\rho(P))_{+}, \tag{2} $$
is the induced comparison of risks (where x_{+}= max(x,0) denotes the positive part of x).
We argue that the comparisons in (1), (2) neglect some relevant part of measuring the risk excess. This deficit can be seen in the analogous simple case where, for the basic space \(E=\mathbb {R}^{d}\), the risk of a vector \(\mathbf {x}=(x_{1},\ldots,x_{d})\in \mathbb {R}^{d}\) is measured by the Euclidean norm ρ(x)=‖x‖. In this case,
$$ (\|\mathbf{x}\|-\|\mathbf{y}\|)_{+} \tag{3} $$
gives a quantitative comparison of the new risk x w.r.t. a benchmark risk y, which is not informative enough. If ‖x‖=‖y‖, then the comparison in (3) does not take into account whether some or many components of x might be essentially larger than those of y. A better measure for the risk excess would be
$$ \|(\mathbf{x}-\mathbf{y})_{+}\|, $$
where the positive part of x−y is taken componentwise.
Another motivation comes from the fact that some concepts which have an impact on the notion of risk are better defined in a relative manner than in absolute terms: for example, the concept of “heavy tailedness” of a distribution (and the subsequent idea of “tail risk”) is easier to define by comparing the “size of the tail” or “speed of decrease of the density” of the distribution F to the corresponding quantity for a benchmark distribution G (say, the standard Gaussian one). These comparisons can be operationalized in a quantitative measure of tail risk, e.g., by computing the difference between the mass of the distribution F over an α-quantile and the corresponding mass of the benchmark distribution G over the same α-quantile, viz.,
or, for operationalizing the comparisons of “speed of decrease of the density” by something like,
see, e.g., Capéraà and Van Cutsem (1988, p. 45), Rosenberger and Gasko (2000). See also the motivation in Section 4.
Outline
In this paper, we propose to measure the risk excess of a risk distribution Q over a given risk distribution P by a hemimetric on the space of probability measures. Hemimetrics are a suitable tool for one-sided comparison of risks. When measuring the risk excess of Q compared to P, it is natural to associate a one-sided distance D_{+}(Q,P) on the space \((\mathcal {M}^{1}(E),\preceq)\) of probability measures, where ≼ is a given stochastic (pre)order (see the forthcoming Definition 3 in Section 2). The stochastic order ≼ is related to the ordering ≤ on the underlying space E. This allows a quantitative one-sided comparison of risks at the level of probability measures to be regarded as an extension of the order and distance structure on E.
We discuss several classes of risk excess measures D_{+}(Q,P) and consider the question when these are given as order extensions of hemidistances d_{+} on the underlying space E. Several relevant hemidistances are induced by mass transportation and thus admit a natural interpretation. One particular extension is given by a version of the Kantorovich–Rubinstein theorem for hemidistances. The paper develops basic tools and notions for measuring the one-sided risk excess of a risk distribution Q compared to P.
The paper is organized as follows: in Section 2, we introduce the notion of hemimetrics, which are basic for obtaining a quantitative description of one-sided distance in a preordered space (E,≤). The risk excess measure D_{+}(Q,P) of Q w.r.t. P is then introduced as a one-sided hemimetric on the space of probability measures \(\mathcal {M}^{1}(E)\). The ordering ≼ on \(\mathcal {M}^{1}(E)\) is chosen consistent with the preorder ≤ on E and describing a positive risk excess, i.e., Q≼P if Q has no positive risk excess w.r.t. P. We discuss several examples to describe the meaning of this notion and the interplay of order and distance.
In Section 3, we study several classes of interesting risk excess comparison measures and corresponding extension properties of the preorderings on the underlying space. A general class of risk comparison measures is introduced by considering worst-case comparison over suitable classes of increasing functions. This is analogous to the worst-case representation of convex and coherent risk measures. There are several classes of examples.
In Section 4, we describe risk excess measures D_{+}(X,Y) on the space of random variables. The class of compound risk excess measures is obtained for those measures which depend only on the joint law of the random elements (X,Y). Mass transportation gives a natural way to obtain minimal extensions of compound risk excess measures to risk excess measures in the space of distributions, i.e., which depend only on the marginal laws of X and Y. Dual representations of these risk excess measures are obtained by a version of the Kantorovich–Rubinstein theorem for hemimetrics. Several examples illustrate these constructions.
In Section 5, we introduce the concept of weak risk excess measure, which is a risk excess measure without the weak identity property. Similarly to Section 4, a mass transportation formulation gives a way to obtain weak risk excess measures as the maximal extension of compound risk excess measures. We also give a dual representation of this risk excess measure and introduce several examples of weak excess risk measures constructed from mass transportation problems.
Finally, in Section 6, we consider dependence restrictions on the class of risk pairs (X,Y) and consider maximal and minimal excess risks with these restrictions. These maximal and minimal excess risks do not define risk excess measures, but give relevant and well-motivated bounds. For one- and two-sided restrictions, we obtain explicit formulas for the bounds.
Hemimetrics and measuring risk excess
Hemimetrics
As a motivation for the introduction of measuring the risk excess of distributions, one could argue that, from the structural and phenomenological point of view, the concept of risk combines aspects of the metric structure (a risk measure evaluates some “size” or “norm” on the space of distributions) and of the order structure (there is an underlying preorder structure on the space of distributions which allows one to say when one risk is larger than another). Such a “quantitative measure of the order” is encapsulated in the notion of hemimetric, see Goubault-Larrecq (2013, Chap. 6, p. 203). (The terminology is not completely standard: the notion of hemimetric is also known as a pseudo-quasimetric in the topology literature, while Nachbin (1965, p. 61) calls it a semimetric.) We use the following definition:
Definition 1
(Hemimetric) A hemimetric or hemidistance d_{+} on a set E is a map \(d_{+}:E\times E\to \overline {\mathbb {R}}\) which satisfies the following axioms: for all x,y,z∈E,

positivity: d_{+}(x,y)≥0;

weak identity: x=y⇒d_{+}(x,y)=0;

triangle inequality: d_{+}(x,z)≤d_{+}(x,y)+d_{+}(y,z).
The main difference with the notion of metric is the omission of the symmetry condition and the assumption of only the weak identity property. To establish a connection with a preorder ≤ on E, we introduce the notion of a one-sided hemimetric.
Definition 2
(One-sided hemimetric) Let d_{+} be a hemimetric on a preordered set (E,≤). Then, d_{+} is called a one-sided hemimetric on (E,≤) if

x≤y⇔d_{+}(x,y)=0.
For two comparable elements, the one-sided hemimetric from a smaller element x to a larger element y is zero.
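As an illustration of Definitions 1 and 2, the sketch below checks the three hemimetric axioms and the one-sidedness condition by brute force on a finite sample of points. The helper names `is_hemimetric` and `is_one_sided` are hypothetical, and a check on a finite sample can only falsify the axioms, not prove them on all of E.

```python
from itertools import product

def is_hemimetric(points, d, tol=1e-12):
    """Brute-force check of Definition 1 on a finite sample of points."""
    for x in points:
        if abs(d(x, x)) > tol:                      # weak identity: d(x, x) = 0
            return False
    for x, y in product(points, repeat=2):
        if d(x, y) < 0:                             # positivity
            return False
    for x, y, z in product(points, repeat=3):
        if d(x, z) > d(x, y) + d(y, z) + tol:       # triangle inequality
            return False
    return True

def is_one_sided(points, d, leq, tol=1e-12):
    """Check x <= y  <=>  d(x, y) = 0 (Definition 2) on a finite sample."""
    return all(leq(x, y) == (abs(d(x, y)) <= tol)
               for x, y in product(points, repeat=2))

# Candidate on the real line: d(x, y) = max(x - y, 0).
def d_plus(x, y):
    return max(x - y, 0.0)

sample = [-2.0, -0.5, 0.0, 1.0, 3.5]

print(is_hemimetric(sample, d_plus))                      # axioms hold on the sample
print(is_one_sided(sample, d_plus, lambda x, y: x <= y))  # compatible with <=
print(d_plus(3.5, 1.0), d_plus(1.0, 3.5))                 # the two directions differ
```

The squared difference (x−y)^2 is a convenient counterexample: it is symmetric and vanishes on the diagonal, but fails the triangle inequality on this sample.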
Remark 1

If E is a set and d_{+} a hemimetric on E, one can endow E with a preorder structure by setting
$$ x\le y \Leftrightarrow d_{+}(x,y)=0. \tag{5} $$
Then, by construction of ≤, we obtain that d_{+} is a one-sided hemimetric on E.

Heminorms and hemimetrics:
When E has a vector space structure, a metric d can be induced in a natural way by a norm ρ, as d(x,y):=ρ(x−y). Similarly, a heminorm ρ_{+} on E (i.e., a subadditive, positively homogeneous, nonnegative functional \(\rho _{+}:E\to \overline {\mathbb {R}}\) satisfying the weak separation condition x=0_{ E }⇒ρ_{+}(x)=0) defines a hemimetric d_{+} by setting
$$ d_{+}(x,y):=\rho_{+}(x-y). \tag{6} $$
In addition, if E has a preorder ≤ and ρ_{+} is a heminorm which has the property that
$$ x\le 0_{E} \Leftrightarrow \rho_{+}(x)=0, \tag{7} $$
then d_{+} in (6) defines a one-sided hemimetric.
More generally, if (E,≤,ρ) is a lattice-ordered normed vector space, one can construct a one-sided hemimetric compatible with ≤ by setting
$$ d_{+}(x,y):=\rho((x-y)\vee 0_{E}), $$
where ∨ is the least upper bound operation.

To any hemimetric d_{+} on E, one can associate its dual hemimetric d_{−}, obtained by exchanging the two arguments of d_{+},
$$ d_{-}(x,y):=d_{+}(y,x). \tag{8} $$
When d_{+} is a one-sided hemimetric associated with the order ≤ on E, d_{−} is a one-sided hemimetric associated with the corresponding dual order ≥ on E.
A hemimetric d_{+} induces a distance d by symmetrization
$$d^{\infty}(x,y):=\max(d_{+}(x,y),d_{-}(x,y)),$$
or by taking a positive linear combination, say
$$d^{1}(x,y):=\alpha d_{+}(x,y)+\beta d_{-}(x,y), \quad \alpha,\beta >0.$$
More generally, a hemimetric allows defining a “one-sided” topology by setting the open balls as
$$ B^{+}(x,r):=\{y\in E: d_{+}(x,y)<r\}. \tag{9} $$
The concept of a hemimetric is implicit in several notions encountered in analysis, probability, and statistics. For example, recall that a real-valued function f on a metric space (E,d) is upper semicontinuous at x_{0} iff
$$ \forall \epsilon>0, \exists \delta>0, d(x,x_{0})\le\delta\Rightarrow d_{+}^{b}(f(x),f(x_{0}))\le \epsilon, $$
where \(d_{+}^{b}(x,y):=\rho _{+}(x-y)=\max (x-y,0)\) is the usual basic one-sided hemimetric on \((\mathbb {R},\le,|\cdot|)\) (see Example 3 and (13) below).
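The constructions of Remark 1 (dual hemimetric, the two symmetrizations, and one-sided balls) can be sketched on the real line with the basic hemimetric max(x−y, 0). The function names below are hypothetical. Note that the linear combination is symmetric in its arguments only when α=β, which is the case used here.

```python
def d_plus(x, y):
    """Basic one-sided hemimetric on the real line: max(x - y, 0)."""
    return max(x - y, 0.0)

def dual(d):
    """Dual hemimetric: exchange the two arguments."""
    return lambda x, y: d(y, x)

def sym_max(d):
    """Symmetrization by the maximum of d and its dual."""
    return lambda x, y: max(d(x, y), d(y, x))

def sym_sum(d, alpha=1.0, beta=1.0):
    """Positive linear combination of d and its dual (symmetric when alpha == beta)."""
    return lambda x, y: alpha * d(x, y) + beta * d(y, x)

def in_one_sided_ball(d, center, radius, y):
    """Membership in B^+(center, radius) = {y : d(center, y) < radius}."""
    return d(center, y) < radius

d_minus = dual(d_plus)
print(d_plus(2.0, 5.0), d_minus(2.0, 5.0))                   # one direction vanishes
print(sym_max(d_plus)(2.0, 5.0), sym_sum(d_plus)(2.0, 5.0))  # both recover |x - y|

# For d_plus the ball B^+(0, 1) is the upper ray (-1, infinity):
print(in_one_sided_ball(d_plus, 0.0, 1.0, 1000.0))   # far above the center, still inside
print(in_one_sided_ball(d_plus, 0.0, 1.0, -1.0))     # one unit below, outside
```

The last two lines show why the induced topology is called one-sided: the ball around 0 contains every point above 0 and only a thin layer below it.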
Risk excess measures
After the discussion of hemimetrics, we are now in a position to introduce the main object of this paper, which is a measure of the risk excess of a distribution Q w.r.t. P. To that aim, we assume that a preorder ≼ is defined on the set \(\mathcal {M}^{1}(E)\) of probability measures on a measurable space \((E,\mathcal {E})\): P≼Q describes that Q has more risk than P.
Definition 3
(Risk excess measure) A risk excess measure D_{+} is defined as a one-sided hemimetric on the preordered space \(\left (\mathcal {M}^{1}(E),{\preceq }\right)\) (or on a subset \(\mathcal {M}\subset \mathcal {M}^{1}(E)\)). D_{+}(Q,P) is called the risk excess of Q w.r.t. P.
We illustrate this concept with the following examples. A general class of risk excess measures will be presented in a systematic way in Section 3.
Example 1
(Stochastic ordering) On \( E=\mathbb {R}^{d}\), we consider the componentwise order ≤, which is closely connected with the stochastic order ≼_{ st }: for a measurable set B⊂E, define B^{↑}={y∈E; ∃x∈B s.t. y≥x} and say that B is an increasing set if B=B^{↑}. Denote by \(\mathcal {I}(E)\) the set of measurable increasing sets of E.
The stochastic order ≼_{ st } is defined on \(\mathcal {M}^{1}(\mathbb {R}^{d})\) by
$$ Q\preceq_{st} P \Leftrightarrow Q(B)\le P(B) $$
for all measurable sets \(B\in \mathcal {I}(E)\). A corresponding risk excess measure is given by
$$ D_{+}^{st}(Q,P):=\sup_{B\in\mathcal{I}(E)}\left(Q(B)-P(B)\right)_{+}. $$
There exists no risk excess of Q w.r.t. P, i.e.,
$$ D_{+}^{st}(Q,P)=0 \Leftrightarrow Q\preceq_{st} P. $$
By the well-known Strassen theorem (see Strassen (1965) and e.g., Rüschendorf (2013, Theorem 1.18, p. 22)), this is equivalent to the existence of random vectors X∼Q, Y∼P s.t. X≤Y a.s.
In other words, the distribution Q is considered safer than P if one can construct representations X of Q and Y of P s.t. all coordinates of X are lower than those of Y. Q has a positive risk excess w.r.t. P if some of the components of any representation X of Q exceed the corresponding components of any representation Y of P. Of course, this gives a very strict notion of no risk excess.
Example 2
(Lévy–Prokhorov) Let E be a space with a hemimetric d_{+}. Define a “one-sided” topology on E by setting the open balls as in (9). Let \(\mathcal {E}\) be the corresponding Borel σ-algebra. For two probability measures \(P,Q\in \mathcal {M}^{1}(E,\mathcal {E})\), define
$$ D_{+}^{LP}(Q,P):=\inf\left\{\epsilon>0: Q(A)\le P(A^{\epsilon})+\epsilon \text{ for all closed sets } A\right\}, \tag{11} $$
where A^{ε}:={x∈E:∃a∈A,d_{+}(a,x)<ε}=∪_{x∈A}B^{+}(x,ε). Then, \(D_{+}^{LP}\) is a one-sided risk excess measure and \(D_{+}^{LP}(Q,P)=0\) iff Q(A)≤P(A) for all \(A\in \mathcal {E}\).
One can replace A^{ε} by A^{ε]}:={x∈E:∃a∈A,d_{+}(a,x)≤ε}, and the open sets by the closed sets in the definition (11), see Dudley (1968), Dudley (1976, Sect. 8), Dudley (2002, Chap. 11.3). For the one-sidedness, if Q(A)≤P(A) for all \(A\in \mathcal {E}\), then, for every ε>0, Q(A)≤P(A)≤P(A^{ε})+ε, since A⊂A^{ε}. Hence, \(D_{+}^{LP}(Q,P)\le \epsilon \). Letting ε↓0 yields \(D_{+}^{LP}(Q,P)=0\). Conversely, if \(D_{+}^{LP}(Q,P)=0\), there exists a sequence ε_{ n }↓0 s.t. for all closed sets A, \(Q(A)\le P(A^{\epsilon _{n}})+\epsilon _{n}\). Since \(A^{\epsilon _{n}}\downarrow \overline {A}=A\), this yields Q(A)≤P(A) for all closed sets A. Hence, Q(A)≤P(A) also for all \(A\in \mathcal {E}\).
Examples of hemimetrics
Hemimetrics are suitable tools to measure one-sided distances. We illustrate the meaning of this notion and the interplay of order and distance via the following example, which will be used repeatedly throughout the paper.
Example 3
(Standard examples on (E,≤))

Discrete one-sided hemimetric:
Let (E,≤) be a preordered space, then
$$ d_{+}^{\le}(x, y)=\left\{\begin{array}{ll} 0&\text{if } x\le y\\ 1&\text{else} \end{array}\right. \tag{12} $$
defines a one-sided hemimetric on (E,≤), which we call the discrete one-sided hemimetric on (E,≤).

l^{p} hemimetric:
On \(E=\mathbb {R}^{1}\), one can decompose the absolute value into its positive and negative parts |x|=x^{+}+x^{−}=ρ_{+}(x)+ρ_{+}(−x), viz., into two heminorms satisfying (7). As a consequence of (6), the metric
$$|x-y|=d_{+}(x,y)+d_{+}(y,x)=d_{+}(x,y)+d_{-}(x,y)$$
is decomposed as a sum of two one-sided hemimetrics (d_{+},d_{−}) associated with the dual orders (≤,≥). The basic one-sided hemimetric
$$ d_{+}^{b}(x,y):=(x-y)_{+} \tag{13} $$
describes in a quantitative way the ordering relationship ≤. Compared to the discrete hemimetric (12), it also contains information on the magnitude of the one-sided departure of two elements.
Similarly on \((E,\le)=(\mathbb {R}^{d},\le)\) supplied with the componentwise (product) order
$$\mathbf{x\le y}\Leftrightarrow x_{i}\le y_{i}, 1\le i\le d,$$
the l^{p} heminorms, defined as
$$\begin{array}{@{}rcl@{}} l^{p}_{+}(\mathbf{x})&:=&\left(\sum_{i=1}^{d} \left(x_{i}^{+}\right)^{p}\right)^{1/p}, \quad 1\le p<\infty, \\ l_{+}^{\infty}(\mathbf{x})&:=&\max \left\{x_{i}^{+}\right\} \end{array} \tag{14} $$
induce the one-sided l^{p} hemimetrics
$$d_{+}^{p}(\mathbf{x, y}):=l^{p}_{+}(\mathbf{x-y}), \quad 1\le p\le \infty.$$
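The hemimetrics of Example 3 are easy to evaluate numerically. The following sketch (with hypothetical function names) computes the l^{p} heminorm of the positive parts and the induced one-sided hemimetric, and contrasts them with the discrete hemimetric and with a plain difference of Euclidean norms. The vectors are chosen so that the norms coincide while each vector exceeds the other in one component.

```python
import math

def lp_heminorm(x, p):
    """l^p heminorm: the l^p norm of the componentwise positive parts."""
    pos = [max(xi, 0.0) for xi in x]
    if math.isinf(p):
        return max(pos)
    return sum(v ** p for v in pos) ** (1.0 / p)

def lp_hemimetric(x, y, p):
    """One-sided l^p hemimetric: heminorm of the difference x - y."""
    return lp_heminorm([xi - yi for xi, yi in zip(x, y)], p)

def discrete_hemimetric(x, y):
    """Discrete one-sided hemimetric for the componentwise order."""
    return 0.0 if all(xi <= yi for xi, yi in zip(x, y)) else 1.0

x, y = (3.0, 0.0), (0.0, 3.0)
print(math.hypot(*x) - math.hypot(*y))        # equal norms: the difference sees nothing
for p in (1, 2, math.inf):
    print(p, lp_hemimetric(x, y, p), lp_hemimetric(y, x, p))   # excess in both directions
print(discrete_hemimetric(x, y))              # x is not below y

u, v = (1.0, 2.0), (2.0, 2.0)                 # u <= v componentwise
print(lp_hemimetric(u, v, 2), lp_hemimetric(v, u, 2))
```

For the ordered pair u ≤ v the hemimetric vanishes in one direction only, whereas for x and y it is positive in both directions even though the two norms are equal.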
Several of the hemimetrics have a direct interpretation and extensions as risk measures for probability distributions. We give two examples:
Example 4

τ-quantiles:
On the real line \(E=\mathbb {R}^{1}\), the heminorm
$$ \rho_{\tau}(x):=\tau x^{+}+(1-\tau)x^{-}=\tau x^{+}+(1-\tau)(-x)^{+},\quad 0<\tau <1 \tag{15} $$
induces, by Remark 1 and (6), a hemimetric
$$ d_{\tau}(x,y):=\rho_{\tau}(x-y). \tag{16} $$
It is well known that this hemimetric can be used to define τ-quantiles q_{ τ }(Y) (viz., the Value at Risk) of a random variable Y as a minimizer of E[ρ_{ τ }(Y−y)], i.e.,
$$ q_{\tau}(Y):=F^{-1}_{Y}(\tau)=\arg\inf_{y} E\left[\rho_{\tau}(Y-y)\right] \tag{17} $$
$$ =\arg\inf_{y} E[d_{\tau}(Y,y)]={VaR}_{\tau}(Y), \tag{18} $$
see Koenker (2005, p. 5). Note, however, that the order induced by d_{ τ } reduces to the trivial order =, as d_{ τ }(x,y)=0 iff x=y.

Halfspace depth, departure in direction u:
A multivariate generalization of the preceding example can be defined as follows. On \(E=\mathbb {R}^{d}\), we define for any unit vector u an ordering (the length in the direction u), by
$$ \mathbf{x}\le_{\mathbf{u}} \mathbf{y} \Leftrightarrow \mathbf{u}^{T} (\mathbf{y}-\mathbf{x})\ge 0, \tag{19} $$
where x^{T} denotes the transpose of x. With this ordering,
$$ d_{+}^{\mathbf{u}}(\mathbf{x},\mathbf{y})=\left\{\begin{array}{ll} 1&\text{if} \quad\mathbf{u}^{T} (\mathbf{y}-\mathbf{x})> 0\\ 0&\text{else} \end{array}\right. \tag{20} $$
defines, as in (12), a one-sided hemimetric. It is one if the length of y in direction u is greater than that of x, and is zero else.
This one-sided hemimetric has, as basic application, the definition of the halfspace depth function, which describes the degree of outlyingness of a point \(\mathbf {x}\in \mathbb {R}^{d}\) w.r.t. a probability measure P on \(\mathbb {R}^{d}\). It is defined as
$$\begin{array}{@{}rcl@{}} D_{+}(x,P)&:=&\inf_{\mathbf{u}\in S_{d-1}}\int d_{+}^{\mathbf{u}}(x,y)dP(y)\\ &=&\inf_{\mathbf{u}\in S_{d-1}}\int 1_{\{\mathbf{u}^{T}(\mathbf{y}-\mathbf{x})> 0\}}dP(y), \end{array} \tag{21} $$
where S_{d−1} is the unit sphere of \(\mathbb {R}^{d}\). Several modifications of this definition are useful to describe a one-sided degree of outlyingness (or risk) or quantitative versions of it. Two relevant examples are
$$ D_{+}^{1}(x,P):=\inf_{\mathbf{u}\in S^{+}_{d-1}}\int1_{\left\{\mathbf{u}^{T}(\mathbf{y}-\mathbf{x})> 0\right\}}dP(y), \tag{22} $$
or
$$ D_{+}^{2}(x,P):=\inf_{\mathbf{u}\in S^{+}_{d-1}}\int\left(\mathbf{u}^{T}(\mathbf{y}-\mathbf{x})\right)^{+}dP(y), $$
where \(S_{d-1}^{+}=S_{d-1}\cap \mathbb {R}^{d,+}\) is the part of the unit sphere in the positive cone x≥0. We mention that a very general approach to multivariate quantiles can be found in Faugeras and Rüschendorf (2017).
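The τ-quantile characterization in the first item of Example 4 can be checked on a small sample. The sketch below is an illustration under simplifying assumptions: the law of Y is the empirical distribution of a finite sample, and the minimizer is searched only among the sample points, which is enough because the objective is convex and piecewise linear with kinks at the sample points. All function names are hypothetical.

```python
def check_loss(x, tau):
    """Heminorm rho_tau(x) = tau * x^+ + (1 - tau) * (-x)^+."""
    return tau * max(x, 0.0) + (1.0 - tau) * max(-x, 0.0)

def expected_check_loss(sample, y, tau):
    """E[rho_tau(Y - y)] under the empirical distribution of the sample."""
    return sum(check_loss(v - y, tau) for v in sample) / len(sample)

def quantile_by_minimization(sample, tau):
    """Smallest sample point minimizing the expected check loss."""
    return min(sorted(sample), key=lambda y: expected_check_loss(sample, y, tau))

def empirical_quantile(sample, tau):
    """Generalized inverse inf{y : F(y) >= tau} of the empirical distribution function."""
    xs = sorted(sample)
    n = len(xs)
    for k, v in enumerate(xs, start=1):
        if k / n >= tau:
            return v
    return xs[-1]

sample = [2.0, 7.0, 1.0, 10.0, 4.0]
for tau in (0.1, 0.5, 0.9):
    print(tau, quantile_by_minimization(sample, tau), empirical_quantile(sample, tau))
```

In this sample the minimizer is unique for each of the three levels, so both functions return the same point. When τ is a multiple of 1/n the objective has a flat piece and any point of that piece is a minimizer.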
Finally, we briefly mention some examples of one-sided hemimetrics which may appear in related contexts.
Example 5

Schur order ≤_{ S } on \(\mathbb {R}^{d}\):
The majorization, or Schur order ≤_{ S }, is useful to compare vectors \(\mathbf {x,y}\in \mathbb {R}^{d}\) with identical sums w.r.t. their degree of dispersion, see e.g., Marshall et al. (2011). In a natural way, this ordering extends to an ordering on \(\mathcal {M}^{1}(\mathbb {R}^{d})\), comparing the relative degree of dispersion of two measures. Let \(\mathbf {x, y}\in \mathbb {R}^{d}\) and let Γ(d) be the set of permutations of {1,…,d}. The Schur ordering x≤_{ S }y on \(\mathbb {R}^{d}\) is defined by
$$\begin{array}{@{}rcl@{}} \sum_{k=l}^{d} x_{\gamma(k)}&\le& \sum_{k=l}^{d} y_{\beta(k)}, \quad l=2,\ldots, d,\\ \sum_{k=1}^{d} x_{\gamma(k)}&=& \sum_{k=1}^{d} y_{\beta(k)} \end{array} \tag{23} $$
where γ,β∈Γ(d) are the decreasing rearrangements of x and y:
$$ x_{\gamma(1)}\ge x_{\gamma(2)}\ge \ldots\ge x_{\gamma(d)}, \quad y_{\beta(1)}\ge y_{\beta(2)}\ge \ldots\ge y_{\beta(d)}. $$
≤_{ S } is a preorder: x≤_{ S }y and y≤_{ S }x only imply that the components of each vector are equal, but not necessarily in the same order. Geometrically, x≤_{ S }y if and only if x is in the convex hull of all vectors obtained by permuting the coordinates of y. When x,y stand for a pair of discrete probability measures on the same set of d points, the norming condition in (23) is satisfied as the sum is normalized to one.
Say that x and y are Schur-comparable if \( \sum _{i=1}^{d} x_{i}=\sum _{i=1}^{d} y_{i}\). The degree of dispersion is measured by the following one-sided hemimetric: for Schur-comparable elements x,y, define
$$d_{+}(\mathbf{x}, \mathbf{y}):=\sup_{l=2,\ldots, d}\left(\sum_{k=l}^{d} [x_{\gamma(k)}-y_{\beta(k)}]\right)_{+}.$$
One has, for Schur-comparable elements:
$$\mathbf{x}\le_{S} \mathbf{y} \text{ iff } d_{+}(\mathbf{x},\mathbf{y})=0.$$
Specialized to discrete probability measures, this gives a one-sided hemimetric measuring the degree of dispersion or “variance”.

One-sided Hausdorff hemimetric on closed subsets:
Let (E,d) be a metric space. Set
$$ d_{+}(A,B):=\sup_{y\in A}\,\inf_{x\in B}\ d(x,y). \tag{24} $$
Then, for closed sets A,B, it holds that d_{+}(A,B)=0⇔A⊂B, and d_{+} is a one-sided hemimetric on \((\mathcal {C}(E),\subset)\), the set of closed subsets of E.
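For finite (hence closed) subsets of a metric space, the one-sided Hausdorff hemimetric of the second item of Example 5 reduces to a max over min. The sketch below uses the real line with the absolute-value metric; the name `directed_hausdorff` is hypothetical and the sets are assumed nonempty.

```python
def directed_hausdorff(A, B, dist):
    """sup over y in A of inf over x in B of dist(x, y), for finite nonempty sets."""
    return max(min(dist(x, y) for x in B) for y in A)

def abs_dist(x, y):
    return abs(x - y)

A = {1.0, 2.0}
B = {1.0, 2.0, 5.0}          # A is a subset of B
C = {0.0}

print(directed_hausdorff(A, B, abs_dist))   # vanishes because A is contained in B
print(directed_hausdorff(B, A, abs_dist))   # the point 5 of B is far from A
# Taking the max of both directions gives the usual symmetric Hausdorff distance.
print(max(directed_hausdorff(A, B, abs_dist), directed_hausdorff(B, A, abs_dist)))
# One instance of the triangle inequality d(A, C) <= d(A, B) + d(B, C):
print(directed_hausdorff(A, C, abs_dist),
      directed_hausdorff(A, B, abs_dist) + directed_hausdorff(B, C, abs_dist))
```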
Risk excess measures induced by function classes
Motivation and definition
For a law-invariant, convex risk measure ρ on \(\mathcal {M}^{1}(\mathbb {R}^{d})\), one has a representation of the form
where X∼Q, \(\mathcal {A}\) is a class of scenario measures and α(ν) is a penalization term, see Föllmer and Schied (2002). This representation suggests considering, for a class \(\mathcal {F}\) of real functions on E, the following hemimetric
$$ D_{+}^{\mathcal{F}}(Q,P):=\sup_{f\in\mathcal{F}}\left(\int f\,dQ-\int f\,dP\right)_{+}. \tag{26} $$
Let \(\mathcal {M}^{\mathcal {F}}:=\{P\in \mathcal {M}^{1}(E):\sup _{f\in \mathcal {F}}\left (\int fdP\right)_{+}<\infty \}\) and define on \(\mathcal {M}^{\mathcal {F}}\) the preorder
$$ Q\preceq_{\mathcal{F}} P \Leftrightarrow \int f\,dQ\le\int f\,dP \text{ for all } f\in\mathcal{F}. $$
Then, \(D^{\mathcal {F}}_{+}\) is a risk excess measure on \(\left (\mathcal {M}^{\mathcal {F}},\preceq _{\mathcal {F}}\right)\).
Another motivation comes from the theory of probability metrics, where some metrics on the space of probability measures are defined by duality from a class of functions: \(D_{+}^{\mathcal {F}}\) in (26) is the natural one-sided analog of the probability metrics \(D^{\mathcal {F}}\) induced by a functional class \(\mathcal {F}\),
$$ D^{\mathcal{F}}(Q,P):=\sup_{f\in\mathcal{F}}\left|\int f\,dQ-\int f\,dP\right|, $$
which go under the name of probability metrics with a ζ-structure in Rachev (1991) or integral probability metrics in Müller (1997). We are thus naturally inclined to define:
Definition 4
(\(\mathcal {F}\)-induced risk excess measure) The risk excess measure \(D^{\mathcal {F}}_{+}\) on \(\left (\mathcal {M}^{\mathcal {F}},\preceq _{\mathcal {F}}\right)\) defined in (26) is called the \(\mathcal {F}\)-induced risk excess measure.
Example 6
Example 1 can be regarded as an \(\mathcal {F}\)-induced risk excess measure, by considering \(\mathcal {F}=\{1_{B}:B\in \mathcal {I}(E)\}\).
Remark 2
On a probability space \((\Omega,\mathcal B,\mu)\), let X be a random variable with image measure μ^{X}=Q. By (25), any law-invariant convex coherent risk measure ρ has a representation of the form \(D^{\mathcal {F}}_{+}(Q,\delta _{0})\) where \(\mathcal {F}=\left \{x\frac {d\nu ^{X}}{d\mu ^{X}}(x), \nu \in \mathcal {A}\right \}\), μ is an underlying measure dominating \(\mathcal {A}\), and μ^{X} and ν^{X} are the image measures of μ, ν under X. Indeed,
So the notion of risk excess measure can be seen as an extension of the notion of risk measures.
Extension and restrictions of orders and hemimetrics
For risk excess measures, an important aspect is to have a kind of consistency w.r.t. some ordering ≤ on E, i.e., \(\mathcal {F}\) consists of increasing functions w.r.t. ≤. In this respect, the following order extension properties are useful.
Proposition 1
(Extension and restriction of order)

If ≼ is a preorder on \(\mathcal {M}^{1}(E)\), then, the relation ≤_{ r }, defined, for x,y∈E, by
$$ x\le_{r} y \Leftrightarrow \delta_{x}\preceq\delta_{y}, \tag{28} $$
defines a preorder on E. ≤_{ r } is called the restriction of the preorder ≼ on \(\mathcal {M}^{1}(E)\).

Conversely, if ≤ is a preorder on E, then the stochastic order ≼_{ st } defines a partial order on \(\mathcal {M}^{1}(E)\), such that its restriction ≤_{ r } is identical to ≤.
Proof

The proof follows by direct verification.

By definition, we have
$$\begin{array}{@{}rcl@{}} x\le_{r} y &\Leftrightarrow&\delta_{x}\preceq_{st}\delta_{y} \Leftrightarrow 1_{B}(x)\le 1_{B}(y), \forall B\in\mathcal{I}(E)\\ &\Leftrightarrow& [x\in B\Rightarrow y\in B, \forall B\in\mathcal{I}(E)]. \end{array} \tag{29} $$
In particular, restricted to principal upsets B={z}^{↑}, the implication (29) becomes
$$x\ge z\Rightarrow y\ge z, \text{ for all } z\in E,$$
which is equivalent to x≤y. Therefore, x≤_{ r }y⇒x≤y. Conversely, if x≤y, (29) is satisfied, by definition of an upset.
□
Remark 3
For a closed partial order ≤ on a Polish space E, the result follows directly from Strassen theorem (see Example 1).
Analogously, we can also extend and restrict in a consistent way the discrete one-sided hemimetric \(d_{+}^{\le }\) of Example 3, Eq. (12) into the risk excess measure
$$ D_{+}^{st}(Q,P)=\sup_{B\in\mathcal{I}(E)}\left(Q(B)-P(B)\right)_{+} $$
of Example 1.
Proposition 2
(Extension and restriction of discrete hemimetrics)

If D_{+} is a risk excess measure on \(\left (\mathcal {M}^{1}(E),\preceq \right)\), then
$$d_{+}^{r}(x,y):=D_{+}(\delta_{x},\delta_{y}) $$
defines a one-sided hemimetric on (E,≤_{ r }), called the restriction of D_{+} on E.

If \(d_{+}^{\le }\) is the discrete hemimetric on (E,≤) of (12), then \(D_{+}^{st}\) is an extension of \(d_{+}^{\le }\) into a risk excess measure on \(\left (\mathcal {M}^{1}(E),\preceq _{st}\right)\) such that the restriction \(d_{+}^{r}\) of \(D_{+}^{st}\) is equal to \(d_{+}^{\le }\).
Proof

The proof follows by direct verification and Proposition 1.

The restriction of \(D_{+}^{st}\) to E writes
$$ d_{+}^{r}(x,y):=D_{+}^{st}(\delta_{x},\delta_{y})=\sup\{\left(1_{B}(x)-1_{B}(y)\right)_{+}; B\in\mathcal{I}(E)\}, $$
which is {0,1}-valued and a one-sided hemimetric on E by Proposition 2 part 1. By Proposition 1 part 2,
$$ d_{+}^{r}(x,y)=0\Leftrightarrow x\le_{r} y \Leftrightarrow x\le y. $$
Therefore, \(d_{+}^{r}(x,y)=1_{x\nleq y}=d^{\le }_{+}(x,y).\)
□
Remark 4
The construction of the previous proposition, based on the \(D_{+}^{st}\) of Example 1, which encodes the order ≤ into ≼_{ st }, is consistent w.r.t. the order ≤, in the sense that the restriction of \(D_{+}^{st}\) is the discrete one-sided hemimetric \(d_{+}^{r}=d^{\le }_{+}\), which encodes the original order ≤. However, for a one-sided hemimetric d_{+} on (E,≤) different from the discrete one, the extension \(D_{+}^{st}\) is in general inconsistent w.r.t. the hemimetric d_{+}, in the sense that the restriction of the risk excess measure \(D_{+}^{st}\) is not the original d_{+} but is again the discrete one-sided hemimetric \(d_{+}^{\le }\). This is illustrated in the following diagram:
The question of consistently extending/restricting a one-sided hemimetric d_{+} into a risk excess measure D_{+}, according to the diagram,
will be treated by mass transportation in Section 4.
It is interesting to observe that, in general, there may exist many extensions of a one-sided hemimetric on E to a risk excess measure on \(\mathcal {M}^{1}(E)\), as seen in the following example. We will discuss some general extensions in Section 4.
Example 7
(Positive orthant ordering) On \(E=\mathbb {R}^{d}\), consider the class \(\mathcal {F}_{uo}\) of upper orthant indicators,
$$ \mathcal{F}_{uo}:=\left\{1_{[\mathbf{z},\infty)}: \mathbf{z}\in\mathbb{R}^{d}\right\}. $$
\(\mathcal {F}_{uo}\) induces on \(\mathcal {M}^{1}(E)\) the upper orthant ordering ≼_{ uo } defined by
$$ Q\preceq_{uo} P \Leftrightarrow \overline{F}(\mathbf{z})\le \overline{G}(\mathbf{z}) \text{ for all } \mathbf{z}\in\mathbb{R}^{d}, $$
where \(\overline {F}(\mathbf {z})=Q([\mathbf {z},\infty))\) and \(\overline {G}(\mathbf {z})=P([\mathbf {z},\infty))\) stand for the survival functions of Q and P. So it is easier for Q to be less risky than P for this order than for the stochastic order, where the comparison has to be made for all increasing sets. The \(\mathcal {F}_{uo}\)-induced risk excess measure \(D_{+}^{\mathcal {F}_{uo}}\) is given by
$$ D_{+}^{\mathcal{F}_{uo}}(Q,P)=\sup_{\mathbf{z}\in\mathbb{R}^{d}}\left(\overline{F}(\mathbf{z})-\overline{G}(\mathbf{z})\right)_{+}. $$
Note that the restriction ≤_{ uo } on \(E=\mathbb {R}^{d}\) of the partial order ≼_{ uo } in the sense of Proposition 1 is identical to the usual componentwise ordering, i.e., ≤_{ uo }=≤. The restriction \(d^{uo}_{+}\) of the risk excess measure \(D_{+}^{{uo}}\) in the sense of Proposition 2 is the discrete one-sided hemimetric \(d_{+}^{\le }\) (see Example 3 and (12)):
$$ d_{+}^{uo}(\mathbf{x},\mathbf{y})=D_{+}^{uo}(\delta_{\mathbf{x}},\delta_{\mathbf{y}})=\sup_{\mathbf{z}\in\mathbb{R}^{d}}\left(1_{\{\mathbf{x}\ge \mathbf{z}\}}-1_{\{\mathbf{y}\ge \mathbf{z}\}}\right)_{+}=1_{\{\mathbf{x}\nleq \mathbf{y}\}}=d_{+}^{\le}(\mathbf{x},\mathbf{y}). $$
As a consequence, both risk excess measures \(D_{+}^{{uo}}\) and \(D_{+}^{{st}}\) of Example 1 induce the same componentwise ordering ≤ on \(E=\mathbb {R}^{d}\) and also induce the same restriction as hemimetric on E. \(D_{+}^{uo}\) and \(D_{+}^{st}\) are both extensions of the same discrete one-sided hemimetric \(d_{+}^{\le }\) on E from Example 3 (a), as is illustrated in the diagram below:
Example 8
(Increasing convex ordering) On \(E=\mathbb {R}\), consider the class of excess functions \(\mathcal {F}_{icx}:=\{\pi _{t}, t\in \mathbb {R}\}\), with π_{ t }(x):=(x−t)_{+}. Then, on the class of distributions \(\mathcal {M}^{1}_{1}\) with finite first moment, the induced ordering \(\preceq _{\mathcal {F}_{icx}}\) is identical to the increasing convex order,
$$ Q\preceq_{\mathcal{F}_{icx}} P \Leftrightarrow E(X-t)_{+}\le E(Y-t)_{+} \text{ for all } t\in\mathbb{R}. $$
For X∼Q and Y∼P in \(\mathcal {M}^{1}_{1}\), the generated risk excess measure \(D_{+}^{\mathcal {F}_{icx}}\) is given by
$$ D_{+}^{\mathcal{F}_{icx}}(Q,P)=\sup_{t\in\mathbb{R}}\left(\Pi_{X}(t)-\Pi_{Y}(t)\right)_{+}, $$
where Π_{ X }(t):=E(X−t)_{+}=Eπ_{ t }(X), Π_{ Y }(t):=E(Y−t)_{+}=Eπ_{ t }(Y) are the mean excess functions. \(D_{+}^{icx}\) measures the risk excess of Q w.r.t. P in terms of the corresponding mean excess functions. When restricted to the class of probability measures with identical first moments, \(\preceq _{\mathcal {F}_{icx}}\) is also identical to the convex ordering.
In this example, the restriction \(d_{+}^{icx}\) of \(D_{+}^{icx}\) is
$$ d_{+}^{icx}(x,y):=D_{+}^{icx}(\delta_{x},\delta_{y})=\sup_{t\in\mathbb{R}}\left(\pi_{t}(x)-\pi_{t}(y)\right)_{+}. $$
On the one hand,
$$ x\le y \Rightarrow \pi_{t}(x)\le \pi_{t}(y) \text{ for all } t\in\mathbb{R} \Rightarrow d_{+}^{icx}(x,y)=0. $$
On the other hand, if x>y, then \( d_{+}^{icx}(x,y)=\sup _{t\in \mathbb {R}} \left (\pi _{t}(x)-\pi _{t}(y)\right) \). By considering all cases, t≤y, y≤t≤x, and x≤t, one sees that the supremum takes the value x−y. Hence, the restriction \(d_{+}^{icx}\) of \(D_{+}^{icx}\) is given by
$$ d_{+}^{icx}(x,y)=(x-y)_{+}=d_{+}^{b}(x,y), $$
which is the basic one-sided hemimetric of (13).
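Examples 7 and 8 can be compared numerically on the real line. The sketch below assumes finitely supported distributions, stored as dictionaries mapping atoms to probabilities, and takes each risk excess to be the largest positive gap between the two survival functions (upper orthant case with d=1) or between the two mean excess functions (increasing convex case). For such distributions both gaps are piecewise constant or piecewise linear with breakpoints at the atoms, so it is enough to evaluate them at the pooled atoms. All function names are hypothetical.

```python
def survival(dist, z):
    """Mass of [z, infinity) under a finitely supported distribution {atom: prob}."""
    return sum(p for a, p in dist.items() if a >= z)

def mean_excess(dist, t):
    """E(X - t)_+ under a finitely supported distribution {atom: prob}."""
    return sum(p * max(a - t, 0.0) for a, p in dist.items())

def sup_positive_gap(functional, Q, P):
    """Largest value of (functional(Q, t) - functional(P, t))_+ over the pooled atoms."""
    atoms = sorted(set(Q) | set(P))
    return max(max(functional(Q, t) - functional(P, t), 0.0) for t in atoms)

def d_uo(Q, P):
    """Risk excess of Q w.r.t. P based on survival functions (d = 1)."""
    return sup_positive_gap(survival, Q, P)

def d_icx(Q, P):
    """Risk excess of Q w.r.t. P based on mean excess functions."""
    return sup_positive_gap(mean_excess, Q, P)

def dirac(x):
    return {x: 1.0}

# Same mean, but Q is more spread out than P.
Q = {0.0: 0.5, 4.0: 0.5}
P = {2.0: 1.0}
print(d_uo(Q, P), d_uo(P, Q))      # each law exceeds the other somewhere
print(d_icx(Q, P), d_icx(P, Q))    # only Q shows an excess in the icx sense

# Restrictions to point masses: discrete versus basic one-sided hemimetric.
print(d_uo(dirac(3.0), dirac(1.0)), d_icx(dirac(3.0), dirac(1.0)))
print(d_uo(dirac(1.0), dirac(3.0)), d_icx(dirac(1.0), dirac(3.0)))
```

On point masses the first measure only records whether x exceeds y, while the second also records by how much, in line with the restrictions computed in the two examples.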
Risk excess measures for random variables and minimal extension by mass transportation
Compound risk excess measures
So far we have considered risk excess measures as onesided hemimetrics on the space of probability distributions, i.e., as a mapping \(D_{+}:\mathcal {M}\times \mathcal {M}\mapsto [0,\infty ]\), for \(\mathcal {M}\subset \mathcal {M}^{1}(E)\), acting on a pair (Q,P) of probability measures on E. Like for risk measures \(\rho :\mathfrak {X}\mapsto \mathbb {R}\) defined on a space of random variables \(\mathfrak {X}\subset \mathfrak {L}^{0}_{E}=\mathfrak {L}^{0}_{E}(\Omega,\mathcal {A},\mu):=\{X:\Omega \to E\}\) (see e.g., Föllmer and Schied (2002)), it is natural to define risk excess measures \(D_{+}:\mathfrak {X}\times \mathfrak {X}\mapsto \mathbb {R}\), also on a space \(\mathfrak {X}\) of random variables.
This allows one to consider the risk of a random element X∈E as a relative property: there is a joint modeling of the vector \((X,Y)\in \mathfrak {X}^{2}\), defined on a common probability space \((\Omega,\mathcal { A}, \mu)\), so that the risk of X:Ω→E can be considered in relation to the random element Y:Ω→E, regarded as a benchmark. In the context of insurance and financial mathematics, Y can stand for the value of an alternative portfolio, of a hedge, of a market indicator, or the wealth of an insurer. For example, an insurer facing the prospect of losing a claim amount X may wish to evaluate its perceived risk with respect to its reserve capital Y: the “risk” X does not have the same potential consequences when Y is small as when Y is large compared to X. In the same vein, because of the fluctuating and (usually) inflating nature of fiat money in the current post-1973, petrodollar-based monetary system, one may be interested in evaluating the value of a financial asset X w.r.t. the price of a commodity Y considered as a standard, like gold or oil, whose supply is limited in essence.
For \(\mathfrak {X}\subset \mathfrak {L}^{0}_{E}=\mathfrak {L}^{0}_{E}(\Omega,\mathcal {A},\mu)\) a set of random variables on \((\Omega,\mathcal {A},\mu)\) with values in (E,≤), we consider the pointwise ordering on \(\mathfrak {X}\) induced by ≤. We identify random elements in \(\mathfrak {L}^{0}_{E}\) which are identical a.s.; accordingly, X≤Y means that X≤Y μ-a.s.
Definition 5
(Risk excess measure on \(\mathfrak {X}\)) For \(\mathfrak {X}\subset \mathfrak {L}^{0}_{E}\), a risk excess measure D_{+} on \(\mathfrak {X}\) is a one-sided hemimetric on \(\mathfrak {X}\).
Definition 6
(Compound risk excess measure on \(\mathfrak {X}\)) A risk excess measure \(D_{+}^{c}\) on \(\mathfrak {X}\) is called a compound risk excess measure on \(\mathfrak {X}\) if \(D_{+}^{c}(X,Y)\) depends only on the joint distribution μ^{(X,Y)} of (X,Y).
Example 9

An example of a risk excess measure on \(\mathfrak {X}\) which is not compound is
$$ D_{+}(X,Y):=\sup_{\omega\in \Omega}(X(\omega)-Y(\omega))_{+}. $$
However, since random elements in \(\mathfrak {L}^{0}_{E}\) which are identical μ-a.s. are identified, it is natural to consider only compound risk excess measures, e.g., the essential supremum version
$$ D_{+}(X,Y):=\text{esssup}_{\mu}(X-Y)_{+} $$
instead.

On \((\Omega,\mathcal {A},\mu)\), let \(A_{0}\in \mathcal {A}\), with 0<μ(A_{0})<1, be a class of scenarios considered as “low risk”, while its complement A_{1}:=Ω∖A_{0} is considered as “high risk”. Then, for some safety coefficient α>1,
$$ D_{+}(X,Y):=\text{esssup}_{\mu,A_{0}}(X-Y)_{+}+{\alpha}\, \text{esssup}_{\mu,A_{1}}(X-Y)_{+}, $$
with \(\text {esssup}_{\mu,A}(X-Y)_{+}:=\inf \{c\in \mathbb {R};\mu (\{(X-Y)_{+}\ge c\}\cap A)=0\}\), or
$$ D_{+}(X,Y):=\int_{A_{0}}(X-Y)_{+}d\mu+\alpha\int_{A_{1}}(X-Y)_{+}d\mu, $$
define non-compound risk excess measures, which weight the risk excess (X−Y)_{+} α times more in the high-risk scenarios than in the low-risk ones.
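To see concretely why the scenario-weighted integral is not compound, one can evaluate it on a small finite probability space. The sketch below is only an illustration: the two-scenario space, the dictionaries and the function name `weighted_excess` are hypothetical and not taken from the paper.

```python
def weighted_excess(x, y, prob, low_risk, alpha):
    """Integral form of the scenario-weighted excess on a finite space.

    x, y     : dicts mapping each scenario to the value of X and Y
    prob     : dict mapping each scenario to its probability
    low_risk : set of scenarios playing the role of A_0
    alpha    : safety coefficient applied on the complement A_1
    """
    total = 0.0
    for scenario, p in prob.items():
        weight = 1.0 if scenario in low_risk else alpha
        total += weight * p * max(x[scenario] - y[scenario], 0.0)
    return total


# Hypothetical two-scenario space: "a" is low risk, "b" is high risk.
prob = {"a": 0.5, "b": 0.5}
low_risk = {"a"}
zero = {"a": 0.0, "b": 0.0}
x1 = {"a": 1.0, "b": 0.0}  # excess occurs in the low-risk scenario
x2 = {"a": 0.0, "b": 1.0}  # same law as x1, excess in the high-risk scenario

d1 = weighted_excess(x1, zero, prob, low_risk, alpha=2.0)
d2 = weighted_excess(x2, zero, prob, low_risk, alpha=2.0)
```

The pairs (x1, zero) and (x2, zero) have the same joint distribution, yet d1 and d2 differ as soon as α≠1, so the value cannot be a function of the joint law alone.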
Remark 5

The notation \(D_{+}^{c}\) in Definition 6 stresses that \(D_{+}^{c}\) depends on the joint distribution μ^{(X,Y)} and not solely on the marginals μ^{X},μ^{Y} of (X,Y), as is the case in Definition 3. See also Zolotarev (1997) and Rachev (1991) for the similar notion of compound probability metric. For risk measures ρ(X) on \(\mathfrak X\), there is the analogous notion of law-invariant risk measures, which depend only on the law μ^{X} of the random variable.

There are two main reasons why compound risk excess measures on \(\mathfrak {X}\) are of particular importance. Firstly, they allow one to define extensions as risk excess measures \(D_{+}:\mathcal {M}\times \mathcal {M}\to [0,\infty ]\) on subclasses \(\mathcal {M}\subset \mathcal {M}^{1}(E)\) defined by the induced set of distributions of elements of \(\mathfrak {X}\) (see Section 4.3). Secondly, the fact that they depend only on the joint distribution μ^{(X,Y)} makes it possible to estimate the risk excess D_{+}(X,Y) statistically by its empirical analog. This property is most relevant for the application of risk excess measures.

Like in the case of probability metrics, it is also possible to describe compound risk excess measures formally on the subclass \(\mathcal {M}^{(2)}\) of bivariate laws μ^{(X,Y)} for \(X,Y\in \mathfrak {X}\). For details in the case of probability metrics, see Rachev (1991).
Construction of a compound risk excess measure from a onesided hemimetric d _{+} on E
There is a natural way to construct such a compound risk excess measure on a set \(\mathfrak {X}\) of r.v. in (E,≤): let d_{+} be a one-sided hemimetric on (E,≤), and let \(\mathfrak {X}\) be the set of random variables X s.t. there exist x,y∈E s.t. Ed_{+}(X,x)<∞ and Ed_{+}(y,X)<∞. The excess risk of X w.r.t. Y is measured by the random quantity d_{+}(X,Y). The latter can be turned into a deterministic value, e.g., by taking its expectation, so that one obtains a hemimetric on \(\mathfrak {X}\),
$$ D_{+}^{c}(X,Y):=Ed_{+}(X,Y). $$(31)
Note that (31) depends only on the joint distribution of (X,Y): it is indeed a compound risk excess measure defined on a space \(\mathfrak {X}\) of random variables.
Indeed, one has:
Proposition 3
For any measurable onesided hemimetric d_{+} on (E,≤), (31) defines a finite onesided compound risk excess measure on \(\mathfrak {X}\).
Proof
For all \(X,Y\in \mathfrak {X}\), there exist x,y∈E s.t. Ed_{+}(X,x)<∞ and Ed_{+}(y,Y)<∞. Hence, by the triangle inequality,
$$ Ed_{+}(X,Y)\le Ed_{+}(X,x)+d_{+}(x,y)+Ed_{+}(y,Y)<\infty. $$
Equation (31) is therefore well defined and is obviously a compound risk excess measure. For the one-sidedness property, X≤Y a.s. ⇔ d_{+}(X,Y)=0 a.s. \(\Leftrightarrow D_{+}^{c}(X,Y)=0\) follows from the one-sidedness and nonnegativity of d_{+}. □
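As a small illustration of the fact that the expectation Ed_{+}(X,Y) depends on the joint law and not only on the marginals, the following Python sketch evaluates it for two couplings of the same pair of marginals. The dictionaries and function names are hypothetical; d_{+} is taken to be the basic hemimetric (x−y)_{+} on the real line.

```python
def d_plus(x, y):
    """Basic one-sided hemimetric (x - y)_+ on the real line."""
    return max(x - y, 0)


def compound_excess(joint, d=d_plus):
    """E d_+(X, Y) for a finite joint law given as {(x, y): probability}."""
    return sum(p * d(x, y) for (x, y), p in joint.items())


def marginals(joint):
    """Marginal laws of X and Y, each as {value: probability}."""
    law_x, law_y = {}, {}
    for (x, y), p in joint.items():
        law_x[x] = law_x.get(x, 0) + p
        law_y[y] = law_y.get(y, 0) + p
    return law_x, law_y


# Two couplings of the same marginals (both uniform on {0, 1}).
comonotone = {(0, 0): 0.5, (1, 1): 0.5}
independent = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}

if __name__ == "__main__":
    print(compound_excess(comonotone), compound_excess(independent))
```

Under the comonotone coupling X≤Y holds almost surely and the excess vanishes, whereas the independent coupling yields a strictly positive value.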
Remark 6
Formula (31) gives a natural way to obtain a compound excess risk measure from a onesided hemimetric d_{+} on the ambient space E. Note that not all compound excess risk measures can be written in this form. For example, let (d_{+,i})_{i∈I} be a countable family of onesided hemimetrics on E, then
defines a compound excess risk measure which can not be written as in (31) for some d_{+}.
Minimal extension of a compound risk excess measure
A compound risk excess measure \(D_{+}^{c}\), depending on the joint distribution μ^{(X,Y)}, can be turned by mass transportation into a risk excess measure on \(\mathcal {M}^{1}(E)\), i.e., depending only on the pair of marginals μ^{X},μ^{Y}, where \(\mathcal {M}^{1}(E)\) is supplied with the stochastic ordering ≼_{ st } consistent with the underlying order ≤ on \(\mathfrak {X}\).
Definition 7
Let \(D_{+}^{c}\) be a compound risk excess measure. The minimal extension \(D^{inf}_{+}\) on \(\mathcal {M}^{1}(E)\) of \(D_{+}^{c}\) by mass transportation is given by
$$ D_{+}^{inf}(Q,P):=\inf\left\{D_{+}^{c}(X,Y);\ X\sim Q,\ Y\sim P\right\}. $$(32)
The fact that \(D_{+}^{inf}\) is indeed a onesided risk excess measure on the space of probability measures is given in the following proposition:
Proposition 4

If (E,≤) is a Polish space with a closed partial order, and if \(D_{+}^{c}\) is weakly lower semicontinuous, in the sense that
$$ (X_{n},Y_{n})\stackrel{d}{\to}(X,Y)\Rightarrow D_{+}^{c}(X,Y)\le \liminf D_{+}^{c}(X_{n},Y_{n}), $$(33)
then \(D^{inf}_{+}\) is a one-sided risk excess measure on \((\mathcal {M}^{1}(E),\preceq _{st})\), where ≼_{ st } is the stochastic order.

If \(D_{+}^{c}(X,Y)={Ed}_{+}(X,Y)\), as in (31), for d_{+} a lower semicontinuous one-sided hemimetric on (E,≤), then \(D^{inf}_{+}\) is a one-sided risk excess measure on \(\left (\mathcal {M}^{1}(E),\preceq _{st}\right)\).
Proof

(A1) is obvious. (A2) follows from the fact that \(D_{+}^{c}\) satisfies (A2): for X∼Q, \(0 \le D^{inf}_{+}(Q,Q)\le D^{c}_{+}(X,X)=0\). Regarding (A3): for \((\Omega,\mathcal {A},\mu)\) a nonatomic probability space and E a Polish space, any bivariate measure \(\alpha \in \mathcal {M}^{1}(E^{2})\) can be obtained as the image measure of μ by some measurable mapping, see e.g., Berkes and Philipp (1979). Therefore, for all ε>0, there exists random variables (X,Y_{1})∼α=α_{ QP }, where \(\alpha \in \mathcal {M}^{1}\left (E^{2}\right)\) has marginals Q,P and there exists random variables (Y_{2},Z)∼β=β_{ PR } with marginals P,R s.t.
$$ D_{+}^{inf}(Q,P)+\frac{\epsilon}{2}\ge D_{+}^{c}(X,Y_{1}),\quad \text{and }D_{+}^{inf}(P,R)+\frac{\epsilon}{2}\ge D_{+}^{c}(Y_{2},Z). $$By the gluing lemma, see e.g., Villani (2003) in p. 208, there exists a trivariate measure γ=γ_{ QPR } s.t. its projection on the first two marginals is α and its projection on the last two marginals is β. In addition, γ can be obtained as the image measure of μ for some measurable mapping. In other words, there exists a joint construction of a random vector \((\tilde X,\tilde Y,\tilde Z)\) on the probability space \((\Omega,\mathcal {A},\mu)\) s.t. \(\mu ^{\tilde X,\tilde Y,\tilde Z}=\gamma \) and
$$ D_{+}^{inf}(Q,P)+\frac{\epsilon}{2}\ge D_{+}^{c}\left(\mu^{\tilde X,\tilde Y}\right),\quad \text{and }D_{+}^{inf}(P,R)+\frac{\epsilon}{2}\ge D_{+}^{c}\left(\mu^{\tilde Y,\tilde Z}\right). $$(34)By (A3) for the compound risk excess \(D_{+}^{c}\),
$$ D_{+}^{c}\left(\mu^{\tilde X\tilde Z}\right)\le D_{+}^{c}\left(\mu^{\tilde X\tilde Y}\right)+D_{+}^{c}\left(\mu^{\tilde Y\tilde Z}\right) $$which gives with (34),
$$ D_{+}^{inf}(Q,R)\le D_{+}^{c}\left(\mu^{\tilde X\tilde Z}\right)\le D_{+}^{inf}(Q,P)+D_{+}^{inf}(P,R)+ \epsilon. $$Letting ε↓0 gives (A3) for \(D_{+}^{inf}\).
For the onesidedness property (A4), if \(D^{inf}_{+}(Q,P)=0\), then there exists a sequence (X_{ n },Y_{ n }) of random variables on \((\Omega,\mathcal {A}, \mu)\), all with fixed marginals Q,P, s.t. \(D_{+}^{c}(X_{n}, Y_{n})\to 0\). Since \(\mathcal {M}^{1}(Q,P)\) the set of probability measures on E×E with marginals Q,P is weakly compact in \(\mathcal {M}^{1}\left (E^{2}\right)\), one can extract a subsequence n^{′} s.t. \(\phantom {\dot {i}\!}(X_{n'},Y_{n'})\stackrel {d}{\to }(X,Y)\) for some (X,Y) with marginals Q,P. By the assumption on \(D_{+}^{c}\),
$$ D_{+}^{c}(X,Y) \le \liminf D_{+}^{c}(X_{n},Y_{n})=0 $$which entails X≤Y, μa.s. by (A4’). The latter is equivalent to Q≼_{ st }P by Strassen theorem (see Theorem 1.18 in Rüschendorf (2013)). The converse is obvious.

If \((X_{n},Y_{n})\stackrel{d}{\to}(X,Y)\), then by Skorohod’s representation theorem there exist \((\tilde X_{n},\tilde Y_{n})\stackrel {a.s.}{\to }(\tilde X,\tilde Y)\), with \((\tilde X_{n},\tilde Y_{n})\stackrel {d}{=}(X_{n}, Y_{n})\), \((\tilde X,\tilde Y)\stackrel {d}{=}(X, Y)\). Therefore, lower semicontinuity of d_{+} and Fatou’s lemma entail
$$\begin{array}{@{}rcl@{}} D_{+}^{c}(X, Y)&=&{Ed}_{+}(\tilde X,\tilde Y)\le E[\liminf d_{+}(\tilde X_{n},\tilde Y_{n})]\\ &\le& \liminf {Ed}_{+}(\tilde X_{n},\tilde Y_{n}) =\liminf D_{+}^{c}(X_{n},Y_{n}), \end{array} $$i.e., (33) is satisfied.
□
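For two empirical laws with the same number of equally weighted atoms, the infimum over couplings can be computed by brute force, since it is attained at a permutation coupling (Birkhoff–von Neumann). The sketch below uses this to check the identity and triangle axioms numerically for the discrete hemimetric on the plane with the componentwise order. The atoms and function names are hypothetical, and the enumeration is only practical for very small samples.

```python
from itertools import permutations


def d_discrete(x, y):
    """Discrete hemimetric for the componentwise order: 0 if x <= y, else 1."""
    return 0 if all(a <= b for a, b in zip(x, y)) else 1


def minimal_extension(xs, ys, d=d_discrete):
    """Brute-force infimum of E d_+(X, Y) over couplings of two empirical laws.

    xs and ys are equal-length lists of atoms, each atom carrying mass 1/n.
    For such laws the infimum over all couplings is attained at a permutation
    coupling, so enumerating permutations suffices.
    """
    n = len(xs)
    if n != len(ys):
        raise ValueError("both samples must have the same size")
    return min(
        sum(d(x, ys[j]) for x, j in zip(xs, perm)) / n
        for perm in permutations(range(n))
    )


# Hypothetical atoms in the plane.
q = [(0, 0), (1, 2), (2, 1)]
p = [(1, 1), (2, 2), (0, 3)]
r = [(0, 0), (0, 1), (1, 0)]

if __name__ == "__main__":
    print(minimal_extension(q, q), minimal_extension(q, p), minimal_extension(p, r))
```

In this toy example the atoms (1,2) and (2,1) of q can only be dominated by the atom (2,2) of p, so at least one third of the mass must be transported to a non-dominating point.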
Dual representations of minimal extensions
Define L^{1}:=L^{1}({P,Q}) as the set of functions \(f:E\to \mathbb {R}\) integrable w.r.t. P and Q, C_{ b } as the set of bounded continuous functions \(f:E\to \mathbb {R}\), and Lip^{1}=Lip^{1}(E,d_{+}) as the set of 1-Lipschitz functions \(f:E\to \mathbb {R}\) w.r.t. d_{+}, i.e., s.t. for all x,y∈E,
$$ f(x)-f(y)\le d_{+}(x,y) $$
holds. Note that for f∈Lip^{1}(E,d_{+}) and y≤x, we have f(y)−f(x)≤d_{+}(y,x)=0, i.e., f is increasing w.r.t. the order induced by d_{+} on E. Hence, Lip^{1}(E,d_{+}) is a subset of the set of increasing functions.
For a compound risk excess measure \(D_{+}^{c}\) of the kind in (31), the minimal extension \(D_{+}^{inf}\) on \(\mathcal {M}^{1}(E)\) of \(D_{+}^{c}\) by mass transportation, as in (32), admits a representation as an \(\mathcal {F}\)-induced risk excess measure, as in (26), which is given by the following Kantorovich–Rubinstein-type theorem for hemimetrics:
Theorem 1
(Kantorovich–Rubinstein theorem for minimal risk excess measure) On a Polish space E, supplied with a closed order ≤ and a lower semicontinuous one-sided hemimetric d_{+}, the minimal extension \(D_{+}^{inf}\) of the compound risk excess measure \(D^{c}_{+}(X,Y)={Ed}_{+}(X,Y)\) has the dual form
$$ D_{+}^{inf}(Q,P)=\sup_{f\in Lip^{1}_{b}}\left(\int f\,d(Q-P)\right)_{+}. $$
In other words, \(D^{inf}_{+}\) is identical to an \(\mathcal {F}\)-induced risk excess measure \(D_{+}^{\mathcal {F}}\) of (26), with \(\mathcal {F}=Lip^{1}_{b}\), the class of bounded Lipschitz functions w.r.t. d_{+}.
Proof
The proof is similar to the method used to prove the Kantorovich–Rubinstein theorem for metric spaces, see e.g., Rachev and Rüschendorf (1998), Villani (2003), with some slight modifications. Let \(\mathcal {M}^{1}(Q,P)\) be the set of probability measures π on E×E with marginals Q,P. For (f,g)∈L_{1}(Q)×L_{1}(P), set
$$ J(f,g):=\int f\,dQ+\int g\,dP. $$
Let
$$ \Phi_{d_{+}}:=\left\{(f,g)\in L_{1}(Q)\times L_{1}(P);\ f(x)+g(y)\le d_{+}(x,y),\ (x,y)\in E^{2}\right\} $$
and \(\mathcal {C}_{b}^{2}\) be the set of pairs of realvalued functions (f,g) which are continuous and bounded. Set

Step one: One has the easy inequality,
$$ D_{+}^{Lip^{1}\cap L^{1}}(Q,P)\le D_{+}^{inf}(Q,P). $$(37)
Indeed, for all f∈Lip^{1}(d_{+})∩L^{1} and \(\pi \in \mathcal {M}^{1}(Q,P)\),
$$\begin{array}{@{}rcl@{}} \left(\int f(x)Q(dx)-\int f(y)P(dy)\right)_{+}&=&\left(\int (f(x)-f(y))\pi(dx,dy)\right)_{+}\\ &\le& \int d_{+}(x,y) \pi(dx,dy). \end{array} $$
Taking the inf on the right and the sup on the left entails the stated inequality (37).

Step two: Kantorovich’s duality, \(D^{inf}_{+}(Q,P)= S(Q,P)=\sup _{\Phi _{d_{+}}}J(f,g)\).
Since d_{+}≥0 is l.s.c., this follows from Rachev and Rüschendorf (1998) in Theorem 2.3.1 (b) or Villani (2003) in Theorem 1.3.

Step three: in view of the first two steps, it remains to show that
$$D_{+}^{Lip^{1}\cap L_{1}(Q)}(Q,P)\ge D_{+}^{inf}(Q,P),$$
i.e., that
$$\sup_{f\in Lip^{1}\cap L_{1}(Q)} \left(\int fd(Q-P)\right)_{+}\ge \sup_{\Phi_{d_{+}}}J(f,g).$$
Assume first that d_{+} is bounded.
For f continuous and bounded, define the d_{+}-convex conjugate of f by
$$ f^{*}(y):=\inf_{x\in E}\{d_{+}(x,y)-f(x)\}. $$
One obviously has f(x)+f^{∗}(y)≤d_{+}(x,y), for all x,y∈E. Therefore, if x↦d_{+}(x,y) is bounded l.s.c. and f∈C_{ b }, then f^{∗} is well defined and bounded.
Moreover, by the triangle inequality, one also has
$$d_{+}(x,y)-f(x)\le d_{+}(x,y')+d_{+}(y',y)-f(x).$$
Taking the infimum over x on both sides yields
$$ f^{*}(y)-f^{*}(y')\le d_{+}(y',y)=d_{-}(y,y'), $$
where d_{−} is the opposite dual hemimetric defined in (8): f^{∗} is d_{−}-Lipschitz.
Note that if f(x)+g(y)≤d_{+}(x,y) for all x,y, then f^{∗}(y)≥g(y).
Define the double conjugate by
$$\begin{array}{@{}rcl@{}} f^{**}(x)&:=&\inf_{y\in E}\{d_{+}(x,y)-f^{*}(y)\}. \end{array} $$
One has f^{∗∗}(x)≥f(x): by definition,
$$\begin{array}{@{}rcl@{}} f^{**}(x)&=&\inf_{y\in E}\sup_{x'}\left\{d_{+}(x,y)-d_{+}(x',y)+f(x')\right\}\\ &\ge& f(x), \end{array} $$
by taking x^{′}=x in the supremum.
Moreover, f^{∗∗} is this time d_{+}-Lipschitz: the triangle inequality d_{+}(x,y)−f^{∗}(y)≤d_{+}(x,x^{′})+d_{+}(x^{′},y)−f^{∗}(y) yields, by taking the infimum over y, f^{∗∗}(x)−f^{∗∗}(x^{′})≤d_{+}(x,x^{′}).
We obtain: f^{∗∗}(x)= inf_{y}{d_{+}(x,y)−f^{∗}(y)}≤−f^{∗}(x) by taking y=x. On the other hand, since f^{∗} is 1-Lipschitz w.r.t. d_{−}, one has
$$-f^{*}(x)\le d_{+}(x,y)-f^{*}(y),$$
which yields −f^{∗}(x)≤f^{∗∗}(x). Hence, f^{∗∗}=−f^{∗}.
Denoting ϕ:=−f^{∗}, and since f^{∗} is d_{−}-Lipschitz, ϕ is d_{+}-Lipschitz (and bounded, thus integrable). In view of all of the above, \((f,g)\in \Phi _{d_{+}}\cap \mathcal {C}_{b}^{2}\) implies \((f^{**},f^{*})\in \Phi _{d_{+}}\) and J(f,g)≤J(f^{∗∗},f^{∗})=J(ϕ,−ϕ). Hence,
$$ \sup_{\Phi_{d_{+}}\cap \mathcal{C}_{b}^{2}} J(f,g)\le \sup_{\phi\in Lip^{1}\cap L_{1}(Q)} J(\phi,-\phi)\le \sup_{\phi\in Lip^{1}\cap L_{1}(Q)} \left(\int \phi d(Q-P)\right)_{+}, $$(38)
which had to be proved.
Combining (37) with (38), yields the desired result for the case of a bounded hemimetric d_{+}.

Step 4: One can remove the assumption that d_{+} is bounded. For d_{+} a general l.s.c. hemimetric, one can reason as in Villani (2003) in Theorem 1.3, step 3 with \(d^{n}_{+}=d_{+}/(1+n^{-1}d_{+})\), so that \(0\le d_{+}^{n}\le d_{+}\) and \(d^{n}_{+}\uparrow d_{+}\) pointwise.
□
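The conjugation argument of Step three can be illustrated on a finite set of points, where the infima become minima. The following sketch is a toy check under stated assumptions: E is replaced by a small integer grid, d_{+} is the basic hemimetric (x−y)_{+}, f is an arbitrary integer-valued function, and all names are hypothetical. Integer arithmetic keeps the comparisons exact.

```python
def d_plus(x, y):
    """Basic one-sided hemimetric (x - y)_+ on the integers."""
    return max(x - y, 0)


def conjugate(f, points, d=d_plus):
    """f*(y) = min over x of d_+(x, y) - f(x), on a finite set of points."""
    return {y: min(d(x, y) - f[x] for x in points) for y in points}


def double_conjugate(f, points, d=d_plus):
    """f**(x) = min over y of d_+(x, y) - f*(y)."""
    f_star = conjugate(f, points, d)
    return {x: min(d(x, y) - f_star[y] for y in points) for x in points}


# Toy data: a grid of integers and an arbitrary bounded function on it.
points = list(range(-3, 4))
f = {x: (x * x) % 5 - 2 for x in points}

f_star = conjugate(f, points)
f_2star = double_conjugate(f, points)

if __name__ == "__main__":
    for x in points:
        print(x, f[x], f_star[x], f_2star[x])
```

On this grid one can verify the four facts used in the proof: f^{∗∗}≥f, the pair (f^{∗∗},f^{∗}) satisfies the constraint f^{∗∗}(x)+f^{∗}(y)≤d_{+}(x,y), f^{∗∗} is d_{+}-Lipschitz, and f^{∗∗}=−f^{∗}.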
Remark 7
The dual formulation of Theorem 1 gives another proof of the second part of Proposition 4, since the set of increasing bounded Lipschitz functions generates the stochastic order (see the argument in Example 8).
Examples of minimal risk excess measures
The following propositions give explicit representations of the minimal risk excess measure for several hemimetrics. We first consider the discrete hemimetric \(d_{+}^{\le }\):
Proposition 5
(Minimal risk excess measure arising from the stochastic order)

Let \(E=\mathbb {R}^{d}\) be supplied with the (closed) componentwise order ≤. The discrete hemimetric \(d_{+}^{\le }\) of (12) generates, via Proposition 3, the compound risk excess measure
$$ D_{+}^{c}(X, Y)=\mu(X\nleq Y). $$(39)
This induces, as minimal extension by mass transportation on \(\mathcal {M}^{1}(\mathbb {R}^{d})\), the stochastic ordering one-sided risk excess measure of (10):
$$ D_{+}^{inf}(Q,P)=D_{+}^{st}(Q,P). $$(40)
A dual representation of (40) is given by
$$ D_{+}^{inf}(Q,P)=\sup_{f\uparrow, 0\le f\le 1}\left(\int f d(Q-P)\right)_{+}. $$(41)
Proof

Since ≤ is a closed order, \(C:=\{(x,y)\in E\times E, x\nleq y\}\) is an open set and \(d_{+}^{\le }(x,y)=1_{C}(x,y)\) is a {0,1}-valued l.s.c. function. By Kellerer (1984) and Rüschendorf (1986) in Lemma 1 (see also Villani (2003) in Theorem 1.27),
$$ D_{+}^{inf}(Q,P)=\sup\left\{Q(A)-P\left(A^{C}\right), A\subset E, A \text{ closed}\right\}, $$
where A^{C}:={y∈E,∃x∈A,(x,y)∉C}={y∈E,∃x∈A,x≤y}=A^{↑}. Since A⊂A^{↑},
$$\begin{array}{@{}rcl@{}} D_{+}^{inf}(Q,P)&=&\sup\left\{Q(A)-P\left(A^{\uparrow}\right), A\subset E, A \text{ closed}\right\}\\ &=&\sup\left\{(Q(A)-P(A))_{+}, A\in \mathcal{I}(E), A \text{ closed}\right\}=D_{+}^{st}(Q,P). \end{array} $$
By Kantorovich–Rubinstein Theorem 1,
$$\begin{array}{@{}rcl@{}} D_{+}^{inf}(Q,P)&=&\sup_{f\in Lip^{1}(\mathbb{R}^{d},d_{+})}\left(\int f d(Q-P)\right)_{+}\\ &=&\sup_{f\uparrow, 0\le f\le 1}\left(\int f d(Q-P)\right)_{+}. \end{array} $$(42)
Note that one can restrict to the set of increasing functions such that 0≤f≤1 by shifting the function by a constant.
□
Next, we consider, for \(E=\mathbb {R}\), the basic one-sided hemimetric \(d_{+}^{b}(x,y)=(x-y)_{+}\), introduced in (13), describing the magnitude of one-sided departure in a quantitative way. For \(\mathfrak {X}=L^{1}(\mu)\) the set of random variables on \((\Omega,\mathcal {A},\mu)\) with finite first moment, \(d_{+}^{b}\) induces the compound one-sided risk excess measure
$$ D_{+}^{c}(X,Y):=E(X-Y)_{+} $$(43)
on \(\mathfrak {X}\). The corresponding minimal risk excess is given in the following result:
Proposition 6
(Minimal risk excess arising from mean exceedance)

The minimal extension of (43) to a risk excess measure on \(\mathcal {M}^{1}(\mathbb {R})\) by mass transportation is given by
$$\begin{array}{@{}rcl@{}} D_{+}^{inf}(Q,P)&=&\inf_{X\sim Q, Y\sim P}E(X-Y)_{+}\\ &=&\sup_{f\in Lip^{1}, f\uparrow} \left(\int fd(Q-P)\right)_{+}=D_{+}^{Lip^{1,\uparrow}}(Q,P), \end{array} $$
where Lip^{1,↑} is the class of increasing, 1-Lipschitz functions (w.r.t. |·|).
The ordering induced by \(D_{+}^{inf}\) on \(\mathcal {M}^{1}(\mathbb {R})\) is the stochastic order ≼_{ st }.

One has the following explicit representation:
$$ D_{+}^{inf}(Q,P)=E\left(F^{-1}(U)-G^{-1}(U)\right)_{+}, $$(44)
where F,G are the distribution functions of Q,P, and U∼U_{[0,1]} is uniformly distributed on [0,1].
Proof

With the assumption on \(\mathfrak {X}\), Kantorovich–Rubinstein Theorem 1 specializes to
$$\begin{array}{@{}rcl@{}} D_{+}^{inf}(Q,P) &=&\sup_{f\in Lip^{1}\left(\mathbb{R}, d_{+}^{b}\right)} \left(\int fd(Q-P)\right)_{+}. \end{array} $$(45)
Note that \(f\in Lip^{1}\left (\mathbb {R},d_{+}^{b}\right)\) is equivalent to f(y)−f(x)≤(y−x)_{+}, i.e., f increasing and 1-Lipschitz w.r.t. the absolute value |·|.
The fact that the order induced by \(D_{+}^{inf}\) on \(\mathcal {M}^{1}(\mathbb {R})\) is the stochastic order ≼_{ st } follows from Proposition 4. Alternatively, a direct proof is as follows: let n≥1 be a positive integer, X∼Q, Y∼P. By Markov’s inequality,
$$\mu(X-Y\ge n^{-1})\le \mu\left((X-Y)_{+}\ge n^{-1}\right)\le nE[(X-Y)_{+}].$$
Taking the infimum over X∼Q,Y∼P yields that \(D_{+}^{inf}(Q,P)=0\) implies that X−Y<n^{−1} with probability one. Letting n→∞ yields X≤Y a.s. Hence,
$$D_{+}^{inf}(Q,P)=0 \text{ iff there exists }X\sim Q, Y\sim P \text{ s.t. } X\le Y \text{ a.s.}$$
and the latter is equivalent to Q≼_{ st }P, by Strassen's theorem.

f(x)=x_{+} is convex, hence f(x−y) is submodular (or quasi-antitone in the terminology of Cambanis et al. (1976), or supernegative or 2-negative in the terminology of Tchen (1980)). This implies (44) by results of Cambanis et al. (1976) in Theorem 2, or Tchen (1980) in Corollary 2.3 (see also Rüschendorf (2013)).
□
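For two samples of equal size, the quantile representation (44) reduces to pairing the order statistics. The sketch below compares this comonotone pairing with a brute-force minimum over all permutation couplings, which for equally weighted atoms attains the infimum over all couplings. The sample values and function names are hypothetical.

```python
from itertools import permutations


def minimal_mean_excess(xs, ys):
    """Quantile formula for two equal-size samples: pair the order statistics."""
    n = len(xs)
    if n != len(ys):
        raise ValueError("both samples must have the same size")
    return sum(max(x - y, 0) for x, y in zip(sorted(xs), sorted(ys))) / n


def brute_force_minimum(xs, ys):
    """Minimum of the average excess (x - y)_+ over all permutation couplings."""
    n = len(xs)
    return min(
        sum(max(x - ys[j], 0) for x, j in zip(xs, perm)) / n
        for perm in permutations(range(n))
    )


# Hypothetical samples from Q and P.
xs = [5, 1, 4, 2]
ys = [3, 0, 6, 2]

if __name__ == "__main__":
    print(minimal_mean_excess(xs, ys), brute_force_minimum(xs, ys))
```

The enumeration grows factorially and is meant only as a check; the sorted pairing costs no more than two sorts.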
Remark 8
(Comparison with the stop-loss metric) Note that for \(t\in \mathbb {R}\), the compound one-sided risk excess measure \(D^{c}_{+}(X,t)=E(X-t)_{+}=\Pi _{X}(t) \) is the average risk excess over the threshold t, which stands for the stop-loss premium of a reinsurer in insurance theory. Rachev and Rüschendorf (1990) consider the stop-loss metric, defined through the difference of two stop-loss premiums, which reads in our notation (see Eq. (2.2) in Rachev and Rüschendorf (1990)) as
$$ \sup_{t\in\mathbb{R}}\left|E(X-t)_{+}-E(Y-t)_{+}\right|. $$
One could obtain from it the corresponding hemimetric which was introduced in (30), in relation to the increasing convex order,
$$ D_{+}^{icx}(Q,P)=\sup_{t\in\mathbb{R}}\left(E(X-t)_{+}-E(Y-t)_{+}\right),\quad X\sim Q,\ Y\sim P, $$
which is distinct from the minimal risk excess \(D_{+}^{inf}\). This follows from the triangle inequality for (X−t)_{+}:
$$ (X-t)_{+}\le (X-Y)_{+}+(Y-t)_{+}, $$
and taking expectations, then the supremum over t and the infimum over all couplings, yields that
$$ D_{+}^{icx}(Q,P)\le D_{+}^{inf}(Q,P). $$
In other words, the hemimetric obtained by a one-sided comparison of risks through their stop-loss premiums is always majorized by the minimal risk excess. See also Remark 9 for similar considerations for the tail risk.
In risk theory, it is also of interest to compare the expected risks above their distributional α-quantiles: this is the basis for the conditional tail expectation
$$ E[X\mid X\ge q_{\alpha}(X)],\qquad E[Y\mid Y\ge q_{\alpha}(Y)], $$
where q_{ α }(X), q_{ α }(Y) denote the corresponding α-quantiles of X∼Q with c.d.f. F, Y∼P, with c.d.f. G. In order to obtain a coherent risk measure and to generalize to possibly noncontinuous distributions (see Burgert and Rüschendorf (2006)), it is useful to instead consider the expected shortfall. Define, for λ∈[0,1], the extended c.d.f.s of F, G as
$$ F(x,\lambda):=\mu(X<x)+\lambda\mu(X=x),\qquad G(y,\lambda):=\mu(Y<y)+\lambda\mu(Y=y). $$
Define also the distributional transforms of X and Y as
$$ U_{1}:=F(X,V),\qquad U_{2}:=G(Y,V), $$(46)
where V∼U_{(0,1)} is independent of (X,Y), see Rüschendorf (2009). The expected shortfalls are then defined as ES_{ α }(X):=E[X|U_{1}≥α], respectively as ES_{ α }(Y):=E[Y|U_{2}≥α].
For the one-sided comparison of the risk excess of X w.r.t. Y over their α-quantiles, we therefore consider the excess risk of their expected shortfall defined by the following one-sided compound risk excess measure \(D_{+}^{\alpha,c} (X,Y)\)
$$ D_{+}^{\alpha,c}(X,Y):=E\left(X1_{U_{1}\ge \alpha}-Y1_{U_{2}\ge \alpha}\right)_{+}, $$(47)
where U_{1}, U_{2} are as in (46). We obtain the following result:
Proposition 7
(Minimal tail risk excess)

The minimal extension of (47) to a risk excess measure on \(\mathcal {M}^{1}(\mathbb {R})\) by mass transportation has the representation
$$\begin{array}{@{}rcl@{}} D_{+}^{\alpha,inf}(Q,P)&:=&\inf_{X\sim Q, Y\sim P} D_{+}^{\alpha,c}(X,Y)\\ &=&E\left[\left(F^{-1}(U)-G^{-1}(U)\right)_{+}1_{U\ge \alpha}\right], \end{array} $$(48)
where U∼U_{[0,1]} is uniformly distributed on [0,1].

The ordering ≼_{ α } induced by \(D_{+}^{\alpha,inf}\) is given by
$$Q\preceq_{\alpha} P \Leftrightarrow F^{-1}(u)\le G^{-1}(u) \quad \forall u\ge \alpha,$$
which corresponds to the classical stochastic order restricted to the upper tail.
Proof

Denote by F_{ α } the law of \(X_{\alpha }:=X1_{U_{1}\ge \alpha }=X1_{F(X,V)\ge \alpha }\) and by G_{ α } the law of \(Y_{\alpha }:=Y1_{U_{2}\ge \alpha }=Y1_{G(Y,V)\ge \alpha }\). Then,
$$ D_{+}^{\alpha,inf}(Q,P)=\inf_{X_{\alpha}\sim F_{\alpha}, Y_{\alpha}\sim G_{\alpha}} E(X_{\alpha}-Y_{\alpha})_{+}. $$
Since \(X_{\alpha }=F^{-1}(U_{1})1_{U_{1}\ge \alpha }\) with U_{1}∼U_{[0,1]}, F_{ α } is the image of the Lebesgue measure on [0,1] induced by the transformation u↦F^{−1}(u)1_{u≥α}. Similarly, G_{ α } is the image of the Lebesgue measure on [0,1] induced by the transformation u↦G^{−1}(u)1_{u≥α}. Therefore, for U∼U_{(0,1)}, the comonotone pair of random variables \(\tilde X_{\alpha }=F^{-1}(U)1_{U\ge \alpha }\) and \(\tilde Y_{\alpha }=G^{-1}(U)1_{U\ge \alpha }\) is admissible for (F_{ α },G_{ α }).
By submodularity, as in Proposition 6,
$$ E(X_{\alpha}-Y_{\alpha})_{+}\ge E\left[\left(F^{-1}(U)-G^{-1}(U)\right)_{+}1_{U\ge \alpha}\right], $$
which implies the result.

Follows from (48).
□
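For empirical distributions, the representation (48) can be evaluated exactly, since the quantile functions are step functions. The following sketch assumes two samples of equal size n, so that both quantile functions are constant on the cells ((i−1)/n, i/n]; the function name and data are hypothetical, and the code only evaluates the right-hand side of (48) without re-deriving the optimality of the comonotone coupling.

```python
def minimal_tail_excess(xs, ys, alpha):
    """E[(F^{-1}(U) - G^{-1}(U))_+ 1_{U >= alpha}] for two equal-size samples.

    On the cell (i/n, (i+1)/n] both empirical quantile functions are constant
    and equal to the (i+1)-th order statistics, so the expectation is a
    weighted sum with weights given by the length of the cell above alpha.
    """
    n = len(xs)
    if n != len(ys):
        raise ValueError("both samples must have the same size")
    total = 0.0
    for i, (x, y) in enumerate(zip(sorted(xs), sorted(ys))):
        lo, hi = i / n, (i + 1) / n
        weight = max(hi - max(lo, alpha), 0.0)  # length of the cell above alpha
        total += weight * max(x - y, 0)
    return total


# Hypothetical samples from Q and P.
xs = [5, 1, 4, 2]
ys = [3, 0, 6, 2]

if __name__ == "__main__":
    for alpha in (0.0, 0.5, 0.6, 1.0):
        print(alpha, minimal_tail_excess(xs, ys, alpha))
```

With α=0 the value coincides with the minimal mean excess of Proposition 6, and it is nonincreasing in α, since less of the distribution is taken into account.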
Remark 9
It is interesting to note that the expected shortfall of X is given by
As expected, the minimal extension risk excess measure dominates the normalized onesided difference of expected shortfalls:
where Y∼P,X∼Q.
Weak risk excess measures
Motivation and definition
In view of the mass transportation approach of (32), one may inquire whether there exist other schemes of obtaining a risk excess measure D_{+}(Q,P), in the sense of Definition 3, from a compound risk excess measure \(D^{c}_{+}(X,Y)\), in the sense of Definition 6. In particular, it is natural to investigate the following “maximal extension” in the sense of mass transportation,
$$ D_{+}^{sup}(Q,P):=\sup\left\{D_{+}^{c}(X,Y);\ X\sim Q,\ Y\sim P\right\}. $$(49)
Obviously, \(D_{+}^{inf}(Q,P)\le D_{+}^{sup}(Q,P)\).
However, \(D_{+}^{sup}\) is not a risk excess measure: although (A1) and (A3) are obviously satisfied, (A2) is not. Indeed,
This implies that X≤Y a.s. for all possible realizations X∼Q,Y∼Q. But for X,Y independent with the same law Q, this would require that X≤Y a.s. which is only true for Q being a onepoint distribution. These considerations imply that \(D_{+}^{sup}\) can not be compatible with a reflexive order relation: axiom (A4) can not be satisfied either.
Nonetheless, \(D_{+}^{sup}\), as a supremum over all joint constructions of (X,Y)∼(Q,P), gives the best possible upper bound on the compound risk excess measure in the sense of mass transportation,
and therefore has a natural interpretation as a worstcase comparison, which is appealing for risk applications.
These considerations motivate the introduction of a weakened notion of risk excess measure, without axiom (A2) and with axiom (A4) restricted to a strict order ≺, i.e., a transitive and irreflexive relation. Therefore, we propose the following definitions:
Definition 8
(Weak risk excess measure) Let ≺ be a strict order on \(\mathcal {M}^{1}(E)\). A one-sided weak risk excess measure \(D_{+}^{w}\) on \(\left (\mathcal {M}^{1}(E),\prec \right)\) is a mapping \(D_{+}^{w}:\mathcal {M}^{1}(E)\times \mathcal {M}^{1}(E)\to \overline {\mathbb {R}}\) which satisfies axioms (A1), (A3), and (A4).
Definition 9
(Maximal extension) Let \(D^{c}_{+}\) be a compound excess risk measure. The maximal extension \(D^{sup}_{+}\) on \(\mathcal {M}^{1}(E)\) of \(D_{+}^{c}\) by mass transportation is given by (49).
Remark 10

The concept of onesided weak risk excess measure is an asymmetric analog of the concept of moment function in the theory of probability metrics, see Rachev (1991) in Chap. 3.3, or Rachev et al. (2013) in Chapters 3.4. and 8.2. In addition, the adjunction of axiom (A4) makes it compatible with a notion of order. Obviously, a onesided risk excess measure for a preorder ≼ is a onesided weak risk excess measure for the strict order ≺ defined by
$$P\prec Q \Leftrightarrow P\preceq Q \quad \text{and} P\neq Q.$$ 
The relation between the minimal \(D_{+}^{inf}\) and maximal \(D_{+}^{sup}\) extensions obtained from a compound risk excess measure \(D_{+}^{c}\), is given in the following improved triangle inequality:
$$ D_{+}^{sup}(Q,R)\le D_{+}^{inf}(Q,P)+D_{+}^{sup}(P,R), $$where P,Q,R are three probability measures on E, see Rachev et al. (2013) in Theorem 3.4.1.
Define on \(\mathcal {M}^{1}(E)\) the following strict order ≺_{ sup } by
where supp(.) denotes the support of a distribution. The analog of Proposition 4 for the maximal extension, which shows that \(D_{+}^{sup}\) is indeed a onesided weak risk excess measure, is given in the following proposition:
Proposition 8
\(D_{+}^{sup}\) obtained in (49) from a compound excess risk measure \(D^{c}_{+}(X,Y)={Ed}_{+}(X,Y)\) of the form (31) is a onesided weak risk excess measure on \((\mathcal {M}^{1}(E),\prec _{sup})\).
Proof
(A1) and (A3) are trivially satisfied. For (A4), if \(D_{+}^{sup}(Q,P)=0\), then for all X∼Q,Y∼P, Ed_{+}(X,Y)=0. Markov’s inequality entails that for all ε>0, d_{+}(X,Y)≤ε a.s. Hence, d_{+}(X,Y)=0 a.s., i.e., X≤Y a.s. for all X∼Q,Y∼P. This can only hold if the support of Q is completely to the left of the support of P. The converse direction is trivial: if Q≺_{ sup }P, then for all couplings X∼Q, Y∼P, X≤Y a.s., and thus \(\sup_{X\sim Q,Y\sim P}Ed_{+}(X,Y)=0\). □
Dual representation of maximal onesided weak risk excess measure
A dual representation of the maximal onesided weak risk excess measure \(D_{+}^{sup}\) associated with the compound risk excess measure \(D^{c}_{+}(X,Y)={Ed}_{+}(X,Y)\) of the form in (31) is given in the following theorem:
Theorem 2
(Dual Representation) Let E be a Polish space, supplied with the onesided hemimetric d_{+}, and let \(D_{+}^{c}(X,Y)={Ed}_{+}(X,Y)\) be the corresponding compound excess risk measure,

if d_{+} is upper or lower semicontinuous, then duality holds:
$$D_{+}^{sup}(Q,P)=\inf_{\Psi_{d_{+}}}\left\{ \int f dQ+\int gdP \right\},$$
where
$$\begin{array}{@{}rcl@{}} \Psi_{d_{+}}:=&\{&(f,g)\in Lip^{1}(d_{+})\times Lip^{1}(d_{-}), f(x)\ge 0, g(y)\ge 0,\\&& f(x)+g(y)\ge d_{+}(x,y), (x,y)\in E^{2} \}. \end{array} $$
if d_{+} is upper semicontinuous, then the supremum is attained for some probability measure.
Proof

Since a lower or upper semicontinuous function is a supremum or infimum of continuous functions, d_{+} is a Baire function. Hence, the duality Theorem 2.3.8 (a) in Rachev and Rüschendorf (1998) applies, since d_{+}≥0 is obviously majorized from below (i.e., belongs to \(\mathcal P_{m}(S)\) in the notation of Theorem 2.3.8 in Rachev and Rüschendorf (1998)). Therefore, Theorem 2.3.8 (a) entails
$$ \sup\left\{\int d_{+}(x,y)\mu(dx,dy)\right\}=\inf\left\{\int fdQ+\int gdP \right\}, $$(51)
where the infimum on the right side is taken in
$$\Psi_{1}:=\{f\in L_{1}(Q), g\in L_{1}(P), d_{+}(x,y)\le f(x)+g(y), (x,y)\in E^{2}\}.$$
Let γ_{1},γ_{2} be two real-valued constants s.t. γ_{1}+γ_{2}=0 and set, for (f,g)∈Ψ_{1}, \((\tilde f:=f-\gamma _{1},\tilde g:=g-\gamma _{2})\). Then, \((\tilde f, \tilde g)\in \Psi _{1}\) and \(J(f,g)=\int fdQ+\int gdP\) remains invariant when one replaces (f,g) by \((\tilde f, \tilde g)\), i.e., \(J(f,g)=J(\tilde f,\tilde g)\). Therefore, if f takes some negative values, then setting \(\gamma _{1}=\inf _{x}f(x)\) entails \(\tilde f\ge 0\) and the infimum in (51) can be restricted to
$$ \Psi_{2}:=\{f\in L_{1}(Q), g\in L_{1}(P), f(x)\ge 0, d_{+}(x,y)\le f(x)+g(y), (x,y)\in E^{2}\}. $$
By symmetry, the infimum in (51) can further be restricted to
$$ \Psi_{3}:=\{f\in L_{1}(Q), g\in L_{1}(P), f(x)\ge 0, g(y)\ge 0, d_{+}(x,y)\le f(x)+g(y), (x,y)\in E^{2}\}. $$
Assume d_{+} is upper bounded. For (f,g)∈Ψ_{3}, set \(f_{*}(y):=\sup _{x}(d_{+}(x,y)-f(x))\) and \(f_{**}(x):=\sup _{y}(d_{+}(x,y)-f_{*}(y))\). Then, (f_{∗∗},f_{∗})∈Ψ_{1}, g≥f_{∗}, f≥f_{∗∗}. Hence, J(f,g)≥J(f_{∗∗},f_{∗}). Moreover, by the triangle inequality,
$$\begin{array}{@{}rcl@{}} d_{+}(x,y)-f_{*}(y)&\le& d_{+}(x,x')+d_{+}(x',y)-f_{*}(y) \end{array} $$
and taking the supremum in y yields
$$\begin{array}{@{}rcl@{}} f_{**}(x)-f_{**}(x')&\le& d_{+}(x,x'). \end{array} $$
Hence, f_{∗∗}∈Lip^{1}(d_{+}), whereas a similar calculation shows that f_{∗}∈Lip^{1}(d_{−}). Therefore, the infimum in (51) can further be restricted to \(\Psi _{d_{+}}\), as claimed.
The general case, for d_{+} unbounded, proceeds by approximation, as in Theorem 1.

Follows from Theorem 2.3.10 in Rachev and Rüschendorf (1998).
□
Examples of maximal extensions
We discuss for some of the examples in Section 4 the corresponding worst-case risk excess \(D_{+}^{sup}\). First, we consider the discrete one-sided hemimetric \(d_{+}^{\le }\) of (12) on \(E=\mathbb {R}^{d}\), supplied with the product order ≤. The associated compound risk excess measure is given by (39):
$$ D_{+}^{c}(X,Y)=\mu(X\nleq Y), $$
for X∼Q,Y∼P, and its minimal extension (41) coincides with the induced risk excess measure \(D_{+}^{st}\) (see (10)) compatible with the stochastic order. The maximal extension is given in the following proposition:
Proposition 9
(Maximal Risk excess for stochastic ordering)

Let \(D_{+}^{\le,sup}\) be the onesided weak risk excess measure on \((\mathcal {M}^{1}(\mathbb {R}),\prec _{sup})\) obtained by maximal extension of the discrete compound risk measure \(D_{+}^{c}\) in (39). \(D_{+}^{\le,sup}\) has the representation:
$$ D_{+}^{\le,sup}(Q,P)=1-\sup_{x\in\mathbb{R}^{d}}(F(x)-G(x)), $$(52)
where F,G are the c.d.f.s of Q,P, respectively.

The restriction of \(D_{+}^{\le,\sup }\) on E, obtained by setting \(d^<_{+}(x,y):= D_{+}^{\le,\sup }(\delta _{x},\delta _{y})\), defines a weak onesided hemimetric compatible with the strict order <, i.e.,
$$ d^{<}_{+}(x,y)=1_{x\ge y}, $$
with \(d^{<}_{+}\) satisfying axioms (A1), (A3), and (A4) for the strict order < associated with ≤.
Proof

Note that by Strassen's theorem (see, e.g., Rachev and Rüschendorf (1998) in Theorems 3.5.1 and 3.5.5 or Rüschendorf (1991) in Theorems 4 and 5),
$$\begin{array}{@{}rcl@{}} D_{+}^{\le,sup}(Q,P)&=&\sup_{X\sim Q, Y\sim P}\mu(X\nleq Y)=1-\inf_{X\sim Q, Y\sim P} \mu(X\le Y)\\ &=&1-\sup(Q(B_{1})+P(B_{2})-1), \end{array} $$
where the supremum is over all pairs of subsets B_{1},B_{2}⊂E s.t. B_{1}×B_{2}⊂B:={(x,y);x≤y}. But for B_{1}×B_{2}⊂B, it follows that \(B_{1}^{\downarrow }\times B_{2}^{\uparrow }\subset B\), where \(B_{1}^{\downarrow }=\{x\in \mathbb {R}^{d}:\exists \bar {x}\in B_{1} \text { s.t. } x\le \bar {x}\}\) and \(B_{2}^{\uparrow }=\{y\in \mathbb {R}^{d}:\exists \bar {y}\in B_{2} \text { s.t. } y\ge \bar {y}\}\) are the decreasing, resp. increasing, completions of B_{1},B_{2}. Then, it is easy to see that one can enlarge \(B_{1}^{\downarrow }, B_{2}^{\uparrow }\) to intervals of the form (−∞,x], [x,∞). As a result the maximal extension is given by
$$\begin{array}{@{}rcl@{}} D_{+}^{\le,sup}(Q,P)&=&2-\sup_{x\in\mathbb{R}^{d}}\{F(x)+\overline G(x)\}\\ &=&1-\sup_{x\in\mathbb{R}^{d}}\{F(x)-G(x)\}, \end{array} $$
where \(\overline {G}(x)=P([x,\infty))\).

Formula (52) yields
$$D_{+}^{\le,sup}(\delta_{x},\delta_{y})=1-\sup_{z\in\mathbb{R}^{d}}\{1_{z\ge x}-1_{z\ge y}\}=1_{x\ge y}. $$
□
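As an illustration of formula (52), the following minimal sketch evaluates \(1-\sup_{x}(F(x)-G(x))\) in dimension d=1 when Q and P are replaced by empirical measures of two samples. The function names `ecdf` and `max_excess_stochastic_order` are hypothetical and not part of the paper; the sketch uses the c.d.f. form of (52) literally, so in the presence of atoms it inherits the identification of \(\overline{G}(x)\) with \(1-G(x)\) made in the proof.

```python
from bisect import bisect_right


def ecdf(sorted_sample, x):
    """Empirical c.d.f. at x for an already sorted sample."""
    return bisect_right(sorted_sample, x) / len(sorted_sample)


def max_excess_stochastic_order(q_sample, p_sample):
    """Evaluate 1 - sup_x (F(x) - G(x)) for empirical c.d.f.s F (of Q) and G (of P).

    Hypothetical helper illustrating formula (52) on the real line.
    """
    q = sorted(q_sample)
    p = sorted(p_sample)
    # F - G is a right-continuous step function that vanishes at -inf and +inf,
    # so its supremum is the maximum of 0 and its values at the pooled sample points.
    sup_diff = max(0.0, max(ecdf(q, x) - ecdf(p, x) for x in q + p))
    return 1.0 - sup_diff
```

For point masses, the function returns 0 when x<y and 1 when x≥y, in line with the restriction \(d^<_{+}(x,y)=1_{x\ge y}\) stated in the proposition.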
Remark 11
Comparing this result with those of Proposition 2 and Example 7, one sees that the discrete one-sided hemimetric \(d_{+}^{\le }(x,y)=1_{y\nleq x}\) and the corresponding compound risk excess measure have many extensions on \(\mathcal {M}^{1}(\mathbb {R}^{d})\) and, in particular, we obtain
The following diagram illustrates the different embeddings of structures, through their hemimetrics:
Next, we investigate the maximal one-sided weak risk excess extension for the basic hemimetric (13): on \(E=\mathbb {R}\), for X∼F,Y∼G, let \(D_{+}^{c}(X,Y)=E(X-Y)_{+}\) be the average risk excess as in (43). The maximal risk excess extension by mass transportation is given by the following proposition.
Proposition 10
(Risk excess from exceedance in average) Let \(D_{+}^{b,sup}(Q,P)\) be the maximal one-sided weak risk excess extension, obtained by mass transportation of the compound risk excess measure \(D_{+}^{c}(X,Y)=E(X-Y)_{+}\). One has the representation
where F,G are the c.d.f.s of Q,P, respectively.
Proof
The argument for the maximal risk excess extension is similar to that of the minimal risk excess extension. □
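The extremal couplings behind the minimal and maximal extensions of \(E(X-Y)_{+}\) can be illustrated on two samples of equal size, where every coupling of the empirical measures that pairs the points one to one is a permutation. The sketch below relies on the classical rearrangement fact for submodular functions (Cambanis et al. 1976; Tchen 1980): the countermonotone pairing maximizes and the comonotone pairing minimizes the average excess. It does not reproduce the representation stated in Proposition 10; all function names are hypothetical.

```python
from itertools import permutations


def average_excess(xs, ys):
    """Average of (x - y)_+ over paired observations."""
    return sum(max(x - y, 0.0) for x, y in zip(xs, ys)) / len(xs)


def max_average_excess(x_sample, y_sample):
    """Countermonotone pairing: smallest x with largest y (equal sample sizes assumed)."""
    if len(x_sample) != len(y_sample):
        raise ValueError("samples must have the same size")
    return average_excess(sorted(x_sample), sorted(y_sample, reverse=True))


def min_average_excess(x_sample, y_sample):
    """Comonotone pairing: smallest x with smallest y (equal sample sizes assumed)."""
    if len(x_sample) != len(y_sample):
        raise ValueError("samples must have the same size")
    return average_excess(sorted(x_sample), sorted(y_sample))


def brute_force_max(x_sample, y_sample):
    """Check by enumerating all one-to-one pairings; only feasible for tiny samples."""
    return max(average_excess(x_sample, perm) for perm in permutations(y_sample))
```

On small samples, `brute_force_max` can be used to confirm that no other one-to-one pairing exceeds the countermonotone one.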
In the previous propositions, the order induced by the maximal extension is very strong. For insurance applications, in particular for comparing tail risk, it is of interest to restrict the comparisons to the upper tails of the distributions, see Proposition 7 in Section 4. Finally, we give the result for the tail excess compound risk measure \(D^{c,\alpha }_{+}(X,Y)\) in (47), which induces a more interesting order:
Proposition 11
(Tail risk excess)

Let 0<α<1, then the maximal extension \(D_{+}^{\alpha,sup}\) is given by
$$ D_{+}^{\alpha,sup}(Q,P)=(1-\alpha)D^{sup}_{+}(Q^{\alpha},P^{\alpha}), $$(54)where Q^{α},P^{α} are the conditional distributions of Q,P on their upper α-quantile intervals [q_{ α }(Q),∞),[q_{ α }(P),∞).

Correspondingly, a suitable consistent ordering ≺_{ α } on \(\mathcal {M}^{1}(\mathbb {R})\) is given by
$$ Q\prec_{\alpha} P \Leftrightarrow G^{-1}(u)\le F^{-1}(1-u+\alpha), \text{ for all } \alpha\le u\le 1, $$where F,G are the c.d.f.s of Q,P. For the maximal extension, the random variables are chosen countermonotonic in the upper part of the distribution.
Proof
Similar to the proof of Proposition 10. □
Extensions with dependence constraints
Setup
In Sections 4 and 5, we considered risk excess measures D_{+}(Q,P) obtained as minimal and maximal extensions, by mass transportation, of a compound risk excess measure, i.e., extensions over the class of all dependence structures of (Q,P). In this section, we consider a relevant modification of this method by restricting the class of possible dependence structures. This setup makes it possible to take into account known side information on the dependence structure of (Q,P), like various bounds on positive or negative dependence, see e.g., Rüschendorf (2013), Chapter 5.
We consider the setup \(E=\mathbb {R}\) with hemimetric d_{+} and the compound excess risk measure \(D_{+}^{c}(X,Y)=E\,d_{+}(X,Y)\) of the kind (6), where \(X,Y\in \mathfrak {X}\) have marginals Q,P. If C=C_{X,Y} is a copula of (X,Y), we also write E_{ C }d_{+}(X,Y) to stress the dependence on C, and we denote by \(\mathcal {C}\) the set of all bivariate copula functions. Let \(\mathcal {D}\subset \mathcal {C}\) denote a subclass of copulas which describes the information on the dependence structure. Then, it is natural to consider the worst-case and best-case extensions of \(D_{+}^{c}\) over \(\mathcal {D}\).
Definition 10
(Minimal and maximal extension with dependence restriction) For a subclass \(\mathcal {D}\subset \mathcal {C}\):

The minimal extension with dependence restriction \(\mathcal {D}\) of \(D_{+}^{c}\) is defined as
$$ D_{+}^{\mathcal{D},inf}(Q,P):=\inf \{E_{C} d_{+}(X,Y), X\sim Q, Y\sim P, C\in\mathcal{D} \}. $$(55) 
Similarly, the maximal extension with dependence restriction \(\mathcal {D}\) is defined as
$$ D_{+}^{\mathcal{D},sup}(Q,P):=\sup \{E_{C} d_{+}(X,Y),X\sim Q, Y\sim P, C\in\mathcal{D} \}. $$(56)
In the case without dependence restriction, i.e., when \(\mathcal {D}=\mathcal {C}\), we get the minimal and maximal extensions \(D_{+}^{inf}\), \(D_{+}^{sup}\) of (32) and (49) considered in Sections 4 and 5.
Remark 12
By the previous discussion of Section 4 (see Proposition 4), it is clear that \(D_{+}^{\mathcal {D},inf}\) is a risk excess measure on \(\left (\mathcal {M}^{1}(E),\preceq _{st}\right)\) only if \(\mathcal {D}\) contains the upper Fréchet bound M, defined by M(u,v)= min(u,v),0≤u,v≤1. So typically the restricted extensions will not satisfy the properties (A2) and (A4) of a one-sided risk excess measure on \(\left (\mathcal {M}^{1}(E),\preceq _{st}\right)\).
Despite that, the extensions (55) and (56) have a natural motivation as best-case, resp., worst-case excess risk taking into account the dependence restrictions. On the level of random variables, the class of pairs (X,Y) with \(C_{XY}\in \mathcal {D}\) and X≤Y may be empty even if Q≼_{ st }P. Therefore, the unrestricted extensions \(D_{+}^{inf}\), resp., \(D_{+}^{sup}\), would underestimate, resp., overestimate the real risk excess. This is a strong indication for the relevance of the notion of minimal, resp., maximal risk excess with dependence restriction \(\mathcal {D}\).
Explicit results for extensions with positive and negative dependence restriction
We now consider two particular classes of dependence restrictions \(\mathcal {D}\) which allow determination of the minimal, resp., maximal, extensions in explicit form. Denote for copulas \(C_{0}, C_{1}\in \mathcal {C}\) by
$$ \mathcal{D}_{\le}(C_{0}):=\{C\in \mathcal{C}; C\le C_{0}\} $$(57)and by
$$ \mathcal{D}_{\ge}(C_{1}):=\{C\in \mathcal{C}; C\ge C_{1}\} $$(58)the class of all copulas which are smaller than C_{0}, resp., bigger than C_{1}, in the lower orthant ordering ≼_{ lo } (equivalently in the upper orthant ordering ≼_{ uo }). (57) describes a negative dependence restriction, (58) a positive dependence restriction: for the case C_{0}=C_{1}=Π, the independence copula Π(u,v)=uv, 0≤u,v≤1, these restrictions correspond to negatively quadrant dependent (NQD), resp., positively quadrant dependent (PQD), random variables, as defined by Lehmann (1966), see Nelsen (2006), p. 186.
Then, for d_{+}(x,y)=(x−y)_{+}, we obtain the following explicit result.
Proposition 12
(Minimal and maximal risk excess with positive/negative dependence restriction)

For \(\mathcal {D}=\mathcal {D}_{\le }(C_{0})\), we obtain the explicit formula for the minimal risk excess extension
$$ D_{+}^{\mathcal{D},inf}(Q,P)=E_{C_{0}}\left(X^{0}-Y^{0}\right)_{+}, $$(59)where X^{0}∼Q,Y^{0}∼P and \(C_{X^{0},Y^{0}}=C_{0}\).

For \(\mathcal {D}=\mathcal {D}_{\ge }(C_{1})\), we obtain the explicit formula for the maximal risk excess extension
$$ D_{+}^{\mathcal{D},sup}(Q,P)=E_{C_{1}}\left(X^{1}-Y^{1}\right)_{+}, $$(60)where X^{1}∼Q,Y^{1}∼P and \(C_{X^{1},Y^{1}}=C_{1}\).
Proof

For (X,Y) with X∼Q,Y∼P and C_{X,Y}=C≤C_{0}, it follows from the submodularity argument, as in the proof of Proposition 6, that
$$E(X-Y)_{+}\ge E(X^{0}-Y^{0})_{+},$$since f(x,y)=(x−y)_{+} is submodular and (X,Y)≤_{ sm }(X^{0},Y^{0}), with ≤_{ sm } the supermodular ordering. Taking the infimum yields the result.

The argument for the maximal extension over \(\mathcal {D}_{\ge }(C_{1})\) is similar, with the inequality reversed.
□
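For the special case C_{0}=C_{1}=Π mentioned above, formulas (59) and (60) both reduce to the expectation of (X−Y)_{+} under independence. The following minimal sketch, with hypothetical function names and with Q,P replaced by empirical measures, computes this value as the average of (x_i−y_j)_{+} over all pairs and compares it with the unrestricted comonotone and countermonotone values for samples of equal size.

```python
def excess_under_independence(x_sample, y_sample):
    """E_Pi (X - Y)_+ for empirical marginals: average over all pairs (x_i, y_j).

    By (59) this is the minimal extension under the NQD restriction D_<=(Pi),
    and by (60) the maximal extension under the PQD restriction D_>=(Pi).
    """
    total = sum(max(x - y, 0.0) for x in x_sample for y in y_sample)
    return total / (len(x_sample) * len(y_sample))


def paired_excess(xs, ys):
    """Average of (x - y)_+ over one-to-one paired observations."""
    return sum(max(x - y, 0.0) for x, y in zip(xs, ys)) / len(xs)


def unrestricted_bounds(x_sample, y_sample):
    """Comonotone (lower) and countermonotone (upper) values, equal sample sizes assumed."""
    if len(x_sample) != len(y_sample):
        raise ValueError("samples must have the same size")
    xs = sorted(x_sample)
    lower = paired_excess(xs, sorted(y_sample))
    upper = paired_excess(xs, sorted(y_sample, reverse=True))
    return lower, upper
```

In this illustration the value under independence lies between the two unrestricted bounds, which reflects how a dependence restriction narrows the range of possible risk excess.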
Remark 13

Taking for \(\mathcal {D}\) the two-sided dependence information
$$ \mathcal{D}=\mathcal{D}(C_{0},C_{1})=\{C\in \mathcal{C}; C_{1}\le C\le C_{0}\}, $$we obtain for \(D_{+}^{\mathcal {D},inf}\) the same formula as in (59) and for \(D_{+}^{\mathcal {D},sup}\) the same formula as in (60). Thus, this information simultaneously shrinks the upper and the lower bound for the risk excess.

The concept of minimal, resp., maximal risk excess can also be introduced for the general case (E,≤) and general compound risk excess measures \(D_{+}^{c}\). In this case, \(\mathcal {D}\) denotes a class of dependence structures of random elements X,Y∈E. Even if \(D_{+}^{inf}\) and \(D_{+}^{sup}\) do not satisfy on the level of distributions the risk excess measure axioms (A2) and (A4), they describe the relevant bounds for the risk excess with dependence information \(\mathcal {D}\).
Conclusion
We proposed a quantitative one-sided comparison of probabilistic risks via the concept of risk excess measures, obtained as order extensions of hemimetrics on the underlying space E. As in the case of risk measures, the choice of a suitable hemimetric and corresponding excess risk measure for a particular application will depend on the problem considered and the notion of order one wants to quantify. In reliability, insurance mathematics, finance, epidemiology, and other fields, different notions of orders and distances are relevant to the problem at hand. In this regard, the examples proposed, together with their explicit formulas, are helpful. Together with the extension/restriction properties of Section 3, and the dual representations of Sections 4 and 5, they can serve as a guide for the interpretation of the excess risk measure and its coherence w.r.t. order and distance on the ambient space E.
We have left aside the statistical aspects, but let us just mention that one can obtain empirical versions of the various risk excess measures D_{+}(P,Q) presented here by replacing P,Q in their definitions by the corresponding empirical measures P_{ n },Q_{ n }. For excess risk measures which have an explicit formula, statistical estimation is straightforward by plugging in the empirical measures P_{ n },Q_{ n } instead of P,Q. For the \(\mathcal {F}\)-induced risk excess measures of Section 3, and for risk excess measures obtained by minimal and maximal extensions (Sections 4 and 5) of a compound one, their dual representation as a supremum (or infimum) over a functional class makes it possible to study their estimation via Glivenko–Cantelli-type theorems indexed by function classes. This is an additional benefit of these dual formulations. For example, for the \(\mathcal {F}\)-induced risk excess measure of (26), since x_{+}≤|x|, one obviously has
i.e., the risk excess measure is majorized by the corresponding integral probability metric and the convergence of the latter follows from classical results on abstract empirical processes, see e.g., Sriperumbudur et al. (2012).
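The plug-in idea and the majorization by an integral probability metric can be sketched as follows. This is only an illustration under assumptions not restated in this section: the one-sided measure is taken to be of the form \(\sup_{f\in\mathcal{F}}(\int f\,dQ-\int f\,dP)_{+}\), the class \(\mathcal{F}\) is replaced by a finite, hypothetical family of indicator functions x↦1_{x>t}, and all function names are invented for the example.

```python
def sample_mean(f, sample):
    """Integral of f with respect to the empirical measure of the sample."""
    return sum(f(x) for x in sample) / len(sample)


def upper_set_indicators(thresholds):
    """Hypothetical finite function class: x -> 1 if x > t else 0, one per threshold t."""
    return [lambda x, t=t: 1.0 if x > t else 0.0 for t in thresholds]


def plug_in_excess(q_sample, p_sample, functions):
    """Assumed one-sided form: sup over f of (Q_n f - P_n f)_+ ."""
    return max(
        max(sample_mean(f, q_sample) - sample_mean(f, p_sample), 0.0)
        for f in functions
    )


def plug_in_ipm(q_sample, p_sample, functions):
    """Two-sided counterpart: sup over f of |Q_n f - P_n f| ."""
    return max(
        abs(sample_mean(f, q_sample) - sample_mean(f, p_sample))
        for f in functions
    )
```

Since x_{+}≤|x| holds term by term, `plug_in_excess` never exceeds `plug_in_ipm` for the same samples and function class.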
References
Artzner, P, Delbaen, F, Eber, JM, Heath, D: Coherent measures of risk. Math. Finance. 9(3), 203–228 (1999). https://doi.org/10.1111/14679965.00068.
Berkes, I, Philipp, W: Approximation theorems for independent and weakly dependent random vectors. Ann. Probab. 7(1), 29–54 (1979).
Burgert, C, Rüschendorf, L: Consistent risk measures for portfolio vectors. Insurance Math. Econom. 38(2), 289–297 (2006). https://doi.org/10.1016/j.insmatheco.2005.08.008.
Cambanis, S, Simons, G, Stout, W: Inequalities for E k(X,Y) when the marginals are fixed. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete. 36(4), 285–294 (1976). https://doi.org/10.1007/BF00532695.
Capéraà, P, Van Cutsem, B: Méthodes et Modèles en Statistique Non Paramétrique. Les Presses de l’Université Laval, Sainte-Foy, QC; Dunod, Paris (1988). Exposé fondamental. [Basic exposition], With a foreword by Capéraà, Van Cutsem and Alain Baille.
Delbaen, F: Coherent risk measures on general probability spaces. In: Advances in Finance and Stochastics, pp. 1–37. Springer, Berlin (2002).
Dudley, RM: Distances of probability measures and random variables. Ann. Math. Statist. 39, 1563–1572 (1968). https://doi.org/10.1007/9781441958211_4.
Dudley, RM: Probabilities and Metrics. Matematisk Institut, Aarhus Universitet, Aarhus (1976). Convergence of laws on metric spaces, with a view to statistical testing, Lecture Notes Series, No. 45.
Dudley, RM: Real Analysis and Probability. Cambridge Studies in Advanced Mathematics, Vol. 74. Cambridge University Press, Cambridge (2002). https://doi.org/10.1017/CBO9780511755347. Revised reprint of the 1989 original.
Faugeras, OP, Rüschendorf, L: Markov morphisms: a combined copula and mass transportation approach to multivariate quantiles. Math. Applicanda. 45, 3–45 (2017).
Föllmer, H, Schied, A: Stochastic Finance. De Gruyter Studies in Mathematics, Vol. 27. Walter de Gruyter & Co., Berlin (2002). https://doi.org/10.1515/9783110198065. An introduction in discrete time.
Goubault-Larrecq, J: Non-Hausdorff Topology and Domain Theory. New Mathematical Monographs, Vol. 22. Cambridge University Press, Cambridge (2013). https://doi.org/10.1017/CBO9781139524438. [On the cover: Selected topics in point-set topology].
Jouini, E, Meddeb, M, Touzi, N: Vector-valued coherent risk measures. Finance Stoch. 8(4), 531–552 (2004). https://doi.org/10.1007/s0078000401276.
Kellerer, HG: Duality theorems for marginal problems. Z. Wahrsch. Verw. Gebiete. 67(4), 399–432 (1984). https://doi.org/10.1007/BF00532047.
Koenker, R: Quantile Regression. Econometric Society Monographs, Vol. 38. Cambridge University Press, Cambridge (2005). https://doi.org/10.1017/CBO9780511754098.
Lehmann, EL: Some concepts of dependence. Ann. Math. Statist. 37, 1137–1153 (1966). https://doi.org/10.1214/aoms/1177699260.
Marshall, AW, Olkin, I, Arnold, BC: Inequalities: Theory of Majorization and Its Applications. 2nd edn. Springer Series in Statistics. Springer (2011). https://doi.org/10.1007/9780387682761.
Müller, A: Integral probability metrics and their generating classes of functions. Adv. in Appl. Probab. 29(2), 429–443 (1997).
Nachbin, L: Topology and Order. Translated from the Portuguese by Lulu Bechtolsheim. Van Nostrand Mathematical Studies, No. 4. D. Van Nostrand Co., Inc., Princeton, N.J.-Toronto, Ont.-London (1965).
Nelsen, RB: An Introduction to Copulas. 2nd edn. Springer Series in Statistics. Springer, New York (2006).
Rachev, ST, Rüschendorf, L: Approximation of sums by compound Poisson distributions with respect to stop-loss distances. Adv. in Appl. Probab. 22(2), 350–374 (1990).
Rachev, ST: Probability Metrics and the Stability of Stochastic Models. Wiley Series in Probability and Mathematical Statistics: Applied Probability and Statistics. John Wiley & Sons, Ltd., Chichester (1991).
Rachev, ST, Rüschendorf, L: Mass Transportation Problems. Vol. I. Probability and its Applications (New York), Vol. 1. Springer-Verlag, New York (1998). Theory.
Rachev, ST, Klebanov, LB, Stoyanov, SV, Fabozzi, FJ: The Methods of Distances in the Theory of Probability and Statistics. Springer (2013). https://doi.org/10.1007/9781461448693.
Rosenberger, J, Gasko, M: Understanding robust and exploratory data analysis. Wiley Classics Library. Wiley-Interscience, New York (2000). Chap. Comparing Location Estimators: Trimmed Means, Medians, and Trimean. Revised and updated reprint of the 1983 original.
Rüschendorf, L.: Monotonicity and unbiasedness of tests via a.s. constructions. Statistics. 17(2), 221–230 (1986). https://doi.org/10.1080/02331888608801931.
Rüschendorf, L: Fréchet bounds and their applications. In: Dall’Aglio, G, Kotz, S, Salinetti, G (eds.) Advances in Probability Distributions with Given Marginals: Beyond the Copulas, pp. 151–187. Springer, Dordrecht (1991). https://doi.org/10.1007/9789401134668.
Rüschendorf, L.: On the distributional transform, Sklar’s theorem, and the empirical copula process. J. Statist. Plann. Inference. 139(11), 3921–3927 (2009). https://doi.org/10.1016/j.jspi.2009.05.030.
Rüschendorf, L.: Mathematical Risk Analysis. Springer Series in Operations Research and Financial Engineering. Springer (2013). https://doi.org/10.1007/9783642335907. Dependence, risk bounds, optimal allocations and portfolios.
Sriperumbudur, BK, Fukumizu, K, Gretton, A, Schölkopf, B, Lanckriet, GRG: On the empirical estimation of integral probability metrics. Electron. J. Stat. 6, 1550–1599 (2012).
Strassen, V: The existence of probability measures with given marginals. Ann. Math. Statist. 36, 423–439 (1965). https://doi.org/10.1214/aoms/1177700153.
Tchen, AH: Inequalities for distributions with given marginals. Ann. Probab. 8(4), 814–827 (1980).
Villani, C: Topics in Optimal Transportation. Graduate Studies in Mathematics, Vol. 58. American Mathematical Society (2003). https://doi.org/10.1007/b12016.
Zolotarev, VM: Modern Theory of Summation of Random Variables. Modern Probability and Statistics. VSP, Utrecht (1997). https://doi.org/10.1515/9783110936537.
Ethics declarations
Competing interests
The authors declare that they have no competing interests.
Additional information
Author’s contributions
This paper is common work of both authors; both read and approved the final manuscript.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License(http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Faugeras, O.P., Rüschendorf, L. Risk excess measures induced by hemimetrics. Probab Uncertain Quant Risk 3, 6 (2018). https://doi.org/10.1186/s4154601800320
DOI: https://doi.org/10.1186/s4154601800320
Keywords
 Risk measure
 Mass transportation
 Hemimetric
 Stochastic order
AMS Subject Classification
 Primary 60B05; Secondary 62P05, 91B30