# Risk excess measures induced by hemi-metrics

## Abstract

The main aim of this paper is to introduce the notion of risk excess measure, to analyze its properties, and to describe some basic construction methods. To compare the risk excess of one distribution Q w.r.t. a given risk distribution P, we apply the concept of hemi-metrics on the space of probability measures. This view of risk comparison has a natural basis in the extension of orderings and hemi-metrics on the underlying space to the level of probability measures. Basic examples of this kind of extension are induced by mass transportation and by function-class-induced orderings. Our view of measuring risk excess complements the usual method of comparing the risks of Q and P by the values ρ(Q), ρ(P) of a risk measure ρ. We argue that the difference ρ(Q)−ρ(P) neglects relevant aspects of the risk excess which are adequately described by the new notion of risk excess measure. We derive various concrete classes of risk excess measures and discuss corresponding ordering and measure extension properties.

## Introduction

### Motivation

The evaluation and comparison of risks are basic tasks of risk analysis. For the evaluation of risks, the notion of risk measures (in particular of coherent and convex risk measures) has been introduced in an axiomatic way for real risks in Artzner et al. (1999), Delbaen (2002), Föllmer and Schied (2002) and has been extended to vector risks in Jouini et al. (2004), Burgert and Rüschendorf (2006), and many others. This notion leads to the comparison of two risks X,Y (resp., distributions Q,P) by ρ(X)−ρ(Y) (resp., ρ(Q)−ρ(P)). If the main interest is to compare a risk X to a benchmark risk Y w.r.t. a common risk measure ρ, then the one-sided distance

$$D_{+}(X,Y)=(\rho(X)-\rho(Y))_{+},$$
(1)

respectively,

$$D_{+}(Q,P)=(\rho(Q)-\rho(P))_{+},$$
(2)

is the induced comparison of risks (where $$x_{+}=\max(x,0)$$ denotes the positive part of x).

We argue that the comparisons in (1), (2) neglect some relevant part of measuring the risk excess. This deficit can be seen in the analogous simple case where, for the basic space $$E=\mathbb {R}^{d}$$, the risk of a vector $$\mathbf {x}=(x_{1},\ldots,x_{d})\in \mathbb {R}^{d}$$ is measured by the Euclidean norm ρ(x)=|x|. In this case,

$$D_{+}(\mathbf{x},\mathbf{y})=(\mathbf{|x|}-\mathbf{|y|})_{+}$$
(3)

gives a quantitative comparison of the new risk x w.r.t. a benchmark risk y, which is not informative enough. If |x|=|y|, then the comparison in (3) does not take into account whether some or many components of x are substantially larger than those of y. A better measure for the risk excess would be

$$D_{+}(\mathbf{x},\mathbf{y})=\sum_{i=1}^{d} (x_{i}-y_{i})_{+}.$$
(4)
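To make the contrast between (3) and (4) concrete, the following minimal Python sketch evaluates both comparisons on a pair of vectors with equal Euclidean norm. The function names `norm_excess` and `componentwise_excess` are illustrative choices, not notation from the paper.

```python
import math

def norm_excess(x, y):
    """Comparison (3): positive part of the difference of Euclidean norms."""
    return max(math.hypot(*x) - math.hypot(*y), 0.0)

def componentwise_excess(x, y):
    """Comparison (4): sum of the positive parts of the componentwise differences."""
    return sum(max(xi - yi, 0.0) for xi, yi in zip(x, y))

# Two vectors with the same norm but very different components.
x = (3.0, 0.0)
y = (0.0, 3.0)
excess_3 = norm_excess(x, y)           # no excess is detected by (3)
excess_4 = componentwise_excess(x, y)  # the first component exceeds by 3
```

For this pair, (3) reports no excess although the first component of x exceeds that of y by 3, which is exactly what (4) records.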

Another motivation comes from the fact that some concepts which have an impact on the notion of risk are better defined in a relative manner than in absolute terms: for example, the concept of “heavy tailedness” of a distribution (and the subsequent idea of “tail risk”) is easier to define by comparing the “size of the tail” or “speed of decrease of the density” of the distribution F to the corresponding “size of the tail” or “speed of decrease of the density” of a benchmark distribution G (say, the standard Gaussian one). These comparisons can be operationalized in a quantitative measure of tail risk, e.g., by computing the difference of mass of the distribution F over an α-quantile w.r.t. the corresponding mass for the benchmark distribution G over the same α-quantile, viz.,

$$T_{\alpha}(F,G):=\int_{\alpha}^{1}\left(F^{-1}(u)-G^{-1}(u)\right)_{+} du$$

or, for operationalizing the comparisons of “speed of decrease of the density” by something like,

$$\tau_{\alpha}(F,G):=\frac{F^{-1}(\alpha)-F^{-1}(0.5)}{F^{-1}(0.75)-F^{-1}(0.5)}\times \left(\frac{G^{-1}(\alpha)-G^{-1}(0.5)}{G^{-1}(0.75)-G^{-1}(0.5)}\right)^{-1}$$

see, e.g., Capéraà and Van Cutsem (1988), p. 45, and Rosenberger and Gasko (2000). See also the motivation in Section 4.
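As a rough numerical illustration of the tail comparison $$T_{\alpha}(F,G)$$ defined above, the following sketch approximates the integral by a midpoint rule. The helper name `tail_excess`, the choice of two centered Gaussian laws, and the number of grid points are assumptions made for illustration only.

```python
from statistics import NormalDist

def tail_excess(F_inv, G_inv, alpha, n=10_000):
    """Midpoint-rule approximation of the integral over (alpha, 1) of
    the positive part of F^{-1}(u) - G^{-1}(u)."""
    h = (1.0 - alpha) / n
    total = 0.0
    for k in range(n):
        u = alpha + (k + 0.5) * h  # midpoints stay strictly below 1
        total += max(F_inv(u) - G_inv(u), 0.0)
    return total * h

# Hypothetical example: a wider Gaussian F against the standard Gaussian G.
wide_inv = NormalDist(0.0, 2.0).inv_cdf
std_inv = NormalDist(0.0, 1.0).inv_cdf

t_wide_vs_std = tail_excess(wide_inv, std_inv, 0.95)
t_std_vs_wide = tail_excess(std_inv, wide_inv, 0.95)
```

Under these assumptions the comparison is one-sided: the wider law shows a positive tail excess over the standard Gaussian, while the reverse comparison gives zero.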

### Outline

In this paper, we propose to measure the risk excess of a risk distribution Q over a given risk distribution P by a hemi-metric on the space of probability measures. Hemi-metrics are a suitable tool for one-sided comparison of risks. When measuring the risk excess of Q compared to P, it is natural to associate a one-sided distance

$$D_{+}(Q,P)$$

on the space $$(\mathcal {M}^{1}(E),\preceq)$$ of probability measures, where ≼ is a given stochastic (pre)order (see the forthcoming Definition 3 in Section 2). The stochastic order ≼ is related to the ordering ≤ on the underlying space E. This allows one to regard a quantitative one-sided comparison of risks at the level of probability measures as an extension of the order and distance structure on E.

We discuss several classes of risk excess measures D+(Q,P) and consider the question when these are given as order extensions of hemi-distances d+ on the underlying space E. Several relevant hemi-distances are induced by mass transportation and thus give access to natural interpretation. One particular extension is given by a version of the Kantorovich–Rubinstein theorem for hemi-distances. The paper develops basic tools and notions for measuring the one-sided risk excess of a risk distribution Q compared to P.

The paper is organized as follows: in Section 2, we introduce the notion of hemi-metrics, which are basic for obtaining a quantitative description of one-sided distance in a preordered space (E,≤). The risk excess measure D+(Q,P) of Q w.r.t. P is then introduced as a one-sided hemi-metric on the space of probability measures $$\mathcal {M}^{1}(E)$$. The ordering ≼ on $$\mathcal {M}^{1}(E)$$ is chosen consistent with the preorder ≤ on E and describes a positive risk excess, i.e., Q≼P if Q has no positive risk excess w.r.t. P. We discuss several examples to describe the meaning of this notion and the interplay of order and distance.

In Section 3, we study several classes of interesting risk excess comparison measures and corresponding extension properties of the preorderings on the underlying space. A general class of risk comparison measures is introduced by considering worst-case comparison over suitable classes of increasing functions. This is analogous to the worst-case representation of convex and coherent risk measures. There are several classes of examples.

In Section 4, we describe risk excess measures D+(X,Y) on the space of random variables. The class of compound risk excess measures is obtained for those measures which depend only on the joint law of the random elements (X,Y). Mass transportation gives a natural way to obtain minimal extensions of compound risk excess measures to risk excess measures in the space of distributions, i.e., which depend only on the marginal laws of X and Y. Dual representations of these risk excess measures are obtained by a version of the Kantorovich–Rubinstein theorem for hemi-metrics. Several examples illustrate these constructions.

In Section 5, we introduce the concept of weak risk excess measure, which is a risk excess measure without the weak identity property. Similarly to Section 4, a mass transportation formulation gives a way to obtain weak risk excess measures as the maximal extension of compound risk excess measures. We also give a dual representation of this risk excess measure and introduce several examples of weak excess risk measures constructed from mass transportation problems.

Finally, in Section 6, we consider dependence restrictions on the class of risk pairs (X,Y) and consider maximal and minimal excess risks with these restrictions. These maximal and minimal excess risks do not define risk excess measures, but give relevant and well-motivated bounds. For one and two-sided restrictions, we obtain explicit formulas for the bounds.

## Hemi-metrics and measuring risk excess

### Hemi-metrics

As a motivation for the introduction of measuring the risk excess of distributions, one could argue that, from the structural and phenomenological point of view, the concept of risk combines aspects of the metric structure (a risk measure evaluates some “size” or “norm” on the space of distributions) and of the order structure (there is an underlying preorder structure on the space of distributions which allows one to say when one risk is larger than another). Such a “quantitative measure of the order” is encapsulated in the notion of hemi-metric, see Goubault-Larrecq (2013), Chap. 6, p. 203. (The terminology is not completely standard: the notion of hemi-metric is also known as pseudo quasi-metric in the topology literature, while Nachbin (1965), p. 61, calls it a semi-metric.) We use the following definition:

### Definition 1

(Hemi-metric) A hemi-metric or hemi-distance d+ on a set E is a map $$d_{+}:E\times E\to \overline {\mathbb {R}}$$ which satisfies the following axioms: for all x,y,z∈E,

• positivity: d+(x,y)≥0;

• weak identity: x=y ⇒ d+(x,y)=0;

• triangle inequality: d+(x,z)≤d+(x,y)+d+(y,z).

The main difference from the notion of metric is the omission of the symmetry condition and the assumption of only the weak identity property. To establish a connection with a preorder ≤ on E, we introduce the notion of a one-sided hemi-metric.

### Definition 2

(One-sided hemi-metric) Let d+ be a hemi-metric on a preordered set (E,≤). Then, d+ is called a one-sided hemi-metric on (E,≤) if

• x≤y ⇔ d+(x,y)=0.

For two comparable elements, the one-sided hemi-metric of a smaller element x to a larger element y is zero.
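The axioms of Definitions 1 and 2 can be checked by brute force on a finite set of points. The sketch below does this for the map (x,y) ↦ max(x−y,0) on a few integers. It reads the one-sidedness condition as the equivalence x≤y ⇔ d+(x,y)=0, as in (5); this reading, as well as the helper names, is an assumption of the sketch.

```python
from itertools import product

def basic_hemi_metric(x, y):
    """Candidate hemi-metric on the real line: positive part of x - y."""
    return max(x - y, 0)

def is_hemi_metric(d, points):
    """Check positivity, weak identity and the triangle inequality on a finite set."""
    for x, y, z in product(points, repeat=3):
        if d(x, y) < 0:
            return False
        if x == y and d(x, y) != 0:
            return False
        if d(x, z) > d(x, y) + d(y, z):
            return False
    return True

def is_one_sided(d, points, leq):
    """Check that d(x, y) == 0 holds exactly when x precedes y (assumed equivalence)."""
    return all((d(x, y) == 0) == leq(x, y) for x, y in product(points, repeat=2))

points = range(-3, 4)
usual_order = lambda x, y: x <= y
absolute_distance = lambda x, y: abs(x - y)
```

On this grid, `basic_hemi_metric` passes both checks, whereas the symmetric distance |x−y| satisfies the hemi-metric axioms but is not one-sided for the usual order.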

### Remark 1

• If E is a set and d+ a hemi-metric on E, one can endow E with a preorder structure by setting

$$x\le y \Leftrightarrow d_{+}(x,y)=0.$$
(5)

Then, by construction of ≤, we obtain that d+ is a one-sided hemi-metric on E.

• Hemi-norms and hemi-metrics:

When E has a vector space structure, a metric d can be induced in a natural way by a norm ρ, as d(x,y):=ρ(x−y). Similarly, a hemi-norm ρ+ on E (i.e., a subadditive, positively homogeneous, non-negative functional $$\rho _{+}:E\to \overline {\mathbb {R}}$$ satisfying the weak separation condition $$x=0_{E}\Rightarrow \rho_{+}(x)=0$$) defines a hemi-metric d+ by setting

$$d_{+}(x,y):=\rho_{+}(x-y).$$
(6)

In addition, if E has a preorder ≤ and ρ+ is a hemi-norm which has the property that

$$x\le 0_{E} \Leftrightarrow \rho_{+}(x)=0,$$
(7)

then d+ in (6) defines a one-sided hemi-metric.

More generally, if (E,≤,ρ) is a lattice-ordered normed vector space, one can construct a one-sided hemi-metric compatible with ≤ by setting

$$d_{+}(x,y):=\rho((x-y)\vee 0_{E}),$$

where ∨ is the least upper bound operation.

• To any hemi-metric d+ on E, one can associate its dual hemi-metric d−, obtained by interchanging the arguments of d+,

$$d_{-}(x,y):=d_{+}(y,x).$$
(8)

When d+ is a one-sided hemi-metric associated with the order ≤ on E, d− is a one-sided hemi-metric associated with the corresponding dual order ≥ on E.

A hemi-metric d+ induces a distance d by symmetrization

$$d^{\infty}(x,y):=\max(d_{+}(x,y),d_{-}(x,y)),$$

or by taking the positive linear combination, say

$$d^{1}(x,y):=\alpha d_{+}(x,y)+\beta d_{-}(x,y), \quad \alpha,\beta >0.$$

More generally, a hemi-metric allows defining a “one-sided” topology by setting the open balls as

$$B^{+}(x,r):=\{y\in E:\ d_{+}(x,y)<r\}.$$
(9)
• The concept of a hemi-metric is implicit in several notions encountered in analysis, probability, and statistics. For example, recall that a real-valued function f on a metric space (E,d) is upper semi-continuous at $$x_{0}$$ iff

$$\forall \epsilon>0, \exists \delta>0, d(x,x_{0})\le\delta\Rightarrow d_{+}^{b}(f(x),f(x_{0}))\le \epsilon,$$

where $$d_{+}^{b}(x,y):=\rho _{+}(x-y)=\max (x-y,0)$$ is the usual basic one-sided hemi-metric on $$(\mathbb {R},\le,|.|)$$ (see Example 3 and (13) below).
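The dual hemi-metric (8) and the two symmetrizations of Remark 1 can be sketched as higher-order functions. The names `dual`, `sym_max` and `sym_sum` are illustrative; the weights default to α=β=1.

```python
def dual(d_plus):
    """Dual hemi-metric (8): the arguments are interchanged."""
    return lambda x, y: d_plus(y, x)

def sym_max(d_plus):
    """Symmetrization by the maximum of d_plus and its dual."""
    d_minus = dual(d_plus)
    return lambda x, y: max(d_plus(x, y), d_minus(x, y))

def sym_sum(d_plus, alpha=1.0, beta=1.0):
    """Symmetrization by a positive linear combination of d_plus and its dual."""
    d_minus = dual(d_plus)
    return lambda x, y: alpha * d_plus(x, y) + beta * d_minus(x, y)

# Basic one-sided hemi-metric on the real line.
basic = lambda x, y: max(x - y, 0.0)
```

For the basic one-sided hemi-metric on the real line, both symmetrizations (with unit weights) give back the usual distance |x−y|.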

### Risk excess measures

After the discussion of hemi-metrics, we are now in a position to introduce the main object of this paper, which is a measure of the risk excess of a distribution Q w.r.t. P. To that aim, we assume that a preorder ≼ is defined on the set $$\mathcal {M}^{1}(E)$$ of probability measures on a measurable space $$(E,\mathcal {E})$$: P≼Q describes that Q has more risk than P.

### Definition 3

(Risk excess measure) A risk excess measure D+ is defined as a one-sided hemi-metric on the preordered space $$\left (\mathcal {M}^{1}(E),{\preceq }\right)$$ (or on a subset $$\mathcal {M}\subset \mathcal {M}^{1}(E)$$). D+(Q,P) is called the risk excess of Q w.r.t. P.

We illustrate this concept with the following examples. A general class of risk excess measures will be presented in a systematic way in Section 3.

### Example 1

(Stochastic ordering) On $$E=\mathbb {R}^{d}$$, we consider the componentwise order ≤, which is closely connected with the stochastic order $$\preceq_{st}$$: for a measurable set B⊂E, define $$B^{\uparrow}=\{y\in E;\ \exists x\in B \text{ s.t. } y\ge x\}$$ and say that B is an increasing set if $$B=B^{\uparrow}$$. Denote by $$\mathcal {I}(E)$$ the set of measurable increasing sets of E.

The stochastic order $$\preceq_{st}$$ is defined on $$\mathcal {M}^{1}(\mathbb {R}^{d})$$ by

$$Q\preceq_{st} P \Leftrightarrow Q(B)\le P(B),$$

for all measurable sets $$B\in \mathcal {I}(E)$$. A corresponding risk excess measure is given by

$$D_{+}^{st}(Q,P):=\sup\{ (Q(B)-P(B))_{+}; B\in\mathcal{I}(E)\}.$$
(10)

There exists no risk excess of Q w.r.t. P, i.e.,

$$\begin{array}{@{}rcl@{}} D_{+}^{st}(Q,P)=0&\Leftrightarrow&Q(B)\le P(B), \quad \forall B\in \mathcal{I}(E),\\ &\Leftrightarrow& Q\preceq_{st} P. \end{array}$$

By the well-known Strassen theorem (see Strassen (1965) and, e.g., Rüschendorf (2013), Theorem 1.18, p. 22), this is equivalent to the existence of random vectors X∼Q, Y∼P s.t. X≤Y a.s.

In other words, the distribution Q is considered safer than P if one can construct representations X of Q and Y of P s.t. all coordinates of X are lower than those of Y. Q has a positive risk excess w.r.t. P if some of the components of any representation X of Q exceed the corresponding components of any representation Y of P. Of course, this gives a very strict notion of no risk excess.
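For finitely supported measures, the supremum in (10) can be evaluated by brute force: Q(B)−P(B) depends only on the trace of the increasing set B on the joint support, and these traces are exactly the up-closed subsets of the support. The sketch below enumerates them for small supports in $$\mathbb {R}^{d}$$ with the componentwise order. The representation of measures as dictionaries and the function names are assumptions, and the enumeration is exponential in the support size.

```python
from itertools import combinations

def leq(x, y):
    """Componentwise order on tuples."""
    return all(a <= b for a, b in zip(x, y))

def stochastic_excess(Q, P):
    """Brute-force value of (10) for finitely supported Q, P given as {point: mass}."""
    support = sorted(set(Q) | set(P))
    best = 0.0  # the empty increasing set contributes 0
    for r in range(1, len(support) + 1):
        for subset in combinations(support, r):
            U = set(subset)
            # keep only subsets that are up-closed within the support
            if all(y in U for x in U for y in support if leq(x, y)):
                diff = sum(Q.get(s, 0.0) for s in U) - sum(P.get(s, 0.0) for s in U)
                best = max(best, diff)
    return best

dirac_low = {(0, 0): 1.0}
dirac_high = {(1, 1): 1.0}
dirac_a = {(1, 0): 1.0}
dirac_b = {(0, 1): 1.0}
```

Comparable Dirac measures give excess 0 in one direction and 1 in the other, while incomparable points such as (1,0) and (0,1) give excess 1 in both directions, which reflects how strict this notion of no risk excess is.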

### Example 2

(Lévy–Prokhorov) Let E be a space with a hemi-metric d+. Define a “one-sided” topology on E by setting the open balls as in (9). Let $$\mathcal {E}$$ be the corresponding Borel σ-algebra. For two probability measures $$P,Q\in \mathcal {M}^{1}(E,\mathcal {E})$$, define

$$D_{+}^{LP}(Q,P)=\inf\{\epsilon>0: Q(A)\le P(A^{\epsilon})+\epsilon, \ A \text{ open}\},$$
(11)

where $$A^{\epsilon}:=\{x\in E:\ \exists a\in A,\ d_{+}(a,x)<\epsilon\}=\bigcup_{x\in A}B^{+}(x,\epsilon)$$. Then, $$D_{+}^{LP}$$ is a one-sided risk excess measure and $$D_{+}^{LP}(Q,P)=0$$ iff Q(A)≤P(A) for all $$A\in \mathcal {E}$$.

One can replace $$A^{\epsilon}$$ by $$A^{\epsilon]}:=\{x\in E:\ \exists a\in A,\ d_{+}(a,x)\le \epsilon\}$$, and the open sets by the closed sets in the definition (11), see Dudley (1968), Dudley (1976), Sect. 8, and Dudley (2002), Chap. 11.3. For the one-sidedness, if Q(A)≤P(A) for all $$A\in \mathcal {E}$$, then, for every ε>0, $$Q(A)\le P(A)\le P(A^{\epsilon})+\epsilon$$, since $$A\subset A^{\epsilon}$$. Hence, $$D_{+}^{LP}(Q,P)\le \epsilon$$. Letting ε↓0 yields $$D_{+}^{LP}(Q,P)=0$$. Conversely, if $$D_{+}^{LP}(Q,P)=0$$, there exists a sequence $$\epsilon_{n}\downarrow 0$$ s.t. for all closed sets A, $$Q(A)\le P(A^{\epsilon _{n}})+\epsilon _{n}$$. Since $$A^{\epsilon _{n}}\downarrow \overline {A}=A$$, this yields Q(A)≤P(A) for all closed sets A. Hence, Q(A)≤P(A) also for all $$A\in \mathcal {E}$$.

### Examples of hemi-metrics

Hemi-metrics are suitable tools to measure one-sided distances. We illustrate the meaning of this notion and the interplay of order and distance via the following example, which will be used constantly throughout the paper.

### Example 3

(Standard examples on (E,≤))

• Discrete one-sided hemi-metric:

Let (E,≤) be a preordered space, then

$$d_{+}^{\le}(x, y)=\left\{\begin{array}{ll} 0&\text{if } x\le y\\ 1&\text{else} \end{array}\right.$$
(12)

defines a one-sided hemi-metric on (E,≤), which we call the discrete one-sided hemi-metric on (E,≤).

• $$l^{p}$$ hemi-metric:

On $$E=\mathbb {R}^{1}$$, one can decompose the absolute value into its positive and negative parts, $$|x|=x_{+}+x_{-}=\rho_{+}(x)+\rho_{+}(-x)$$, viz., into two hemi-norms satisfying (7). As a consequence of (6), the metric

$$|x-y|=d_{+}(x,y)+d_{+}(y,x)=d_{+}(x,y)+d_{-}(x,y)$$

is decomposed as a sum of two one-sided hemi-metrics (d+,d−) associated with the dual orders (≤,≥). The basic one-sided hemi-metric

$$d_{+}^{b}(x,y):=(x-y)_{+}$$
(13)

describes in a quantitative way the ordering relationship ≤. Compared to the discrete hemi-metric (12), it also contains information on the magnitude of the one-sided departure of two elements.

Similarly on $$(E,\le)=(\mathbb {R}^{d},\le)$$ supplied with the componentwise (product) order

$$\mathbf{x\le y}\Leftrightarrow x_{i}\le y_{i}, 1\le i\le d,$$

the $$l^{p}$$ hemi-norms, defined as

$$\begin{array}{@{}rcl@{}} l^{p}_{+}(\mathbf{x})&:=&\left(\sum_{i=1}^{d} \left(x_{i}^{+}\right)^{p}\right)^{1/p}, \quad 1\le p<\infty, \\ l_{+}^{\infty}(\mathbf{x})&:=&\max \left\{x_{i}^{+}\right\} \end{array}$$
(14)

induce the one-sided $$l^{p}$$ hemi-metrics

$$d_{+}^{p}(\mathbf{x, y}):=l^{p}_{+}(\mathbf{x-y}), \quad 1\le p\le \infty.$$
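The standard hemi-metrics of Example 3 are straightforward to implement. The following sketch covers the discrete hemi-metric (12) for the componentwise order and the hemi-norms (14) with their induced hemi-metrics; the function names are illustrative, and `math.inf` is used to encode p=∞.

```python
import math

def discrete_hemi_metric(x, y):
    """Discrete one-sided hemi-metric (12) for the componentwise order."""
    return 0 if all(xi <= yi for xi, yi in zip(x, y)) else 1

def lp_hemi_norm(x, p):
    """Hemi-norm (14): l^p norm of the vector of positive parts."""
    positive_parts = [max(xi, 0.0) for xi in x]
    if math.isinf(p):
        return max(positive_parts)
    return sum(v ** p for v in positive_parts) ** (1.0 / p)

def lp_hemi_metric(x, y, p):
    """One-sided l^p hemi-metric: hemi-norm of the difference x - y."""
    return lp_hemi_norm([xi - yi for xi, yi in zip(x, y)], p)
```

Unlike the discrete hemi-metric, which only records whether x≤y fails, the $$l^{p}$$ versions also quantify by how much the components of x exceed those of y.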

Several of the hemi-metrics have a direct interpretation and extensions as risk measures for probability distributions. We give two examples:

### Example 4

• τ−quantiles:

On the real line $$E=\mathbb {R}^{1}$$, the hemi-norm

$$\rho_{\tau}(x):=\tau x^{+}+(1-\tau)x^{-}=\tau x^{+}+(1-\tau)(-x)^{+},\quad 0<\tau <1$$
(15)

induces, by Remark 1 and (6), a hemi-metric

$$d_{\tau}(x,y):=\rho_{\tau}(x-y).$$
(16)

It is well known that this hemi-metric can be used to define τ-quantiles $$q_{\tau}(Y)$$ (viz., the Value at Risk) of a random variable Y as a minimizer of $$E[\rho_{\tau}(Y-y)]$$, i.e.,

$$\begin{array}{@{}rcl@{}} q_{\tau}(Y)&:=&F^{-1}_{Y}(\tau)=\arg\inf_{y} E\left[\rho_{\tau}(Y-y)\right] \end{array}$$
(17)
$$\begin{array}{@{}rcl@{}} &=&\arg\inf_{y} E[d_{\tau}(Y,y)]={VaR}_{\tau}(Y), \end{array}$$
(18)

see Koenker (2005), p. 5. Note, however, that the order induced by $$d_{\tau}$$ reduces to the trivial order =, as $$d_{\tau}(x,y)=0$$ iff x=y.

• Half-space depth, departure in direction u:

A multivariate generalization of the preceding example can be defined as follows. On $$E=\mathbb {R}^{d}$$, we define for any unit vector u an ordering (the length in the direction u), by

$$\mathbf{x}\le_{\mathbf{u}} \mathbf{y} \Leftrightarrow \mathbf{u}^{T} (\mathbf{y}-\mathbf{x})\ge 0,$$
(19)

where $$\mathbf{x}^{T}$$ denotes the transpose of x. With this ordering,

$$d_{+}^{\mathbf{u}}(\mathbf{x},\mathbf{y})=\left\{\begin{array}{ll} 1&\text{if} \quad\mathbf{u}^{T} (\mathbf{y}-\mathbf{x})> 0\\ 0&\text{else} \end{array}\right.$$
(20)

defines, as in (12), a one-sided hemi-metric. It is one if the length of y in direction u is greater than that of x, and is zero else.

This one-sided hemi-metric has, as basic application, the definition of the half-space depth function, which describes the degree of outlyingness of a point $$\mathbf {x}\in \mathbb {R}^{d}$$ w.r.t. a probability measure P on $$\mathbb {R}^{d}$$. It is defined as

$$\begin{array}{@{}rcl@{}} D_{+}(x,P)&:=&\inf_{\mathbf{u}\in S_{d-1}}\int d_{+}^{\mathbf{u}}(x,y)dP(y)\\ &=&\inf_{\mathbf{u}\in S_{d-1}}\int 1_{\{\mathbf{u}^{T}(\mathbf{y}-\mathbf{x})> 0\}}dP(y), \end{array}$$
(21)

where $$S_{d-1}$$ is the unit sphere of $$\mathbb {R}^{d}$$. Several modifications of this definition are useful to describe a one-sided degree of outlyingness (or risk) or quantitative versions of it. Two relevant examples are

$$D_{+}^{1}(x,P):=\inf_{\mathbf{u}\in S^{+}_{d-1}}\int1_{\left\{\mathbf{u}^{T}(\mathbf{y}-\mathbf{x})> 0\right\}}dP(y),$$
(22)

or

$$D_{+}^{2}(x,P):=\inf_{\mathbf{u}\in S^{+}_{d-1}}\int\left(\mathbf{u}^{T}(\mathbf{y}-\mathbf{x})\right)^{+}dP(y),$$

where $$S_{d-1}^{+}=S_{d-1}\cap \mathbb {R}^{d,+}$$ is the part of the unit sphere in the positive cone $$\mathbf{x}\ge 0$$. We mention that a very general approach to multivariate quantiles can be found in Faugeras and Rüschendorf (2017).
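Returning to the first item of Example 4, the characterization (17) of τ-quantiles as minimizers of the expected check loss can be verified numerically on an empirical distribution. In the sketch below the minimization is restricted to the sample points, which suffices because the expected loss is convex and piecewise linear with kinks at the sample points. The function names and the sample are illustrative assumptions.

```python
import math

def check_loss(x, tau):
    """Hemi-norm (15): tau times the positive part plus (1 - tau) times the negative part."""
    return tau * max(x, 0.0) + (1.0 - tau) * max(-x, 0.0)

def expected_check_loss(sample, y, tau):
    """Empirical version of E[rho_tau(Y - y)]."""
    return sum(check_loss(v - y, tau) for v in sample) / len(sample)

def quantile_by_minimization(sample, tau):
    """Minimizer of the expected check loss, searched over the sample points."""
    candidates = sorted(set(sample))
    return min(candidates, key=lambda y: expected_check_loss(sample, y, tau))

def empirical_quantile(sample, tau):
    """Generalized inverse of the empirical distribution function at level tau."""
    ordered = sorted(sample)
    k = math.ceil(tau * len(ordered))
    return ordered[max(k, 1) - 1]

sample = list(range(1, 11))  # hypothetical sample 1, 2, ..., 10
```

The levels 0.25 and 0.75 are chosen in the check below so that the minimizer is unique; when τ times the sample size is an integer, a whole interval of minimizers exists and only its left endpoint equals the empirical quantile.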

At last, we briefly mention some examples of one-sided hemi-metrics which may appear in related contexts.

### Example 5

• Schur-order $$\le_{S}$$ on $$\mathbb {R}^{d}$$:

The majorization, or Schur order $$\le_{S}$$, is useful to compare vectors $$\mathbf {x,y}\in \mathbb {R}^{d}$$ with identical sums w.r.t. their degree of dispersion, see e.g., Marshall et al. (2011). In a natural way, this ordering extends to an ordering on $$\mathcal {M}^{1}(\mathbb {R}^{d})$$, comparing the relative degree of dispersions of two measures. Let $$\mathbf {x, y}\in \mathbb {R}^{d}$$ and let Γ(d) be the set of permutations of {1,…,d}. The Schur-ordering $$\mathbf{x}\le_{S}\mathbf{y}$$ on $$\mathbb {R}^{d}$$ is defined by

$$\begin{array}{@{}rcl@{}} \sum_{k=l}^{d} x_{\gamma(k)}&\le& \sum_{k=l}^{d} y_{\beta(k)}, \quad l=2,\ldots, d,\\ \sum_{k=1}^{d} x_{\gamma(k)}&=& \sum_{k=1}^{d} y_{\beta(k)} \end{array}$$
(23)

where γ,β∈Γ(d) are the decreasing rearrangements of x and y:

$$x_{\gamma(1)}\ge x_{\gamma(2)}\ge \ldots\ge x_{\gamma(d)}, \quad y_{\beta(1)}\ge y_{\beta(2)}\ge \ldots\ge y_{\beta(d)}.$$

$$\le_{S}$$ is a preorder: $$\mathbf{x}\le_{S}\mathbf{y}$$ and $$\mathbf{y}\le_{S}\mathbf{x}$$ only imply that the components of each vector are equal, but not necessarily in the same order. Geometrically, $$\mathbf{x}\le_{S}\mathbf{y}$$ if and only if x is in the convex hull of all vectors obtained by permuting the coordinates of y. When x,y stand for a pair of discrete probability measures on the same set of d points, the norming condition in (23) is satisfied, as the sum is normalized to one.

Say that x and y are Schur-comparable if $$\sum _{i=1}^{d} x_{i}=\sum _{i=1}^{d} y_{i}$$. The degree of dispersion is measured by the following one-sided hemi-metric: for Schur-comparable elements x,y, define

$$d_{+}(\mathbf{x}, \mathbf{y}):=\sup_{l=2,\ldots, d}\left(\sum_{k=l}^{d} [x_{\gamma(k)}-y_{\beta(k)}]\right)_{+}.$$

One has, for Schur-comparable elements:

$$\mathbf{x}\le_{S} \mathbf{y} \text{ iff } d_{+}(\mathbf{x},\mathbf{y})=0.$$

Specialized to discrete probability measures, this gives a one-sided hemi-metric measuring the degree of dispersion or “variance”.

• One-sided Hausdorff hemi-metric on closed subsets:

Let (E,d) be a metric space. Set

$$d_{+}(A,B):=\sup_{y\in A}\,\inf_{x\in B}\ d(x,y).$$
(24)

Then, for closed sets A,B, it holds that d+(A,B)=0 ⇔ A⊂B, and d+ is a one-sided hemi-metric on $$(\mathcal {C}(E),\subset)$$, the set of closed subsets of E.
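For finite, nonempty subsets of a metric space, the one-sided Hausdorff hemi-metric (24) reduces to a max–min computation. The sketch below uses the absolute difference on the real line as default metric; the function name and the example sets are illustrative.

```python
def one_sided_hausdorff(A, B, dist=lambda x, y: abs(x - y)):
    """Hemi-metric (24) for finite nonempty sets: the largest distance
    from a point of A to its nearest point in B."""
    return max(min(dist(x, y) for x in B) for y in A)

A = {0, 1}
B = {0, 1, 5}
```

Here A⊂B, so the hemi-distance from A to B vanishes, while the distance from B to A equals 4, the distance from the point 5 to its nearest neighbour in A.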

## Risk excess measures induced by function classes

### Motivation and definition

For a law invariant, convex risk measure ρ on $$\mathcal {M}^{1}(\mathbb {R}^{d})$$, one has a representation of the form

$$\rho(Q)=\sup_{\nu\in \mathcal{A}}\left(E_{\nu}(X) -\alpha(\nu)\right),$$
(25)

where X∼Q, $$\mathcal {A}$$ is a class of scenario measures and α(ν) is a penalization term, see Föllmer and Schied (2002). This representation suggests considering, for a class $$\mathcal {F}$$ of real functions on E, the following hemi-metric

$$D_{+}^{\mathcal{F}} (Q,P):=\sup_{f\in\mathcal{F}}\left(\int fd(Q-P)\right)_{+}.$$
(26)

Let $$\mathcal {M}^{\mathcal {F}}:=\{P\in \mathcal {M}^{1}(E):\sup _{f\in \mathcal {F}}\left (\int fdP\right)_{+}<\infty \}$$ and define on $$\mathcal {M}^{\mathcal {F}}$$ the preorder

$$P\preceq_{\mathcal{F}} Q \Leftrightarrow \int fdP\le \int fdQ,\quad \forall f\in\mathcal{F}.$$
(27)

Then, $$D^{\mathcal {F}}_{+}$$ is a risk excess measure on $$\left (\mathcal {M}^{\mathcal {F}},\preceq _{\mathcal {F}}\right)$$.

Another motivation comes from the theory of probability metrics, where some metrics on the space of probability measures are defined by duality from a class of functions: $$D_{+}^{\mathcal {F}}$$ in (26) is the natural one-sided analog of the probability metrics $$D^{\mathcal {F}}$$ induced by a functional class $$\mathcal {F}$$,

$$D^{\mathcal{F}}(Q,P)=\sup_{f\in\mathcal{F}}\left|\int fd(Q-P)\right|,$$

which go under the name of probability metrics with a ζ-structure in Rachev (1991) or integral probability metrics in Müller (1997). We are thus naturally inclined to define:

### Definition 4

($$\mathcal {F}$$-induced risk excess measure) The risk excess measure $$D^{\mathcal {F}}_{+}$$ on $$\left (\mathcal {M}^{\mathcal {F}},\preceq _{\mathcal {F}}\right)$$ defined in (26) is called the $$\mathcal {F}$$-induced risk excess measure.
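For finitely supported measures and a finite function class, the supremum in (26) is a maximum over finitely many weighted sums. The sketch below is a minimal implementation under these assumptions: measures are dictionaries mapping points to masses, and the class of half-line indicators used in the example is a hypothetical finite stand-in for a class $$\mathcal {F}$$.

```python
def f_induced_excess(Q, P, functions):
    """Value of (26) for finitely supported Q, P and a finite class of functions."""
    support = set(Q) | set(P)
    best = 0.0  # accounts for the positive part in (26)
    for f in functions:
        integral = sum(f(s) * (Q.get(s, 0.0) - P.get(s, 0.0)) for s in support)
        best = max(best, integral)
    return best

# Hypothetical finite class: indicators of the half-lines [t, infinity), t = 0, 1, 2.
indicators = [lambda x, t=t: 1.0 if x >= t else 0.0 for t in (0, 1, 2)]

Q = {2: 1.0}
P = {0: 0.5, 2: 0.5}
```

With this class, Q puts more mass than P on the half-line starting at 1, which yields a positive excess of Q over P, whereas P shows no excess over Q.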

### Example 6

Example 1 can be regarded as an $$\mathcal {F}$$-induced excess risk measure, by considering $$\mathcal {F}=\{1_{B}:B\in \mathcal {I}(E)\}$$.

### Remark 2

On a probability space $$(\Omega,\mathcal B,\mu)$$, let X be a random variable with image measure $$\mu^{X}=Q$$. By (25), any law-invariant convex coherent risk measure ρ has a representation of the form $$D^{\mathcal {F}}_{+}(Q,\delta _{0})$$ where $$\mathcal {F}=\left \{x\frac {d\nu ^{X}}{d\mu ^{X}}(x), \nu \in \mathcal {A}\right \}$$, μ is an underlying measure dominating $$\mathcal {A}$$, and $$\mu^{X}$$ and $$\nu^{X}$$ are the image measures of μ and ν by X. Indeed,

$$E_{\nu} (X)=\int Xd\nu=\int X \frac{d\nu}{d\mu}d\mu=\int x\frac{d\nu^{X}}{d\mu^{X}} d\mu^{X}=\int x\frac{d\nu^{X}}{d\mu^{X}} dQ.$$

So the notion of risk excess measure can be seen as an extension of the notion of risk measures.

### Extension and restrictions of orders and hemi-metrics

For risk excess measures, an important aspect is to have a kind of consistency w.r.t. some ordering ≤ on E, i.e., $$\mathcal {F}$$ consists of increasing functions w.r.t. ≤. In this respect, the following order extension properties are useful.

### Proposition 1

(Extension and restriction of order)

• If ≼ is a preorder on $$\mathcal {M}^{1}(E)$$, then the relation $$\le_{r}$$, defined, for x,y∈E, by

$$x\le_{r} y \Leftrightarrow \delta_{x}\preceq\delta_{y},$$
(28)

defines a preorder on E. $$\le_{r}$$ is called the restriction of the preorder ≼ on $$\mathcal {M}^{1}(E)$$.

• Conversely, if ≤ is a preorder on E, then the stochastic order $$\preceq_{st}$$ defines a partial order on $$\mathcal {M}^{1}(E)$$, such that its restriction $$\le_{r}$$ is identical to ≤.

### Proof

• The proof follows by direct verification.

• By definition, we have

$$\begin{array}{@{}rcl@{}} x\le_{r} y &\Leftrightarrow&\delta_{x}\preceq_{st}\delta_{y} \Leftrightarrow 1_{B}(x)\le 1_{B}(y), \forall B\in\mathcal{I}(E)\\ &\Leftrightarrow& [x\in B\Rightarrow y\in B, \forall B\in\mathcal{I}(E)]. \end{array}$$
(29)

In particular, restricted to principal up-sets $$B=\{z\}^{\uparrow}$$, the implication (29) becomes

$$x\ge z\Rightarrow y\ge z, \text{ for all } z\in E,$$

which is equivalent to x≤y (take z=x). Therefore, $$x\le_{r} y\Rightarrow x\le y$$. Conversely, if x≤y, (29) is satisfied, by definition of an up-set.

### Remark 3

For a closed partial order ≤ on a Polish space E, the result follows directly from Strassen's theorem (see Example 1).

Analogously, we can also extend and restrict in a consistent way the discrete one-sided hemi-metric $$d_{+}^{\le }$$ of Example 3, Eq. (12) into the risk excess measure

$$D_{+}^{st}(Q,P)=\sup\left\{\left(Q(B)-P(B)\right)_{+}; B\in\mathcal{I}(E)\right\}$$

of Example 1.

### Proposition 2

(Extension and restriction of discrete hemi-metrics)

• If D+ is a risk excess measure on $$\left (\mathcal {M}^{1}(E),\preceq \right)$$, then

$$d_{+}^{r}(x,y):=D_{+}(\delta_{x},\delta_{y})$$

defines a one-sided hemi-metric on (E,≤ r ), called the restriction of D+ on E.

• If $$d_{+}^{\le }$$ is the discrete hemi-metric on (E,≤) of (12), then $$D_{+}^{st}$$ is an extension of $$d_{+}^{\le }$$ into a risk excess measure on $$\left(\mathcal {M}^{1}(E),\preceq_{st}\right)$$ such that the restriction $$d_{+}^{r}$$ of $$D_{+}^{st}$$ is equal to $$d_{+}^{\le }$$.

### Proof

• The proof follows by direct verification and Proposition 1.

• The restriction of $$D_{+}^{st}$$ to E writes

$$d_{+}^{r}(x,y):=D_{+}^{st}(\delta_{x},\delta_{y})=\sup\{\left(1_{B}(x)-1_{B}(y)\right)_{+}; B\in\mathcal{I}(E)\},$$

which is {0,1}-valued and a one-sided hemi-metric on E by Proposition 2 part 1. By Proposition 1 part 2,

$$d_{+}^{r}(x,y)=0\Leftrightarrow x\le_{r} y \Leftrightarrow x\le y.$$

Therefore, $$d_{+}^{r}(x,y)=1_{x\nleq y}=d^{\le }_{+}(x,y).$$

### Remark 4

The construction of the previous proposition, based on the $$D_{+}^{st}$$ of Example 1, which encodes the order ≤ into $$\preceq_{st}$$, is consistent w.r.t. the order ≤, in the sense that the restriction of $$D_{+}^{st}$$ is the discrete one-sided hemi-metric $$d_{+}^{r}=d^{\le }_{+}$$, which encodes the original order ≤. However, for a one-sided hemi-metric d+ on (E,≤) different from the discrete one, the extension $$D_{+}^{st}$$ is in general inconsistent w.r.t. the hemi-metric d+, in the sense that the restriction of the risk excess measure $$D_{+}^{st}$$ is not the original d+ but is again the discrete one-sided hemi-metric $$d_{+}^{\le }$$. This is illustrated in the following diagram. The question of consistently extending/restricting a one-sided hemi-metric d+ into a risk excess measure D+, according to the diagram, will be treated by mass transportation in Section 4.

It is interesting to observe that, in general, there may exist many extensions of a one-sided hemi-metric on E to a risk excess measure on $$\mathcal {M}^{1}(E)$$, as seen in the following example. We will discuss some general extensions in Section 4.

### Example 7

(Positive orthant ordering) On $$E=\mathbb {R}^{d}$$, consider the class $$\mathcal {F}_{uo}$$ of upper orthant indicators,

$$\mathcal{F}_{uo}:= \left\{1_{[\mathbf{z},\infty)}, \mathbf{z}\in\mathbb{R}^{d}\right\}= \left\{1_{\{\mathbf{z}\}^{\uparrow}}, \mathbf{z}\in \mathbb{R}^{d}\right\}.$$

$$\mathcal {F}_{uo}$$ induces on $$\mathcal {M}^{1}(E)$$ the upper orthant ordering $$\preceq_{uo}$$ defined by

$$Q\preceq_{uo} P \Leftrightarrow \overline{F}(\mathbf{z})\le \overline{G}(\mathbf{z}), \forall \mathbf{z}\in\mathbb{R}^{d},$$

where $$\overline {F}(\mathbf {z})=Q([\mathbf {z},\infty))$$ and $$\overline {G}(\mathbf {z})=P([\mathbf {z},\infty))$$ stand for the survival functions of Q and P. So it will be easier for Q to be less risky than P for this order than for the stochastic order, where the comparison has to be made for all increasing sets. The $$\mathcal {F}_{uo}$$-induced risk excess measure $$D_{+}^{\mathcal {F}_{uo}}$$ is given by

$$D_{+}^{uo}(Q,P):=D^{\mathcal{F}_{uo}}_{+}(Q,P)=\sup_{\mathbf{z}\in\mathbb{R}^{d}}(\overline F(\mathbf{z})-\overline G(\mathbf{z}))_{+}.$$

Note that the restriction $$\le_{uo}$$ on $$E=\mathbb {R}^{d}$$ of the partial order $$\preceq_{uo}$$ in the sense of Proposition 1 is identical to the usual componentwise ordering, i.e., $$\le_{uo}=\le$$. The restriction $$d^{uo}_{+}$$ of the risk excess measure $$D_{+}^{{uo}}$$ in the sense of Proposition 2 is the discrete one-sided hemi-metric $$d_{+}^{\le }$$ (see Example 3 and (12)):

$$d_{+}^{uo}(\mathbf{x},\mathbf{y}):=D_{+}^{uo}(\delta_{\mathbf{x}},\delta_{\mathbf{y}}) =\left\{\begin{array}{ll} 0 &\text{if}~ \mathbf{x\le y}\\ 1 &\text{if} ~\mathbf{x\nleq y} \end{array}\right.=d^{\le}_{+}(\mathbf{x},\mathbf{y}).$$
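For finitely supported measures, the supremum defining $$D_{+}^{uo}$$ only needs to be evaluated on the grid spanned by the coordinates of the support points, since the survival functions are constant between grid points. The sketch below relies on this; the dictionary representation of measures and the function names are assumptions.

```python
from itertools import product

def survival(mu, z):
    """Mass that mu assigns to the closed upper orthant [z, infinity)."""
    return sum(w for s, w in mu.items()
               if all(si >= zi for si, zi in zip(s, z)))

def upper_orthant_excess(Q, P):
    """Largest positive gap between the survival functions of Q and P."""
    support = list(set(Q) | set(P))
    dim = len(support[0])
    # candidate thresholds: all combinations of coordinates seen in the support
    grids = [sorted({s[i] for s in support}) for i in range(dim)]
    gap = max(survival(Q, z) - survival(P, z) for z in product(*grids))
    return max(0.0, gap)

# Q: countermonotone coupling, P: comonotone coupling of two fair coin flips.
Q = {(0, 1): 0.5, (1, 0): 0.5}
P = {(0, 0): 0.5, (1, 1): 0.5}
```

For this pair, Q has no upper orthant excess over P. In contrast, the increasing set generated by the points (0,1) and (1,0) has Q-mass 1 and P-mass 1/2, so $$D_{+}^{st}(Q,P)\ge 1/2$$: being less risky is indeed easier for the upper orthant order than for the stochastic order.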

As a consequence, both risk excess measures $$D_{+}^{{uo}}$$ and $$D_{+}^{{st}}$$ of Example 1 induce the same componentwise ordering ≤ on $$E=\mathbb {R}^{d}$$ and also induce the same restriction as hemi-metric on E. $$D_{+}^{uo}$$ and $$D_{+}^{st}$$ are both extensions of the same discrete one-sided hemi-metric $$d_{+}^{\le }$$ on E from Example 3 (a), as is illustrated in the diagram below:

### Example 8

(Increasing convex ordering) On $$E=\mathbb {R}$$, consider the class of excess functions $$\mathcal {F}_{icx}:=\{\pi _{t}, t\in \mathbb {R}\}$$, with $$\pi_{t}(x):=(x-t)_{+}$$. Then, on the class of distributions $$\mathcal {M}^{1}_{1}$$ with finite first moment, the induced ordering $$\preceq _{\mathcal {F}_{icx}}$$ is identical to the increasing convex order,

$$\preceq_{\mathcal{F}_{icx}}=\preceq_{icx}.$$

For X∼Q and Y∼P in $$\mathcal {M}^{1}_{1}$$, the generated risk excess measure $$D_{+}^{\mathcal {F}_{icx}}$$ is given by

$$D_{+}^{icx}(Q,P):=D_{+}^{\mathcal{F}_{icx}}(Q,P)=\sup_{t\in\mathbb{R}} \left(\Pi_{X}(t)-\Pi_{Y}(t)\right)_{+},$$
(30)

where $$\Pi_{X}(t):=E(X-t)_{+}=E\pi_{t}(X)$$ and $$\Pi_{Y}(t):=E(Y-t)_{+}=E\pi_{t}(Y)$$ are the mean excess functions. $$D_{+}^{icx}$$ measures the risk excess of Q w.r.t. P in terms of the corresponding mean excess functions. When restricted to the class of probability measures with identical first moments, $$\preceq _{\mathcal {F}_{icx}}$$ is also identical to the convex ordering,

$$\preceq_{\mathcal{F}_{icx}}=\preceq_{icx}=\preceq_{cx}.$$
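For empirical distributions, the supremum in (30) can be evaluated exactly: the difference of the two mean excess functions is piecewise linear in t, with kinks at the sample points, and constant outside their range. The following is a minimal sketch under the assumption of equally weighted samples; the function names are illustrative and not part of the paper.

```python
def stop_loss(sample, t):
    """Empirical Pi(t) = E(X - t)_+ for an equally weighted sample."""
    return sum(max(x - t, 0.0) for x in sample) / len(sample)


def d_icx_plus(sample_q, sample_p):
    """Empirical version of (30): sup_t (Pi_X(t) - Pi_Y(t))_+.

    The difference is piecewise linear with kinks at the sample points and
    constant outside their range, so evaluating it at the pooled sample
    points is enough.
    """
    knots = sorted(set(sample_q) | set(sample_p))
    best = max(stop_loss(sample_q, t) - stop_loss(sample_p, t) for t in knots)
    return max(best, 0.0)
```

For instance, with Q uniform on {0, 4} and P the point mass at 2 (identical first moments), `d_icx_plus([0.0, 4.0], [2.0, 2.0])` is positive while `d_icx_plus([2.0, 2.0], [0.0, 4.0])` vanishes, in line with the point mass being smaller in convex order.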

In this example, the restriction $$d_{+}^{icx}$$ of $$D_{+}^{icx}$$ is

$$d_{+}^{icx}(x,y):=D_{+}^{icx}(\delta_{x},\delta_{y})=\sup_{t\in\mathbb{R}} \left(\pi_{t}(x)-\pi_{t}(y)\right)_{+}.$$

On the one hand,

$$\begin{array}{@{}rcl@{}} d_{+}^{icx}(x,y)=0&\Leftrightarrow& \pi_{t}(x)\le\pi_{t}(y), \forall t\in\mathbb{R}\\ &\Leftrightarrow& [x\ge t \Rightarrow y\ge t], \forall t\in\mathbb{R} \\ &\Leftrightarrow& x\le y. \end{array}$$

On the other hand, if $$x>y$$, then $$d_{+}^{icx}(x,y)=\sup _{t\in \mathbb {R}} \left (\pi _{t}(x)-\pi _{t}(y)\right)$$. By considering the three cases $$t\le y$$, $$y\le t\le x$$, and $$x\le t$$, one sees that the supremum takes the value $$x-y$$. Hence, the restriction $$d_{+}^{icx}$$ of $$D_{+}^{icx}$$ is given by

$$d_{+}^{icx}(x,y)=(x-y)_{+}=d_{+}^{b}(x,y),$$

which is the basic one-sided hemi-metric of (13).

## Risk excess measures for random variables and minimal extension by mass transportation

### Compound risk excess measures

So far we have considered risk excess measures as one-sided hemi-metrics on the space of probability distributions, i.e., as a mapping $$D_{+}:\mathcal {M}\times \mathcal {M}\mapsto [0,\infty ]$$, for $$\mathcal {M}\subset \mathcal {M}^{1}(E)$$, acting on a pair (Q,P) of probability measures on E. Like for risk measures $$\rho :\mathfrak {X}\mapsto \mathbb {R}$$ defined on a space of random variables $$\mathfrak {X}\subset \mathfrak {L}^{0}_{E}=\mathfrak {L}^{0}_{E}(\Omega,\mathcal {A},\mu):=\{X:\Omega \to E\}$$ (see e.g., Föllmer and Schied (2002)), it is natural to define risk excess measures $$D_{+}:\mathfrak {X}\times \mathfrak {X}\mapsto \mathbb {R}$$, also on a space $$\mathfrak {X}$$ of random variables.

This allows one to consider the risk of a random element $$X$$ in $$E$$ as a relative property: there is a joint modeling of the vector $$(X,Y)\in \mathfrak {X}^{2}$$, defined on a common probability space $$(\Omega,\mathcal { A}, \mu)$$, so that the risk of $$X:\Omega\to E$$ can be considered in relation to the random element $$Y:\Omega\to E$$, regarded as a benchmark. In the context of insurance and financial mathematics, Y can stand for the value of an alternative portfolio, of a hedge, of a market indicator, or the wealth of an insurer. For example, an insurer, facing the prospect of losing a claim amount X, may wish to evaluate its perceived risk with respect to its reserve capital Y: the “risk” X does not have the same potential consequences whether Y is small or large compared to X. In the same vein of reasoning, because of the fluctuating and (usually) inflating nature of fiat money in the post-1973, petro-dollar based, current monetary system, one may be interested in evaluating the value of a financial asset X w.r.t. the price of a commodity Y considered as a standard, like gold or oil, whose supply is limited in essence.

For $$\mathfrak {X}\subset \mathfrak {L}^{0}_{E}=\mathfrak {L}^{0}_{E}(\Omega,\mathcal {A},\mu)$$ a set of random variables on $$(\Omega,\mathcal {A},\mu)$$ with values in (E,≤), we consider the pointwise ordering on $$\mathfrak {X}$$ induced by ≤. We identify random elements in $$\mathfrak {L}^{0}_{E}$$ which are identical a.s., and similarly $$X\le Y$$ means that $$X\le Y$$ μ-a.s.

### Definition 5

(Risk excess measure on $$\mathfrak {X}$$) For $$\mathfrak {X}\subset \mathfrak {L}^{0}_{E}$$, a risk excess measure D+ on $$\mathfrak {X}$$ is a one-sided hemi-metric on $$\mathfrak {X}$$.

### Definition 6

(Compound risk excess measure on $$\mathfrak {X}$$) A risk excess measure $$D_{+}^{c}$$ on $$\mathfrak {X}$$ is called a compound risk excess measure on $$\mathfrak {X}$$ if $$D_{+}^{c}(X,Y)$$ depends only on the joint distribution $$\mu^{(X,Y)}$$ of (X,Y).

### Example 9

• An example of a risk excess measure on $$\mathfrak {X}$$ which is not compound is

$$D_{+}(X,Y):=\sup_{\omega\in \Omega}(X(\omega)-Y(\omega))_{+}.$$

However, since random elements in $$\mathfrak {L}^{0}_{E}$$ which are identical μ-a.s. are identified, it is natural to consider only compound risk excess measures, e.g., the essential supremum version

$$D_{+}(X,Y):=\text{esssup}_{\mu}(X-Y)_{+}$$

• On $$(\Omega,\mathcal {A},\mu)$$, let $$A_{0}\in \mathcal {A}$$, with $$0<\mu(A_{0})<1$$, be a class of scenarios considered as “low risk”, while its complement $$A_{1}:=\Omega\setminus A_{0}$$ is considered as “high risk”. Then, for some safety coefficient $$\alpha>1$$,

$$D_{+}(X,Y):=\text{esssup}_{\mu,A_{0}}(X-Y)_{+}+{\alpha}\, \text{esssup}_{\mu,A_{1}}(X-Y)_{+},$$

with $$\text {esssup}_{\mu,A}(X-Y)_{+}:=\inf \{c\in \mathbb {R};\mu (\{(X-Y)_{+}\ge c\}\cap A)=0\}$$, or

$$D_{+}(X,Y):=\int_{A_{0}}(X-Y)_{+}d\mu+\alpha\int_{A_{1}}(X-Y)_{+}d\mu,$$

define non-compound risk excess measures, which value the risk excess $$(X-Y)_{+}$$ α times more in the high-risk scenarios than in the low-risk ones.

### Remark 5

• The notation $$D_{+}^{c}$$ in Definition 6 stresses that $$D_{+}^{c}$$ depends on the joint distribution $$\mu^{(X,Y)}$$ and not solely on the marginals $$\mu^{X},\mu^{Y}$$ of (X,Y), as is the case in Definition 3. See also Zolotarev (1997) and Rachev (1991) for the similar notion of compound probability metric. For risk measures ρ(X) on $$\mathfrak X$$, there is the analogous notion of law-invariant risk measures, which depend only on the law $$\mu^{X}$$ of the random variable.

• There are two main reasons why compound risk excess measures on $$\mathfrak {X}$$ are of particular importance. Firstly, they allow one to define extensions as risk excess measures $$D_{+}:\mathcal {M}\times \mathcal {M}\to [0,\infty ]$$ on subclasses $$\mathcal {M}\subset \mathcal {M}^{1}(E)$$ defined by the induced set of distributions of elements of $$\mathfrak {X}$$ (see Section 4.3). Secondly, the fact that they depend only on the joint distribution $$\mu^{(X,Y)}$$ makes it possible to estimate the risk excess $$D_{+}(X,Y)$$ statistically by its empirical analog. This property is most relevant for the application of risk excess measures.

• Like in the case of probability metrics, it is also possible to describe compound risk excess measures formally on the subclass $$\mathcal {M}^{(2)}$$ of bivariate laws $$\mu^{(X,Y)}$$ for $$X,Y\in \mathfrak {X}$$. For details in the case of probability metrics, see Rachev (1991).

### Construction of a compound risk excess measure from a one-sided hemi-metric d+ on E

There is a natural way to construct such a compound risk excess measure on a set $$\mathfrak {X}$$ of r.v. in (E,≤): let $$d_{+}$$ be a one-sided hemi-metric on (E,≤), and let $$\mathfrak {X}$$ be the set of random variables X s.t. there exist $$x,y\in E$$ s.t. $$E d_{+}(X,x)<\infty$$ and $$E d_{+}(y,X)<\infty$$. The excess risk of X w.r.t. Y is measured by $$d_{+}(X,Y)$$. The latter can be turned into a deterministic value, e.g., by taking its expectation, so that one obtains a hemi-metric on $$\mathfrak {X}$$,

$$D^{c}_{+}(X,Y):={Ed}_{+}(X,Y).$$
(31)

Note that (31) depends only on the joint distribution of (X,Y): it is indeed a compound risk excess measure defined on a space $$\mathfrak {X}$$ of random variables.
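Since (31) depends only on the joint law of (X,Y), it has a direct empirical analog: the average of $$d_{+}$$ over paired observations. The sketch below illustrates this for the basic hemi-metric on the real line and for the discrete hemi-metric of the componentwise order; the function names are hypothetical and the paired observations are assumed to be equally weighted.

```python
def compound_excess(pairs, d_plus):
    """Empirical analog of (31): average of d_plus(x, y) over paired observations."""
    return sum(d_plus(x, y) for x, y in pairs) / len(pairs)


def d_basic(x, y):
    """Basic one-sided hemi-metric (x - y)_+ on the real line."""
    return max(x - y, 0.0)


def d_discrete(x, y):
    """Discrete one-sided hemi-metric for the componentwise order on R^d:
    0 if x <= y componentwise, 1 otherwise."""
    return 0.0 if all(a <= b for a, b in zip(x, y)) else 1.0
```

With `d_discrete`, the estimator is the observed frequency of the event that X is not componentwise below Y.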

Indeed, one has:

### Proposition 3

For any measurable one-sided hemi-metric d+ on (E,≤), (31) defines a finite one-sided compound risk excess measure on $$\mathfrak {X}$$.

### Proof

For all $$X,Y\in \mathfrak {X}$$, there exist $$x,y\in E$$ s.t. $$E d_{+}(X,x)<\infty$$ and $$E d_{+}(y,Y)<\infty$$. Hence, by the triangle inequality,

$${Ed}_{+}(X,Y)\le {Ed}_{+}(X,x)+d_{+}(x,y)+{Ed}_{+}(y,Y)<\infty.$$

Equation (31) is therefore well defined and is obviously a compound risk excess measure. For the one-sidedness property, $$X\le Y$$ a.s. $$\Leftrightarrow d_{+}(X,Y)=0$$ a.s. $$\Leftrightarrow D_{+}^{c}(X,Y)=0$$ follows from the one-sidedness and non-negativity of $$d_{+}$$. □

### Remark 6

Formula (31) gives a natural way to obtain a compound excess risk measure from a one-sided hemi-metric $$d_{+}$$ on the ambient space E. Note that not all compound excess risk measures can be written in this form. For example, let $$(d_{+,i})_{i\in I}$$ be a countable family of one-sided hemi-metrics on E, then

$$D_{+}^{c}(X,Y):=\sup_{i\in I} E d_{+,i}(X,Y)$$

defines a compound excess risk measure which can not be written as in (31) for some d+.

### Minimal extension of a compound risk excess measure

A compound risk excess measure $$D_{+}^{c}$$, depending on the joint distribution $$\mu^{(X,Y)}$$, can be turned by mass transportation into a risk excess measure on $$\mathcal {M}^{1}(E)$$, i.e., depending only on the pair of marginals $$\mu^{X},\mu^{Y}$$, where $$\mathcal {M}^{1}(E)$$ is supplied with the stochastic ordering $$\preceq_{st}$$ consistent with the underlying order ≤ on $$\mathfrak {X}$$.

### Definition 7

Let $$D_{+}^{c}$$ be a compound risk excess measure. The minimal extension $$D^{inf}_{+}$$ on $$\mathcal {M}^{1}(E)$$ of $$D_{+}^{c}$$ by mass transportation is given by

$$\begin{array}{@{}rcl@{}} D^{inf}_{+}(Q,P)&:=&\inf_{X,Y\in\mathfrak{X}, X\sim Q, Y\sim P}D^{c}_{+}(X,Y). \end{array}$$
(32)

The fact that $$D_{+}^{inf}$$ is indeed a one-sided risk excess measure on the space of probability measures is given in the following proposition:

### Proposition 4

• If (E,≤) is a Polish space with a closed partial order, and if $$D_{+}^{c}$$ is weakly lower-semicontinuous, in the sense that

$$(X_{n},Y_{n})\stackrel{d}{\to}(X,Y)\Rightarrow D_{+}^{c}(X,Y)\le \liminf D_{+}^{c}(X_{n},Y_{n}),$$
(33)

then $$D^{inf}_{+}$$ is a one-sided risk excess measure on $$(\mathcal {M}^{1}(E),\preceq _{st})$$, where $$\preceq_{st}$$ is the stochastic order.

• If $$D_{+}^{c}(X,Y)={Ed}_{+}(X,Y)$$, as in (31), for d+ a lower semi continuous one-sided hemi-metric on (E,≤), then $$D^{inf}_{+}$$ is a one-sided risk excess measure on $$\left (\mathcal {M}^{1}(E),\preceq _{st}\right)$$.

### Proof

• (A1) is obvious. (A2) follows from the fact that $$D_{+}^{c}$$ satisfies (A2): for $$X\sim Q$$, $$0 \le D^{inf}_{+}(Q,Q)\le D^{c}_{+}(X,X)=0$$. Regarding (A3): for $$(\Omega,\mathcal {A},\mu)$$ a non-atomic probability space and E a Polish space, any bivariate measure $$\alpha \in \mathcal {M}^{1}(E^{2})$$ can be obtained as the image measure of μ by some measurable mapping, see e.g., Berkes and Philipp (1979). Therefore, for all $$\epsilon>0$$, there exist random variables $$(X,Y_{1})\sim \alpha=\alpha_{QP}$$, where $$\alpha \in \mathcal {M}^{1}\left (E^{2}\right)$$ has marginals Q,P, and there exist random variables $$(Y_{2},Z)\sim \beta=\beta_{PR}$$ with marginals P,R s.t.

$$D_{+}^{inf}(Q,P)+\frac{\epsilon}{2}\ge D_{+}^{c}(X,Y_{1}),\quad \text{and }D_{+}^{inf}(P,R)+\frac{\epsilon}{2}\ge D_{+}^{c}(Y_{2},Z).$$

By the gluing lemma, see e.g., Villani (2003), p. 208, there exists a trivariate measure $$\gamma=\gamma_{QPR}$$ s.t. its projection on the first two marginals is α and its projection on the last two marginals is β. In addition, γ can be obtained as the image measure of μ for some measurable mapping. In other words, there exists a joint construction of a random vector $$(\tilde X,\tilde Y,\tilde Z)$$ on the probability space $$(\Omega,\mathcal {A},\mu)$$ s.t. $$\mu ^{\tilde X,\tilde Y,\tilde Z}=\gamma$$ and

$$D_{+}^{inf}(Q,P)+\frac{\epsilon}{2}\ge D_{+}^{c}\left(\mu^{\tilde X,\tilde Y}\right),\quad \text{and }D_{+}^{inf}(P,R)+\frac{\epsilon}{2}\ge D_{+}^{c}\left(\mu^{\tilde Y,\tilde Z}\right).$$
(34)

By (A3) for the compound risk excess $$D_{+}^{c}$$,

$$D_{+}^{c}\left(\mu^{\tilde X\tilde Z}\right)\le D_{+}^{c}\left(\mu^{\tilde X\tilde Y}\right)+D_{+}^{c}\left(\mu^{\tilde Y\tilde Z}\right)$$

which gives with (34),

$$D_{+}^{inf}(Q,R)\le D_{+}^{c}\left(\mu^{\tilde X\tilde Z}\right)\le D_{+}^{inf}(Q,P)+D_{+}^{inf}(P,R)+ \epsilon.$$

Letting $$\epsilon\to 0$$ gives (A3) for $$D_{+}^{inf}$$.

For the one-sidedness property (A4), if $$D^{inf}_{+}(Q,P)=0$$, then there exists a sequence $$(X_{n},Y_{n})$$ of random variables on $$(\Omega,\mathcal {A}, \mu)$$, all with fixed marginals Q,P, s.t. $$D_{+}^{c}(X_{n}, Y_{n})\to 0$$. Since $$\mathcal {M}^{1}(Q,P)$$, the set of probability measures on $$E\times E$$ with marginals Q,P, is weakly compact in $$\mathcal {M}^{1}\left (E^{2}\right)$$, one can extract a subsequence $$n'$$ s.t. $$(X_{n'},Y_{n'})\stackrel {d}{\to }(X,Y)$$ for some (X,Y) with marginals Q,P. By the assumption on $$D_{+}^{c}$$,

$$D_{+}^{c}(X,Y) \le \liminf D_{+}^{c}(X_{n},Y_{n})=0$$

which entails $$X\le Y$$ μ-a.s. by (A4’). The latter is equivalent to $$Q\preceq_{st}P$$ by Strassen's theorem (see Theorem 1.18 in Rüschendorf (2013)). The converse is obvious.

• If $$(X_{n},Y_{n})\stackrel{d}{\to}(X,Y)$$, then by Skorohod’s representation theorem, there exist $$(\tilde X_{n},\tilde Y_{n})\stackrel {a.s.}{\to }(\tilde X,\tilde Y)$$, with $$(\tilde X_{n},\tilde Y_{n})\stackrel {d}{=}(X_{n}, Y_{n})$$, $$(\tilde X,\tilde Y)\stackrel {d}{=}(X, Y)$$. Therefore, lower semi-continuity of $$d_{+}$$ and Fatou’s lemma entail

$$\begin{array}{@{}rcl@{}} D_{+}^{c}(X, Y)&=&{Ed}_{+}(\tilde X,\tilde Y)\le E[\liminf d_{+}(\tilde X_{n},\tilde Y_{n})]\\ &\le& \liminf {Ed}_{+}(\tilde X_{n},\tilde Y_{n}) =\liminf D_{+}^{c}(X_{n},Y_{n}), \end{array}$$

i.e., (33) is satisfied.

### Dual representations of minimal extensions

Define $$L^{1}:=L^{1}(\{P,Q\})$$ as the set of functions $$f:E\to \mathbb {R}$$ integrable w.r.t. P and Q, $$C_{b}$$ as the set of bounded continuous functions $$f:E\to \mathbb {R}$$, and $$Lip^{1}=Lip^{1}(E,d_{+})$$ as the set of 1-Lipschitz functions $$f:E\to \mathbb {R}$$ w.r.t. $$d_{+}$$, i.e., s.t. for all $$x,y\in E$$,

$$f(y)-f(x)\le d_{+}(y,x)$$

holds. Note that for $$f\in Lip^{1}(E,d_{+})$$ and $$y\le x$$, we have $$f(y)-f(x)\le d_{+}(y,x)=0$$, i.e., f is increasing w.r.t. the order induced by $$d_{+}$$ on E. Hence, $$Lip^{1}(E,d_{+})$$ is a subset of the set of increasing functions.

For a compound excess risk measure $$D_{+}^{c}$$ of the kind in (31), the minimal extension $$D_{+}^{inf}$$ on $$\mathcal {M}^{1}(E)$$ of $$D_{+}^{c}$$ by mass transportation, as in (32), admits a representation as a $$\mathcal {F}$$-induced risk excess measure, as in (26), which is given by the following Kantorovich–Rubinstein-type theorem for hemi-metrics:

### Theorem 1

(Kantorovich–Rubinstein theorem for minimal risk excess measure) On a Polish space E, supplied with a closed order ≤, and a lower semi-continuous one-sided hemi-metric d+, the minimal extension $$D_{+}^{inf}$$ of the compound risk excess measure $$D^{c}_{+}(X,Y)={Ed}_{+}(X,Y)$$ has the dual form

$$\begin{array}{@{}rcl@{}} D^{inf}_{+}(Q,P)&=&\sup_{f\in Lip^{1}\cap L^{1}} \left(\int fd(Q-P)\right)_{+} \\ &=&\sup_{f\in Lip^{1}\cap C_{b}} \left(\int fd(Q-P)\right)_{+}. \end{array}$$
(35)

In other words, $$D^{inf}_{+}$$ is identical to a $$\mathcal {F}$$-induced risk excess measure $$D_{+}^{\mathcal {F}}$$of (26), with $$\mathcal {F}=Lip^{1}_{b}$$, the class of bounded Lipschitz functions w.r.t. d+.

### Proof

The proof is similar to the method used to prove the Kantorovich–Rubinstein theorem for metric spaces, see e.g., Rachev and Rüschendorf (1998), Villani (2003), with some slight modifications. Let $$\mathcal {M}^{1}(Q,P)$$ be the set of probability measures π on $$E\times E$$ with marginals Q,P. For $$(f,g)\in L_{1}(Q)\times L_{1}(P)$$, set

$$J(f,g):=\int fdQ+\int gdP.$$

Let

$$\Phi_{d_{+}}:=\left\{(f,g) \in L_{1}(Q)\times L_{1}(P) ; f(x)+g(y)\le d_{+}(x,y) \text{, for all } x,y\in E\right\},$$

and $$\mathcal {C}_{b}^{2}$$ be the set of pairs of real-valued functions (f,g) which are continuous and bounded. Set

$$S(Q,P):=\sup_{\Phi_{d_{+}}}J(f,g).$$
(36)
• Step one: One has the easy inequality,

$$D_{+}^{Lip^{1}\cap L^{1}}(Q,P)\le D_{+}^{inf}(Q,P).$$
(37)

Indeed, for all $$f\in Lip^{1}(d_{+})\cap L^{1}$$ and $$\pi \in \mathcal {M}^{1}(Q,P)$$,

$$\begin{array}{@{}rcl@{}} \left(\int f(x)Q(dx)-\int f(y)P(dy)\right)_{+}&=&\left(\int (f(x)-f(y))\pi(dx,dy)\right)_{+}\\ &\le& \int d_{+}(x,y) \pi(dx,dy). \end{array}$$

Taking the inf on the right and the sup on the left entails the stated inequality (37).

• Step two: Kantorovich’s duality, $$D^{inf}_{+}(Q,P)= S(Q,P)=\sup _{\Phi _{d_{+}}}J(f,g)$$.

Since d+≥0 is l.s.c., this follows from Rachev and Rüschendorf (1998) in Theorem 2.3.1 (b) or Villani (2003) in Theorem 1.3.

• Step three: in view of the first two steps, it remains to show that

$$D_{+}^{Lip^{1}\cap L_{1}(Q)}(Q,P)\ge D_{+}^{inf}(Q,P),$$

i.e., that

$$\sup_{f\in Lip^{1}\cap L_{1}(Q)} \left(\int fd(Q-P)\right)_{+}\ge \sup_{\Phi_{d_{+}}}J(f,g).$$

Assume that d+ is bounded.

For f continuous bounded, define the $$d_{+}$$-convex conjugate of f by

$$f^{*}(y):=\inf_{x\in E}\{d_{+}(x,y)-f(x)\}.$$

One obviously has $$f(x)+f^{*}(y)\le d_{+}(x,y)$$, for all $$x,y\in E$$. Therefore, if $$x\mapsto d_{+}(x,y)$$ is bounded l.s.c. and $$f\in C_{b}$$, then $$f^{*}$$ is well defined and bounded.

Moreover, by the triangle inequality, one also has

$$d_{+}(x,y)-f(x)\le d_{+}(x,y')+d_{+}(y',y)-f(x).$$

Taking the infimum on x on both sides yields

$$f^{*}(y)-f^{*}(y')\le d_{+}(y',y)=d_{-}(y,y'),$$

where $$d_{-}$$ is the opposite dual hemi-metric defined in (8): $$f^{*}$$ is $$d_{-}$$-Lipschitz.

Note that if $$f(x)+g(y)\le d_{+}(x,y)$$ for all x,y, then $$f^{*}(y)\ge g(y)$$.

Define the double conjugate by

$$\begin{array}{@{}rcl@{}} f^{**}(x)&:=&\inf_{y\in E}\{d_{+}(x,y)-f^{*}(y)\}. \end{array}$$

One has $$f^{**}(x)\ge f(x)$$: by definition,

$$\begin{array}{@{}rcl@{}} f^{**}(x)&=&\inf_{y\in E}\sup_{x'}\left\{d_{+}(x,y)-d_{+}(x',y)+f(x')\right\}\\ &\ge& f(x), \end{array}$$

by taking $$x'=x$$ in the last equation.

Moreover, $$f^{**}$$ is this time $$d_{+}$$-Lipschitz: the triangle inequality $$d_{+}(x,y)-f^{*}(y)\le d_{+}(x,x')+d_{+}(x',y)-f^{*}(y)$$ yields, by taking the infimum on y, $$f^{**}(x)-f^{**}(x')\le d_{+}(x,x')$$.

We obtain: $$f^{**}(x)=\inf_{y}\{d_{+}(x,y)-f^{*}(y)\}\le -f^{*}(x)$$ by taking $$y=x$$. On the other hand, since $$f^{*}$$ is 1-Lipschitz w.r.t. $$d_{-}$$, one has

$$-f^{*}(x)\le d_{+}(x,y)-f^{*}(y),$$

which yields $$-f^{*}(x)\le f^{**}(x)$$. Hence, $$f^{**}=-f^{*}$$.

Denoting $$\phi:=-f^{*}$$, and since $$f^{*}$$ is $$d_{-}$$-Lipschitz, $$\phi$$ is $$d_{+}$$-Lipschitz (and bounded, thus integrable). In view of all of the above, $$(f,g)\in \Phi _{d_{+}}\cap \mathcal {C}_{b}^{2}$$ implies $$(f^{**},f^{*})\in \Phi _{d_{+}}$$ and $$J(f,g)\le J(f^{**},f^{*})=J(\phi,-\phi)$$. Hence,

$$\sup_{\Phi_{d_{+}}\cap \mathcal{C}_{b}^{2}} J(f,g)\le \sup_{\phi\in Lip^{1}\cap L_{1}(Q)} J(\phi,-\phi)\le \sup_{\phi\in Lip^{1}\cap L_{1}(Q)} \left(\int \phi d(Q-P)\right)_{+},$$
(38)

which had to be proved.

Combining (37) with (38), yields the desired result for the case of a bounded hemi-metric d+.

• Step 4: One can remove the assumption that d+ is bounded. For d+ a general l.s.c. hemi-metric, one can reason as in Villani (2003) in Theorem 1.3, step 3 with $$d^{n}_{+}=d_{+}/(1+n^{-1}d_{+})$$, so that $$0\le d_{+}^{n}\le d_{+}$$ and $$d^{n}_{+}\uparrow d_{+}$$ pointwise.
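The easy inequality (37) of Step one can be checked numerically on small samples. The sketch below takes E as the real line with the basic hemi-metric $$(x-y)_{+}$$, restricts attention to two equally weighted samples of the same size, and uses clipping functions as examples of increasing 1-Lipschitz test functions; these choices, and all function names, are assumptions made only for illustration.

```python
from itertools import permutations


def clip(a, b):
    """Increasing 1-Lipschitz function x -> min(max(x, a), b), with a <= b."""
    return lambda x: min(max(x, a), b)


def dual_value(xs, ys, f):
    """(int f d(Q - P))_+ for the empirical laws Q of xs and P of ys."""
    return max(sum(map(f, xs)) / len(xs) - sum(map(f, ys)) / len(ys), 0.0)


def coupling_costs(xs, ys):
    """Cost E(X - Y)_+ of every permutation coupling of xs with ys
    (brute force, small samples only)."""
    n = len(xs)
    return [sum(max(x - y, 0.0) for x, y in zip(xs, perm)) / n
            for perm in permutations(ys)]
```

For any admissible test function, `dual_value` should not exceed any entry of `coupling_costs`, up to floating-point rounding.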

### Remark 7

The dual formulation of Theorem 1 gives another proof of the second part of Proposition 4, since the set of increasing bounded Lipschitz functions generates the stochastic order (see the argument in Example 8).

### Examples of minimal risk excess measures

The following propositions give explicit representations of the minimal risk excess measure for several hemi-metrics. We first consider the discrete hemi-metric $$d_{+}^{\le }$$:

### Proposition 5

(Minimal risk excess measure arising from the stochastic order)

• Let $$E=\mathbb {R}^{d}$$ be supplied with the (closed) component-wise order ≤. The discrete hemi-metric $$d_{+}^{\le }$$ of (12) generates, via Proposition 3, the compound risk excess measure

$$D_{+}^{c}(X, Y)=\mu(X\nleq Y).$$
(39)

This induces, as minimal extension by mass transportation on $$\mathcal {M}^{1}(\mathbb {R}^{d})$$, the stochastic ordering one-sided risk excess measure of (10):

$$D_{+}^{inf}(Q,P)=D_{+}^{st}(Q,P).$$
(40)
• A dual representation of (40) is given by

$$D_{+}^{inf}(Q,P)=\sup_{f\uparrow, 0\le f\le 1}\left(\int f d(Q-P)\right)_{+}.$$
(41)

### Proof

• Since ≤ is a closed order, $$C:=\{(x,y)\in E\times E, x\nleq y\}$$ is an open set and $$d_{+}^{\le }(x,y)=1_{C}(x,y)$$ is a {0,1}-valued l.s.c. function. By Kellerer (1984) and Rüschendorf (1986) in Lemma 1 (see also Villani (2003) in Theorem 1.27),

$$D_{+}^{inf}(Q,P)=\sup\left\{Q(A)-P\left(A^{C}\right), A\subset E, A \text{ closed}\right\},$$

where $$A^{C}:=\{y\in E, \exists x\in A, (x,y)\notin C\}=\{y\in E, \exists x\in A, x\le y\}=A^{\uparrow}$$. Since $$A\subset A^{\uparrow}$$,

$$\begin{array}{@{}rcl@{}} D_{+}^{inf}(Q,P)&=&\sup\left\{Q(A)-P\left(A^{\uparrow}\right), A\subset E, A \text{ closed}\right\}\\ &=&\sup\left\{(Q(A)-P(A))_{+}, A\in \mathcal{I}(E), A \text{ closed}\right\}=D_{+}^{st}(Q,P). \end{array}$$
• By Kantorovich–Rubinstein Theorem 1,

$$\begin{array}{@{}rcl@{}} D_{+}^{inf}(Q,P)&=&\sup_{f\in Lip^{1}(\mathbb{R}^{d},d_{+})}\left(\int f d(Q-P)\right)_{+}\\ &=&\sup_{f\uparrow, 0\le f\le 1}\left(\int f d(Q-P)\right)_{+}. \end{array}$$
(42)

Note that one can restrict to the set of increasing functions such that 0≤f≤1 by shifting the function by a constant.
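In dimension one, the identity (40) can be illustrated by brute force. The sketch below assumes that on the real line the relevant closed increasing sets are the half-lines $$[t,\infty)$$, so that $$D_{+}^{st}$$ reduces to a supremum over t, and it compares this with the smallest value of $$\mu(X\nleq Y)$$ over permutation couplings of two equally weighted samples of the same size. Function names are illustrative; exact rational arithmetic is used so that the two sides can be compared exactly.

```python
from fractions import Fraction
from itertools import permutations


def d_st_plus(xs, ys):
    """sup_t (Q[t, inf) - P[t, inf))_+ for the empirical laws Q of xs and P of ys.

    The difference is a step function of t that only changes at sample
    points, so the pooled sample points are enough.
    """
    n, m = len(xs), len(ys)
    gaps = [Fraction(sum(x >= t for x in xs), n) - Fraction(sum(y >= t for y in ys), m)
            for t in set(xs) | set(ys)]
    return max(max(gaps), Fraction(0))


def min_prob_exceed(xs, ys):
    """min over permutation couplings of mu(X > Y), samples of equal size."""
    n = len(xs)
    return min(Fraction(sum(x > y for x, y in zip(xs, perm)), n)
               for perm in permutations(ys))
```

The brute-force search grows factorially with the sample size and is meant only as a sanity check on a handful of points.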

Next, we consider, for $$E=\mathbb {R}$$, the basic one-sided hemi-metric $$d_{+}^{b}(x,y)=(x-y)_{+}$$, introduced in (13), describing the magnitude of one-sided departure in a quantitative way. For $$\mathfrak {X}=L^{1}(\mu)$$ the set of random variables on $$(\Omega,\mathcal {A},\mu)$$ with finite first moment, $$d_{+}^{b}$$ induces the compound one-sided risk excess measure

$$D^{c}_{+}(X,Y)={Ed}_{+}^{b}(X,Y)=E(X-Y)_{+}$$
(43)

on $$\mathfrak {X}$$. The corresponding minimal risk excess is given in the following result:

### Proposition 6

(Minimal risk excess arising from mean exceedance)

• The minimal extension of (43) to a risk excess measure on $$\mathcal {M}^{1}(\mathbb {R})$$ by mass transportation is given by

$$\begin{array}{@{}rcl@{}} D_{+}^{inf}(Q,P)&=&\inf_{X\sim Q, Y\sim P}E(X-Y)_{+}\\ &=&\sup_{f\in Lip^{1}, f\uparrow} \left(\int fd(Q-P)\right)_{+}=D_{+}^{Lip^{1,\uparrow}}(Q,P), \end{array}$$

where $$Lip^{1,\uparrow}$$ is the class of increasing, 1-Lipschitz functions (w.r.t. |.|).

The ordering induced by $$D_{+}^{inf}$$ on $$\mathcal {M}^{1}(\mathbb {R})$$ is the stochastic order $$\preceq_{st}$$.

• One has the following explicit representation:

$$D_{+}^{inf}(Q,P)=E\left(F^{-1}(U)-G^{-1}(U)\right)_{+},$$
(44)

where F,G are the distribution functions of Q,P, and $$U\sim U[0,1]$$ is uniformly distributed on [0,1].

### Proof

• With the assumption on $$\mathfrak {X}$$, Kantorovich–Rubinstein Theorem 1 specializes to

$$\begin{array}{@{}rcl@{}} D_{+}^{inf}(Q,P) &=&\sup_{f\in Lip^{1}\left(\mathbb{R}, d_{+}^{b}\right)} \left(\int fd(Q-P)\right)_{+}. \end{array}$$
(45)

Note that $$f\in Lip^{1}\left (\mathbb {R},d_{+}^{b}\right)$$ is equivalent to $$f(y)-f(x)\le (y-x)_{+}$$, i.e., f increasing and 1-Lipschitz w.r.t. the absolute value |.|.

The fact that the order induced by $$D_{+}^{inf}$$ on $$\mathcal {M}^{1}(\mathbb {R})$$ is the stochastic order $$\preceq_{st}$$ follows from Proposition 4. Alternatively, a direct proof is as follows: let n≥1 be a positive integer, $$X\sim Q$$, $$Y\sim P$$. By Markov’s inequality,

$$P(X-Y\ge n^{-1})\le P\left((X-Y)_{+}\ge n^{-1}\right)\le nE[(X-Y)_{+}].$$

Taking the infimum over $$X\sim Q, Y\sim P$$ yields that $$D_{+}^{inf}(Q,P)=0$$ implies that $$X-Y<n^{-1}$$ with probability one. Letting $$n\to\infty$$ yields $$X\le Y$$ a.s. Hence,

$$D_{+}^{inf}(Q,P)=0 \text{ iff there exists }X\sim Q, Y\sim P \text{ s.t.} X\le Y \text{ a.s.}$$

and the latter is equivalent to $$Q\preceq_{st}P$$, by Strassen's theorem.

• $$f(x)=x_{+}$$ is convex, hence $$f(x-y)$$ is submodular (or quasi-antitone in the terminology of Cambanis et al. (1976), or supernegative or 2-negative in the terminology of Tchen (1980)). This implies (44) by results of Cambanis et al. (1976) in Theorem 2, or Tchen (1980) in Corollary 2.3 (see also Rüschendorf (2013)).
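For two equally weighted samples of the same size, the representation (44) amounts to pairing the order statistics. The sketch below compares this quantile coupling with a brute-force minimum over all permutation couplings; it is meant as a sanity check on small samples, and the function names are illustrative only.

```python
from itertools import permutations


def quantile_coupling_cost(xs, ys):
    """Right-hand side of (44) for two equally weighted samples of the same
    size: pair the order statistics and average (x_(i) - y_(i))_+."""
    return sum(max(x - y, 0.0) for x, y in zip(sorted(xs), sorted(ys))) / len(xs)


def brute_force_min_cost(xs, ys):
    """min over all permutation couplings of E(X - Y)_+ (small samples only)."""
    n = len(xs)
    return min(sum(max(x - y, 0.0) for x, y in zip(xs, perm)) / n
               for perm in permutations(ys))
```

Sorting costs O(n log n), whereas the brute-force search is factorial in n, which is the practical benefit of the explicit formula.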

### Remark 8

(Comparison with the stop-loss metric) Note that for $$t\in \mathbb {R}$$, the compound one-sided risk excess measure $$D^{c}_{+}(X,t)=E(X-t)_{+}=\Pi _{X}(t)$$ is the average risk excess over the threshold t, which stands for the stop-loss premium of a reinsurer in insurance theory. Rachev and Rüschendorf (1990) consider the stop loss metric as the difference of two stop loss premiums, which would write with our conventions of notations (see Eq. (2.2) in Rachev and Rüschendorf (1990)) as,

$$D^{s}(X,Y)=\sup_{t\in\mathbb{R}}|\Pi_{X}(t)-\Pi_{Y}(t)|.$$

One could obtain from it the corresponding hemi-metric which was introduced in (30), in relation to the increasing convex order,

$$D_{+}^{s}(X,Y)=\sup_{t\in\mathbb{R}}(\Pi_{X}(t)-\Pi_{Y}(t))_{+},$$

which is distinct from the minimal risk excess $$D_{+}^{inf}$$. This follows from the triangle inequality for $$(X-t)_{+}$$:

$$(X-t)_{+}-(Y-t)_{+}\le (X-Y)_{+}$$

and taking the infimum yields that

$$D_{+}^{s}(X,Y)\le D_{+}^{inf}(Q,P).$$

In other words, the hemi-metric obtained by a one-sided comparison of risks through their stop-loss premiums is always majorized by the minimal risk excess. See also Remark 9 for similar considerations for the tail risk.
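The inequality between the stop-loss hemi-metric and the minimal risk excess, and the fact that it can be strict, can be seen on a small example. The sketch below assumes integer-valued, equally weighted samples of the same size, so that exact rational arithmetic applies, and evaluates the minimal risk excess through the pairing of order statistics of (44); the function names are hypothetical.

```python
from fractions import Fraction


def stop_loss_excess(xs, ys):
    """sup_t (Pi_X(t) - Pi_Y(t))_+ for empirical laws, evaluated at the
    pooled sample points (the difference is piecewise linear in t)."""
    def pi(sample, t):
        return Fraction(sum(max(v - t, 0) for v in sample), len(sample))
    gaps = [pi(xs, t) - pi(ys, t) for t in set(xs) | set(ys)]
    return max(gaps + [Fraction(0)])


def minimal_excess(xs, ys):
    """Minimal risk excess for E(X - Y)_+ via the pairing of order statistics."""
    pairs = zip(sorted(xs), sorted(ys))
    return Fraction(sum(max(x - y, 0) for x, y in pairs), len(xs))
```

When the two quantile functions cross, as for the samples `[1, 2, 5]` and `[0, 3, 4]`, the stop-loss comparison is strictly smaller than the minimal risk excess.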

In risk theory, it is also of interest to compare the expected risks above their distributional α-quantiles: this is the basis for the conditional tail expectation

$${CTE}_{\alpha}(X):=E[X|X\ge q_{\alpha}(X)],\quad {CTE}_{\alpha}(Y):=E[Y|Y\ge q_{\alpha}(Y)],$$

where $$q_{\alpha}(X)$$, $$q_{\alpha}(Y)$$ denote the corresponding α-quantiles of $$X\sim Q$$ with c.d.f. F and $$Y\sim P$$ with c.d.f. G. In order to obtain a coherent risk measure and to generalize to possibly non-continuous distributions (see Burgert and Rüschendorf (2006)), it is useful to instead consider the expected shortfall. Define, for $$\lambda\in[0,1]$$, the extended c.d.f.s of F, G as

$$\begin{array}{@{}rcl@{}} F(x,\lambda)&:=&P(X<x)+\lambda P(X=x)=F(x-)+\lambda(F(x)-F(x-))\\ G(y,\lambda)&:=&P(Y<y)+\lambda P(Y=y)=G(y-)+\lambda(G(y)-G(y-)). \end{array}$$

Define also the distributional transforms of X and Y as

$$U_{1}:=F(X,V), \quad U_{2}:=G(Y,V),$$
(46)

where $$V\sim U(0,1)$$ is independent of (X,Y), see Rüschendorf (2009). The expected shortfalls are then defined as $$ES_{\alpha}(X):=E[X|U_{1}\ge\alpha]$$, respectively as $$ES_{\alpha}(Y):=E[Y|U_{2}\ge\alpha]$$.
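The extended c.d.f. is straightforward to evaluate for an empirical law with ties. The sketch below assumes an equally weighted sample; `extended_cdf` is an illustrative name, and the distributional transform (46) is then obtained by plugging in an observation together with an independent uniform draw.

```python
def extended_cdf(sample, x, lam):
    """F(x, lam) = P(X < x) + lam * P(X = x) for the empirical law of `sample`."""
    n = len(sample)
    below = sum(v < x for v in sample) / n  # mass strictly below x
    at = sum(v == x for v in sample) / n    # size of the atom at x
    return below + lam * at
```

For an observation `x` and an independent draw `v` from the uniform distribution on (0,1), `extended_cdf(sample, x, v)` spreads the atom at `x` uniformly over the interval from F(x-) to F(x), which is how ties are handled in (46).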

For the one-sided comparison of the risk excess of X w.r.t. Y over their α-quantiles, we therefore consider the excess risk of their expected shortfall defined by the following one-sided compound risk excess measure $$D_{+}^{\alpha,c} (X,Y)$$

$$D_{+}^{\alpha,c} (X,Y)=E\left(X1_{U_{1}\ge \alpha}-Y1_{U_{2}\ge \alpha}\right)_{+},$$
(47)

where U1, U2 are as in (46). We obtain the following result:

### Proposition 7

(Minimal tail risk excess)

• The minimal extension of (47) to a risk excess measure on $$\mathcal {M}^{1}(\mathbb {R})$$ by mass transportation has the representation

$$\begin{array}{@{}rcl@{}} D_{+}^{\alpha,inf}(Q,P)&:=&\inf_{X\sim Q, Y\sim P} D_{+}^{\alpha,c}(X,Y)\\ &=&E\left[\left(F^{-1}(U)-G^{-1}(U)\right)_{+}1_{U\ge \alpha}\right], \end{array}$$
(48)

where $$U\sim U[0,1]$$ is uniformly distributed on [0,1].

• The ordering $$\preceq_{\alpha}$$ induced by $$D_{+}^{\alpha,inf}$$ is given by

$$Q\preceq_{\alpha} P \Leftrightarrow F^{-1}(u)\le G^{-1}(u) \quad \forall u\ge \alpha,$$

which corresponds to the classical stochastic order restricted to the upper tail.
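The right-hand side of (48) can be evaluated exactly for two equally weighted samples of the same size n, since both empirical quantile functions are constant on the intervals ((i-1)/n, i/n]. The sketch below weights each pair of order statistics by the length of the part of its interval lying above α; the function name is illustrative only.

```python
def tail_minimal_excess(xs, ys, alpha):
    """E[(F^{-1}(U) - G^{-1}(U))_+ 1_{U >= alpha}] for the empirical laws of
    two equally weighted samples of the same size."""
    n = len(xs)
    total = 0.0
    for i, (x, y) in enumerate(zip(sorted(xs), sorted(ys)), start=1):
        # length of ((i-1)/n, i/n] intersected with [alpha, 1]
        weight = max(0.0, i / n - max(alpha, (i - 1) / n))
        total += weight * max(x - y, 0.0)
    return total
```

With `alpha = 0` this reduces to the pairing of order statistics in (44), and increasing `alpha` discards the contribution of the lower quantiles.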

### Proof

• Denote by $$F_{\alpha}$$ the law of $$X_{\alpha }:=X1_{U_{1}\ge \alpha }=X1_{F(X,V)\ge \alpha }$$ and by $$G_{\alpha}$$ the law of $$Y_{\alpha }:=Y1_{U_{2}\ge \alpha }=Y1_{G(Y,V)\ge \alpha }$$. Then,

$$D_{+}^{\alpha,inf}(Q,P)=\inf_{X_{\alpha}\sim F_{\alpha}, Y_{\alpha}\sim G_{\alpha}} E(X_{\alpha}-Y_{\alpha})_{+}.$$

Since $$X_{\alpha }=F^{-1}(U_{1})1_{U_{1}\ge \alpha }$$ with $$U_{1}\sim U[0,1]$$, $$F_{\alpha}$$ is the image of the Lebesgue measure on [0,1] induced by the transformation $$u\mapsto F^{-1}(u)1_{u\ge\alpha}$$. Similarly, $$G_{\alpha}$$ is the image of the Lebesgue measure on [0,1] induced by the transformation $$u\mapsto G^{-1}(u)1_{u\ge\alpha}$$. Therefore, for $$U\sim U(0,1)$$, the comonotone pair of random variables $$\tilde X_{\alpha }=F^{-1}(U)1_{U\ge \alpha }$$ and $$\tilde Y_{\alpha }=G^{-1}(U)1_{U\ge \alpha }$$ is admissible for $$(F_{\alpha},G_{\alpha})$$.

By submodularity, as in Proposition 6,

$$E(X_{\alpha}-Y_{\alpha})_{+}\ge E\left[\left(F^{-1}(U)-G^{-1}(U)\right)_{+}1_{U\ge \alpha}\right],$$

which implies the result.

• Follows from (48).

### Remark 9

It is interesting to note that the expected shortfall of X is given by

$${ES}_{\alpha}(X)=\frac{1}{1-\alpha}E\left[F^{-1}(U)1_{U\ge \alpha}\right].$$

As expected, the minimal extension risk excess measure dominates the normalized one-sided difference of expected shortfalls:

$$D_{+}^{\alpha,inf}(Q,P)\ge (1-\alpha)\left({ES}_{\alpha}(X)-{ES}_{\alpha}(Y)\right)_{+},$$

where $$Y\sim P$$, $$X\sim Q$$.
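The domination in Remark 9 can be checked on small samples. The sketch below works with empirical quantile functions of two equally weighted samples of the same size and computes both the right-hand side of (48) and the quantity $$E[F^{-1}(U)1_{U\ge\alpha}]$$, which equals $$(1-\alpha)ES_{\alpha}$$ by the formula above; all function names are illustrative.

```python
def tail_weights(n, alpha):
    """Length of ((i-1)/n, i/n] intersected with [alpha, 1], for i = 1..n."""
    return [max(0.0, i / n - max(alpha, (i - 1) / n)) for i in range(1, n + 1)]


def tail_mean(xs, alpha):
    """E[F^{-1}(U) 1_{U >= alpha}] = (1 - alpha) * ES_alpha for the empirical law."""
    return sum(w * x for w, x in zip(tail_weights(len(xs), alpha), sorted(xs)))


def tail_minimal_excess(xs, ys, alpha):
    """Right-hand side of (48) for two equally weighted samples of equal size."""
    w = tail_weights(len(xs), alpha)
    return sum(wi * max(x - y, 0.0) for wi, x, y in zip(w, sorted(xs), sorted(ys)))
```

When the upper quantiles of the two samples cross, the one-sided difference of expected shortfalls can vanish while the minimal tail risk excess stays positive.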

## Weak risk excess measures

### Motivation and definition

In view of the mass transportation approach of (32), one may inquire whether there exist other schemes of obtaining a risk excess measure D+(Q,P), in the sense of Definition 3, from a compound risk excess measure $$D^{c}_{+}(X,Y)$$, in the sense of Definition 6. In particular, it is natural to investigate the following “maximal extension” in the sense of mass transportation,

$$D_{+}^{sup}(Q,P):=\sup_{X,Y\in\mathfrak{X}, X\sim Q, Y\sim P}D^{c}_{+}(X,Y).$$
(49)

Obviously, $$D_{+}^{inf}(Q,P)\le D_{+}^{sup}(Q,P)$$.

However, $$D_{+}^{sup}$$ is not a risk excess measure: although (A1) and (A3) are obviously satisfied, (A2) is not. Indeed,

$$D^{sup}_{+}(Q,Q)=0 \Leftrightarrow X\sim Q, Y\sim Q \text{ implies } D_{+}^{c}(X,Y)=0.$$

This implies that $$X\le Y$$ a.s. for all possible realizations $$X\sim Q, Y\sim Q$$. But for X,Y independent with the same law Q, this would require that $$X\le Y$$ a.s., which is only true for Q being a one-point distribution. These considerations imply that $$D_{+}^{sup}$$ cannot be compatible with a reflexive order relation: axiom (A4) cannot be satisfied either.
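The failure of (A2) is already visible on a two-point distribution. The sketch below takes the basic hemi-metric $$(x-y)_{+}$$ on the real line and maximizes only over permutation couplings of two equally weighted samples; since these form a subset of all couplings, the value is a lower bound for $$D_{+}^{sup}$$, which is enough to show positivity. The function name is illustrative.

```python
from itertools import permutations


def max_excess_over_permutations(xs, ys):
    """max over permutation couplings of E(X - Y)_+ (small samples only);
    a lower bound for the supremum over all couplings."""
    n = len(xs)
    return max(sum(max(x - y, 0.0) for x, y in zip(xs, perm)) / n
               for perm in permutations(ys))
```

For Q uniform on {0, 1}, pairing 1 with 0 and 0 with 1 already gives a strictly positive value, whereas a one-point distribution gives zero.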

Nonetheless, $$D_{+}^{sup}$$, as a supremum over all joint constructions of $$(X,Y)$$ with marginals $$(Q,P)$$, gives the best possible upper bound on the compound risk excess measure in the sense of mass transportation,

$$D_{+}^{c}(X,Y)\le D_{+}^{sup}(Q,P),$$

and therefore has a natural interpretation as a worst-case comparison, which is appealing for risk applications.

These considerations motivate the introduction of a weakened notion of risk excess measure, without axiom (A2) and with axiom (A4) restricted to a strict order $$\prec$$, i.e., a transitive and irreflexive relation. Therefore, we propose the following definitions:

### Definition 8

(Weak risk excess measure) Let $$\prec$$ be a strict order on $$\mathcal {M}^{1}(E)$$. A one-sided weak risk excess measure $$D_{+}^{w}$$ on $$\left (\mathcal {M}^{1}(E),\prec \right)$$ is a mapping $$D_{+}^{w}:\mathcal {M}^{1}(E)\times \mathcal {M}^{1}(E)\to \overline {\mathbb {R}}$$ which satisfies axioms (A1), (A3), and (A4).

### Definition 9

(Maximal extension) Let $$D^{c}_{+}$$ be a compound excess risk measure. The maximal extension $$D^{sup}_{+}$$ on $$\mathcal {M}^{1}(E)$$ of $$D_{+}^{c}$$ by mass transportation is given by (49).

### Remark 10

• The concept of one-sided weak risk excess measure is an asymmetric analog of the concept of moment function in the theory of probability metrics, see Rachev (1991) in Chap. 3.3, or Rachev et al. (2013) in Chapters 3.4 and 8.2. In addition, the inclusion of axiom (A4) makes it compatible with a notion of order. Obviously, a one-sided risk excess measure for a preorder $$\preceq$$ is a one-sided weak risk excess measure for the strict order $$\prec$$ defined by

$$P\prec Q \Leftrightarrow P\preceq Q \quad \text{and}\quad P\neq Q.$$
• The relation between the minimal $$D_{+}^{inf}$$ and maximal $$D_{+}^{sup}$$ extensions obtained from a compound risk excess measure $$D_{+}^{c}$$, is given in the following improved triangle inequality:

$$D_{+}^{sup}(Q,R)\le D_{+}^{inf}(Q,P)+D_{+}^{sup}(P,R),$$

where P,Q,R are three probability measures on E, see Rachev et al. (2013) in Theorem 3.4.1.
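The improved triangle inequality can be checked by brute force in a simple setting. The sketch below takes the basic hemi-metric $$(x-y)_{+}$$ on the real line and three equally weighted samples of the same size, and restricts both extensions to permutation couplings; this restriction and the function names are assumptions made for illustration.

```python
from itertools import permutations


def coupling_costs(xs, ys):
    """E(X - Y)_+ for every permutation coupling of xs with ys (small samples only)."""
    n = len(xs)
    return [sum(max(x - y, 0.0) for x, y in zip(xs, perm)) / n
            for perm in permutations(ys)]


def d_inf(xs, ys):
    """Smallest coupling cost: stands in for the minimal extension."""
    return min(coupling_costs(xs, ys))


def d_sup(xs, ys):
    """Largest coupling cost: stands in for the maximal extension."""
    return max(coupling_costs(xs, ys))
```

For the samples `[1, 4]`, `[2, 3]`, and `[0, 5]` in the roles of Q, P, and R, the two sides of the inequality coincide, so the bound cannot be improved in general within this setting.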

Define on $$\mathcal {M}^{1}(E)$$ the following strict order $$\prec_{sup}$$ by

$$Q\prec_{sup} P \Leftrightarrow \sup (supp(Q)) \le \inf(supp(P)),$$
(50)

where supp(.) denotes the support of a distribution. The analog of Proposition 4 for the maximal extension, which shows that $$D_{+}^{sup}$$ is indeed a one-sided weak risk excess measure, is given in the following proposition:

### Proposition 8

$$D_{+}^{sup}$$ obtained in (49) from a compound excess risk measure $$D^{c}_{+}(X,Y)={Ed}_{+}(X,Y)$$ of the form (31) is a one-sided weak risk excess measure on $$(\mathcal {M}^{1}(E),\prec _{sup})$$.

### Proof

(A1) and (A3) are trivially satisfied. For (A4), if $$D_{+}^{sup}(Q,P)=0$$, then for all $$X\sim Q,Y\sim P$$, $$E d_{+}(X,Y)=0$$. Markov’s inequality entails that for all $$\epsilon>0$$, $$d_{+}(X,Y)\le\epsilon$$ a.s. Hence, $$d_{+}(X,Y)=0$$ a.s., i.e., $$X\le Y$$ a.s. for all $$X\sim Q,Y\sim P$$. This can only hold if the support of Q is completely to the left of the support of P. The converse direction is trivial: if $$Q\prec_{sup}P$$, then for all couplings $$X\sim Q$$, $$Y\sim P$$, $$X\le Y$$ a.s., and thus $$\sup_{X\sim Q,Y\sim P}E d_{+}(X,Y)=0$$. □

### Dual representation of maximal one-sided weak risk excess measure

A dual representation of the maximal one-sided weak risk excess measure $$D_{+}^{sup}$$ associated with the compound risk excess measure $$D^{c}_{+}(X,Y)={Ed}_{+}(X,Y)$$ of the form in (31) is given in the following theorem:

### Theorem 2

(Dual Representation) Let E be a Polish space, supplied with the one-sided hemi-metric d+, and let $$D_{+}^{c}(X,Y)=E d_{+}(X,Y)$$ be the corresponding compound excess risk measure,

• if d+ is upper or lower semi-continuous, then duality holds:

$$D_{+}^{sup}(Q,P)=\inf_{(f,g)\in\Psi_{d_{+}}}\left\{ \int f dQ+\int gdP \right\},$$

where

$$\begin{array}{@{}rcl@{}} \Psi_{d_{+}}:=&\{&(f,g)\in Lip^{1}(d_{+})\times Lip^{1}(d_{-}), f(x)\ge 0, g(y)\ge 0,\\&& f(x)+g(y)\ge d_{+}(x,y), (x,y)\in E^{2} \}. \end{array}$$
• if d+ is upper semi-continuous, then the supremum is attained for some probability measure.

### Proof

• Since a lower or upper semi-continuous function is a supremum or infimum of continuous functions, d+ is a Baire function. Hence, the duality Theorem 2.3.8 (a) in Rachev and Rüschendorf (1998) applies, since d+≥0 is obviously majorized from below (i.e., belongs to $$\mathcal P_{m}(S)$$ in the notation of Theorem 2.3.8 in Rachev and Rüschendorf (1998)). Therefore, Theorem 2.3.8 (a) entails

$$\sup\left\{\int d_{+}(x,y)\mu(dx,dy)\right\}=\inf\{\int fdQ+\int gdP \},$$
(51)

where the infimum on the right side is taken in

$$\Psi_{1}:=\{f\in L_{1}(Q), g\in L_{1}(P), d_{+}(x,y)\le f(x)+g(y), (x,y)\in E^{2}\}.$$

Let γ1,γ2 be two real-valued constants s.t. γ1+γ2=0 and set, for $$(f,g)\in \Psi_{1}$$, $$(\tilde f:=f-\gamma _{1},\tilde g:=g-\gamma _{2})$$. Then, $$(\tilde f, \tilde g)\in \Psi _{1}$$ and $$J(f,g)=\int fdQ+\int gdP$$ remains invariant when one replaces (f,g) by $$(\tilde f, \tilde g)$$, i.e., $$J(f,g)=J(\tilde f,\tilde g)$$. Therefore, if f takes some negative values, then setting $$\gamma_{1}=\inf_{x} f(x)$$ entails $$\tilde f\ge 0$$ and the infimum in (51) can be restricted to

$${} \Psi_{2}:=\{f\in L_{1}(Q), g\in L_{1}(P), f(x)\ge 0, d_{+}(x,y)\le f(x)+g(y), (x,y)\in E^{2}\}.$$

By symmetry, the infimum in (51) can further be restricted to

$${}\Psi_{3}:=\{f\in L_{1}(Q), g\in L_{1}(P), f(x)\ge 0, g(y)\ge 0, d_{+}(x,y)\le f(x)+g(y), (x,y)\in E^{2}\}.$$

Assume d+ is upper bounded. For $$(f,g)\in \Psi_{3}$$, set $$f^{*}(y):=\sup_{x}(d_{+}(x,y)-f(x))$$ and $$f_{**}(x):=\sup_{y}(d_{+}(x,y)-f^{*}(y))$$. Then, $$(f_{**},f^{*})\in \Psi_{1}$$, $$g\ge f^{*}$$, $$f\ge f_{**}$$. Hence, $$J(f,g)\ge J(f_{**},f^{*})$$. Moreover, by the triangle inequality,

$$\begin{array}{@{}rcl@{}} d_{+}(x,y)-f^{*}(y)&\le& d_{+}(x,x')+d_{+}(x',y)-f^{*}(y) \end{array}$$

and taking the supremum in y yields

$$\begin{array}{@{}rcl@{}} f_{**}(x)-f_{**}(x')&\le& d_{+}(x,x'). \end{array}$$

Hence, $$f_{**}\in Lip^{1}(d_{+})$$, whereas a similar calculation shows that $$f^{*}\in Lip^{1}(d_{-})$$. Therefore, the infimum in (51) can further be restricted to $$\Psi _{d_{+}}$$, as claimed.

The general case, for d+ unbounded, proceeds by approximation, as in Theorem 1.

• Follows from Theorem 2.3.10 in Rachev and Rüschendorf (1998).

### Examples of maximal extensions

We discuss for some of the examples in Section 4 the corresponding worst-case risk excess $$D_{+}^{sup}$$. First, we consider the discrete one-sided hemi-metric $$d_{+}^{\le }$$ of (12) on $$E=\mathbb {R}^{d}$$, supplied with the product order ≤. The associated compound risk excess measure is given by (39):

$$D_{+}^{c}(X, Y)=\mu(X\nleq Y),$$

for $$X\sim Q$$, $$Y\sim P$$, and its minimal extension (41) coincides with the induced risk excess measure $$D_{+}^{st}$$ (see (10)) compatible with the stochastic order. The maximal extension is given in the following proposition:

### Proposition 9

(Maximal risk excess for stochastic ordering)

• Let $$D_{+}^{\le,sup}$$ be the one-sided weak risk excess measure on $$(\mathcal {M}^{1}(\mathbb {R}),\prec _{sup})$$ obtained by maximal extension of the discrete compound risk measure $$D_{+}^{c}$$ in (39). $$D_{+}^{\le,sup}$$ has the representation:

$$D_{+}^{\le,sup}(Q,P)=1-\sup_{x\in\mathbb{R}^{d}}(F(x)-G(x)),$$
(52)

where F,G are the c.d.f.s of Q,P, respectively.

• The restriction of $$D_{+}^{\le,\sup }$$ on E, obtained by setting $$d^<_{+}(x,y):= D_{+}^{\le,\sup }(\delta _{x},\delta _{y})$$, defines a weak one-sided hemi-metric compatible with the strict order <, i.e.,

$$d^<_{+}(x,y)=1_{x\ge y},$$

with $$d^<_{+}$$ satisfying axioms (A1), (A3), and (A4) for the strict order < associated with ≤.

### Proof

• Note that by Strassen's theorem (see, e.g., Rachev and Rüschendorf (1998), Theorems 3.5.1 and 3.5.5, or Rüschendorf (1991), Theorems 4 and 5),

$$\begin{array}{@{}rcl@{}} D_{+}^{\le,sup}(Q,P)&=&\sup_{X\sim Q, Y\sim P}\mu(X\nleq Y)=1-\inf_{X\sim Q, Y\sim P} \mu(X\le Y)\\ &=&1-\sup(Q(B_{1})+P(B_{2})-1), \end{array}$$

where the supremum is over all pairs of subsets $$B_{1},B_{2}\subset E$$ s.t. $$B_{1}\times B_{2}\subset B:=\{(x,y);x\le y\}$$. But for $$B_{1}\times B_{2}\subset B$$, it follows that $$B_{1}^{\downarrow }\times B_{2}^{\uparrow }\subset B$$, where $$B_{1}^{\downarrow }=\{x\in \mathbb {R}^{d}:\exists \bar {x}\in B_{1} \text { s.t. } x\le \bar {x}\}$$ and $$B_{2}^{\uparrow }=\{y\in \mathbb {R}^{d}:\exists \bar {y}\in B_{2} \text { s.t. } y\ge \bar {y}\}$$ are the decreasing, resp. increasing, completions of $$B_{1},B_{2}$$. Then, it is easy to see that one can enlarge $$B_{1}^{\downarrow }, B_{2}^{\uparrow }$$ to intervals of the form $$(-\infty,x]$$, $$[x,\infty)$$. As a result, the maximal extension is given by

$$\begin{array}{@{}rcl@{}} D_{+}^{\le,sup}(Q,P)&=&2-\sup_{x\in\mathbb{R}^{d}}\{F(x)+\overline G(x)\}\\ &=&1-\sup_{x\in\mathbb{R}^{d}}\{F(x)-G(x)\}, \end{array}$$

where $$\overline {G}(x)=P([x,\infty))$$.

• Formula (52) yields

$$D_{+}^{\le,sup}(\delta_{x},\delta_{y})=1-\sup_{z\in\mathbb{R}^{d}}\{1_{z\ge x}-1_{z\ge y}\}=1_{x\ge y}.$$
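Formula (52) is straightforward to evaluate once the two c.d.f.s are available. The following is a minimal sketch for d=1, assuming two normal distributions Q=N(0,1) and P=N(1,1) (a hypothetical choice with continuous c.d.f.s) and approximating the supremum over x by a maximum over a fine grid. The helper names `normal_cdf` and `max_discrete_excess` are introduced here for illustration.

```python
import math

def normal_cdf(x, mean=0.0, sd=1.0):
    """C.d.f. of the normal distribution N(mean, sd^2)."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def max_discrete_excess(cdf_q, cdf_p, grid):
    """Evaluate 1 - sup_x (F(x) - G(x)) as in (52), with the sup taken over a finite grid."""
    return 1.0 - max(cdf_q(x) - cdf_p(x) for x in grid)

# Grid on [-10, 10] with step 0.001 (an assumption; wide enough for these two laws).
grid = [-10.0 + 0.001 * k for k in range(20001)]

F = lambda x: normal_cdf(x, 0.0, 1.0)  # c.d.f. of Q = N(0, 1)
G = lambda x: normal_cdf(x, 1.0, 1.0)  # c.d.f. of P = N(1, 1)

value = max_discrete_excess(F, G, grid)
print(value)
```

For this pair, F(x)-G(x) is maximal at x=1/2, so the grid value can be compared with the closed form $$2-2\Phi(1/2)$$, where Φ denotes the standard normal c.d.f.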

### Remark 11

Comparing this result with those of Proposition 2 and Example 7, one sees that the discrete one-sided hemi-metric $$d_{+}^{\le }(x,y)=1_{y\nleq x}$$ and the corresponding compound risk excess measure have many extensions on $$\mathcal {M}^{1}(\mathbb {R}^{d})$$ and, in particular, we obtain

$$D_{+}^{uo}\le D_{+}^{st}\le D_{+}^{\le,sup}.$$

[Diagram: the different embeddings of structures, through their hemi-metrics.]

Next, we investigate the maximal one-sided weak risk excess extension for the basic hemi-metric (13): on $$E=\mathbb {R}$$, for $$X\sim F$$, $$Y\sim G$$, let $$D_{+}^{c}(X,Y)=E(X-Y)_{+}$$ be the average risk excess as in (43). The maximal risk excess extension by mass transportation is given by the following proposition.

### Proposition 10

(Risk excess from exceedance in average) Let $$D_{+}^{b,sup}(Q,P)$$ be the maximal one-sided weak risk excess extension, obtained by mass transportation of the compound risk excess measure $$D_{+}^{c}(X,Y)=E(X-Y)_{+}$$. One has the representation

$$D_{+}^{b,sup}(Q,P)=E\left[\left(F^{-1}(U)-G^{-1}(1-U)\right)_{+} \right],$$
(53)

where F,G are the c.d.f.s of Q,P, respectively, and U is uniformly distributed on (0,1).

### Proof

The argument for the maximal risk excess extension is similar to that of the minimal risk excess extension. □
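A plug-in version of (53) can be sketched as follows, assuming two samples of equal size with equally weighted atoms. With empirical c.d.f.s, $$F^{-1}(U)$$ and $$G^{-1}(1-U)$$ pair the k-th smallest observation of the first sample with the k-th largest observation of the second one (the counter-monotonic coupling). The function name `max_average_excess` and the sample values are hypothetical.

```python
def max_average_excess(xs, ys):
    """Plug-in version of (53) for two equal-size samples.

    Pairs the k-th smallest x with the k-th largest y (counter-monotonic
    coupling) and averages the positive parts (x - y)_+.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    xs_up = sorted(xs)
    ys_down = sorted(ys, reverse=True)
    return sum(max(x - y, 0.0) for x, y in zip(xs_up, ys_down)) / len(xs)

# Hypothetical samples standing in for Q and P.
q_sample = [0.5, 2.0, 3.5, 4.0]
p_sample = [1.0, 1.5, 3.0, 5.0]

print(max_average_excess(q_sample, p_sample))
```

Sorting costs O(n log n), whereas a search over all pairings of the atoms would grow factorially with the sample size.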

In the previous propositions, the order induced by the maximal extension is very strong. For insurance applications, in particular for comparing tail risk, it is of interest to restrict the comparisons to the upper tails of the distributions, see Proposition 7 in Section 4. Finally, we give the result for the tail excess compound risk measure $$D^{c,\alpha }_{+}(X,Y)$$ in (47), which induces a more interesting order:

### Proposition 11

(Tail risk excess)

• Let 0<α<1, then the maximal extension $$D_{+}^{\alpha,sup}$$ is given by

$$D_{+}^{\alpha,sup}(Q,P)=(1-\alpha)D^{sup}_{+}(Q^{\alpha},P^{\alpha}),$$
(54)

where $$Q^{\alpha},P^{\alpha}$$ are the conditional distributions of Q,P on their upper α-quantile intervals $$[q_{\alpha}(Q),\infty)$$, $$[q_{\alpha}(P),\infty)$$.

• Correspondingly, a suitable consistent ordering $$\prec_{\alpha}$$ on $$\mathcal {M}^{1}(\mathbb {R})$$ is given by

$$Q\prec_{\alpha} P \Leftrightarrow G^{-1}(u)\le F^{-1}(1-u+\alpha), \text{ for all } \alpha\le u\le 1,$$

where F,G are the c.d.f.s of Q,P. For the maximal extension, the random variables are chosen counter-monotonic in the upper part of the distribution.

### Proof

Similar to the proof of Proposition 10. □

## Extensions with dependence constraints

### Setup

In Sections 4 and 5, we considered risk excess measures D+(Q,P) defined as minimal and maximal extensions, obtained by mass transportation, of a compound risk excess measure, i.e., over the class of all dependence structures of (Q,P). In this section, we consider a relevant modification of this method by restricting the class of possible dependence structures. This setup allows one to take into account known side information on the dependence structure of (Q,P), like various bounds on positive or negative dependence, see e.g., Rüschendorf (2013), Chapter 5.

We consider the setup $$E=\mathbb {R}$$ with hemi-metric d+ and the compound excess risk measure $$D_{+}^{c}(X,Y)=E d_{+}(X,Y)$$ of the kind (6), where $$X,Y\in \mathfrak {X}$$ have marginals Q,P. If $$C=C_{X,Y}$$ is a copula of (X,Y), we also write $$E_{C} d_{+}(X,Y)$$ to stress the dependence on C, and we denote by $$\mathcal {C}$$ the set of all bivariate copula functions. Let $$\mathcal {D}\subset \mathcal {C}$$ denote a subclass of copulas which describes the information on the dependence structure. Then, it is natural to consider the worst and best-case extensions of $$D_{+}^{c}$$ over $$\mathcal {D}$$.

### Definition 10

(Minimal and maximal extension with dependence restriction) For a subclass $$\mathcal {D}\subset \mathcal {C}$$:

• The minimal extension with dependence restriction $$\mathcal {D}$$ of $$D_{+}^{c}$$ is defined as

$$D_{+}^{\mathcal{D},inf}(Q,P):=\inf \{E_{C} d_{+}(X,Y), X\sim Q, Y\sim P, C\in\mathcal{D} \}.$$
(55)
• Similarly, the maximal extension with dependence restriction $$\mathcal {D}$$ is defined as

$$D_{+}^{\mathcal{D},sup}(Q,P):=\sup \{E_{C} d_{+}(X,Y),X\sim Q, Y\sim P, C\in\mathcal{D} \}.$$
(56)

In the case without dependence restriction, i.e., when $$\mathcal {D}=\mathcal {C}$$, we get the minimal and maximal extensions $$D_{+}^{inf}$$, $$D_{+}^{sup}$$ of (32) and (49) considered in Sections 4 and 5.

### Remark 12

By the previous discussion of Section 4 (see Proposition 4), it is clear that $$D_{+}^{\mathcal {D},inf}$$ is a risk excess measure on $$\left (\mathcal {M}^{1}(E),\preceq _{st}\right)$$ only in case that $$\mathcal {D}$$ contains the upper Fréchet bound M, defined by M(u,v)= min(u,v),0≤u,v≤1. So typically the restricted extensions will not satisfy the properties (A2) and (A4) of a one-sided risk excess measure on $$\left (\mathcal {M}^{1}(E),\preceq _{st}\right)$$.

Despite that, the extensions (55) and (56) have a natural motivation as best, resp., worst-case excess risk taking into account the dependence restrictions. On the level of random variables, the class of pairs (X,Y) with $$C_{XY}\in \mathcal {D}$$ and $$X\le Y$$ may be empty even if $$Q\preceq_{st} P$$. Therefore, the unrestricted extensions $$D_{+}^{inf}$$, resp., $$D_{+}^{sup}$$, would underestimate, resp., overestimate, the real risk excess. This is a strong indication for the relevance of the notion of minimal, resp., maximal risk excess with dependence restriction $$\mathcal {D}$$.

### Explicit results for extensions with positive and negative dependence restriction

We now consider two particular classes of dependence restrictions $$\mathcal {D}$$ which allow determination of the minimal, resp., maximal, extensions in explicit form. Denote for copulas $$C_{0}, C_{1}\in \mathcal {C}$$ by

$$\mathcal{D}_{\le}(C_{0}):=\{C\in\mathcal{C}; C\le C_{0}\}$$
(57)

and by

$$\mathcal{D}_{\ge}(C_{1}):=\{C\in\mathcal{C}; C\ge C_{1}\}$$
(58)

the class of all copulas which are smaller than C0, resp., bigger than C1, in the lower orthant ordering $$\le_{lo}$$ (equivalently in the upper orthant ordering $$\le_{uo}$$). (57) describes a negative dependence restriction, (58) a positive dependence restriction: for the case C0=C1=Π, the independence copula Π(u,v)=uv, 0≤u,v≤1, these restrictions correspond to negatively quadrant dependent (NQD), resp., positively quadrant dependent (PQD), random variables, as defined by Lehmann (1966), see Nelsen (2006), p. 186.

Then, for $$d_{+}(x,y)=(x-y)_{+}$$, we obtain the following explicit result.

### Proposition 12

(Minimal and maximal risk excess with positive/negative dependence restriction)

• For $$\mathcal {D}=\mathcal {D}_{\le }(C_{0})$$, we obtain the explicit formula for the minimal risk excess extension

$$D_{+}^{\mathcal{D},inf}(Q,P)=E_{C_{0}}\left(X^{0}-Y^{0}\right)_{+},$$
(59)

where $$X^{0}\sim Q$$, $$Y^{0}\sim P$$ and $$C_{X^{0},Y^{0}}=C_{0}$$.

• For $$\mathcal {D}=\mathcal {D}_{\ge }(C_{1})$$, we obtain the explicit formula for the maximal risk excess extension

$$D_{+}^{\mathcal{D},sup}(Q,P)=E_{C_{1}}\left(X^{1}-Y^{1}\right)_{+},$$
(60)

where $$X^{1}\sim Q$$, $$Y^{1}\sim P$$ and $$C_{X^{1},Y^{1}}=C_{1}$$.

### Proof

• For (X,Y) with $$X\sim Q$$, $$Y\sim P$$ and $$C_{X,Y}=C\le C_{0}$$, it follows from the submodularity argument, as in the proof of Proposition 6, that

$$E(X-Y)_{+}\ge E(X^{0}-Y^{0})_{+},$$

since $$f(x,y)=(x-y)_{+}$$ is submodular and $$(X,Y)\le_{sm}(X^{0},Y^{0})$$, with $$\le_{sm}$$ the supermodular ordering. Taking the infimum yields the result.

• The argument is similar.
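To illustrate Proposition 12 for C0=C1=Π, the following minimal sketch compares the average excess $$E(X-Y)_{+}$$ under the comonotonic, independent, and counter-monotonic couplings of two equal-size, equally weighted samples. Under the NQD restriction, the minimal extension (59) is the independence value; under the PQD restriction, the maximal extension (60) is the independence value. The function names and sample values are hypothetical.

```python
def excess_independent(xs, ys):
    """E_Pi (X - Y)_+ when X and Y are drawn independently from the two samples."""
    total = sum(max(x - y, 0.0) for x in xs for y in ys)
    return total / (len(xs) * len(ys))

def excess_comonotone(xs, ys):
    """E (X - Y)_+ under the comonotonic coupling (k-th smallest with k-th smallest)."""
    pairs = zip(sorted(xs), sorted(ys))
    return sum(max(x - y, 0.0) for x, y in pairs) / len(xs)

def excess_countermonotone(xs, ys):
    """E (X - Y)_+ under the counter-monotonic coupling (k-th smallest with k-th largest)."""
    pairs = zip(sorted(xs), sorted(ys, reverse=True))
    return sum(max(x - y, 0.0) for x, y in pairs) / len(xs)

# Hypothetical equal-size samples standing in for Q and P.
q_sample = [0.5, 2.0, 3.5, 4.0]
p_sample = [1.0, 1.5, 3.0, 5.0]

low = excess_comonotone(q_sample, p_sample)        # unrestricted minimal extension
mid = excess_independent(q_sample, p_sample)       # value at the independence copula
high = excess_countermonotone(q_sample, p_sample)  # unrestricted maximal extension
print(low, mid, high)
```

In this example, NQD information narrows the range of possible risk excess from [low, high] to [mid, high], and PQD information narrows it to [low, mid].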

### Remark 13

• Taking for $$\mathcal {D}$$ the two-sided dependence information

$$\mathcal{D}=\mathcal{D}(C_{0},C_{1})=\{C\in \mathcal{C}; C_{1}\le C\le C_{0}\},$$

we obtain for $$D_{+}^{\mathcal {D},inf}$$ the same formula as in (59) and for $$D_{+}^{\mathcal {D},sup}$$ the same formula as in (60). Thus, this information simultaneously shrinks the upper and the lower bound for the risk excess.

• The concept of minimal, resp., maximal risk excess can also be introduced for the general case (E,≤) and general compound risk excess measures $$D_{+}^{c}$$. In this case, $$\mathcal {D}$$ denotes a class of dependence structures of random elements $$X,Y\in E$$. Even if $$D_{+}^{inf}$$ and $$D_{+}^{sup}$$ do not satisfy on the level of distributions the risk excess measure axioms (A2) and (A4), they describe the relevant bounds for the risk excess with dependence information $$\mathcal {D}$$.

## Conclusion

We proposed a quantitative one-sided comparison of probabilistic risks via the concept of risk excess measures, obtained as order extensions of hemi-metrics on the underlying space E. As in the case of risk measures, the choice of a suitable hemi-metric and corresponding excess risk measure for a particular application will depend on the problem considered and the notion of order one wants to quantify. For reliability, insurance mathematics, finance, epidemiology, etc., different notions of orders and distances are related to the problem at hand. In this regard, the examples proposed, together with their explicit formulas, are helpful. Together with the extension/restriction properties of Section 3, and the dual representations of Sections 4 and 5, they can serve as a guide for the interpretation of the excess risk measure and its coherence w.r.t. order and distance on the ambient space E.

We left aside the statistical aspects, but let us just mention that one can obtain empirical versions of the various risk excess measures D+(P,Q) presented here by replacing P,Q in their definitions by the corresponding empirical measures $$P_{n},Q_{n}$$. For excess risk measures which have an explicit formula, statistical estimation is straightforward by plugging in the empirical measures $$P_{n},Q_{n}$$ instead of P,Q. For the $$\mathcal {F}$$-induced risk excess measures of Section 3, and for risk excess measures obtained by minimal and maximal extensions (Sections 4 and 5) of a compound one, their dual representation as a supremum (or infimum) over a functional class allows one to consider their estimation via Glivenko–Cantelli-type theorems indexed by function classes. This is one supplementary interest of these dual formulations. For example, for the $$\mathcal {F}$$-induced risk excess measure of (26), since $$x_{+}\le |x|$$, one obviously has that

$$D_{+}^{\mathcal{F}}(Q_{n},P_{n})=\sup_{f\in \mathcal{F}}\left(\int f d(Q_{n}-P_{n})\right)_{+}\le \sup_{f\in \mathcal{F}}\left|\int f d(Q_{n}-P_{n})\right|,$$

i.e., the risk excess measure is majorized by the corresponding integral probability metric, and the convergence of the latter follows from classical results on abstract empirical processes, see e.g., Sriperumbudur et al. (2012).
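The plug-in bound above can be illustrated on a concrete function class. The following is a minimal sketch, assuming $$\mathcal{F}$$ is the class of indicators $$1_{(t,\infty)}$$, $$t\in\mathbb{R}$$ (a hypothetical choice made here for illustration, not necessarily the class used in (26)). In that case the left side is a one-sided Kolmogorov-Smirnov-type statistic on survival functions and the right side is its two-sided counterpart. The function names and sample values are likewise hypothetical.

```python
def survival(sample, t):
    """Empirical survival function: fraction of observations strictly above t."""
    return sum(1 for v in sample if v > t) / len(sample)

def survival_differences(q_sample, p_sample):
    """Values of Q_n((t, inf)) - P_n((t, inf)) at all pooled sample points.

    The difference is piecewise constant and vanishes outside the range of the
    pooled sample, so these thresholds cover every value it takes.
    """
    thresholds = sorted(set(q_sample) | set(p_sample))
    return [survival(q_sample, t) - survival(p_sample, t) for t in thresholds]

def one_sided_excess(q_sample, p_sample):
    """sup over the class of the positive part of the integral against Q_n - P_n."""
    return max(max(d, 0.0) for d in survival_differences(q_sample, p_sample))

def two_sided_metric(q_sample, p_sample):
    """Corresponding integral probability metric (absolute value instead of positive part)."""
    return max(abs(d) for d in survival_differences(q_sample, p_sample))

# Hypothetical samples standing in for draws from Q and P.
q_sample = [0.5, 2.0, 3.5, 4.0]
p_sample = [1.0, 1.5, 3.0, 5.0]

print(one_sided_excess(q_sample, p_sample), two_sided_metric(q_sample, p_sample))
```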

## References

• Artzner, P, Delbaen, F, Eber, J-M, Heath, D: Coherent measures of risk. Math. Finance. 9(3), 203–228 (1999). https://doi.org/10.1111/1467-9965.00068.

• Berkes, I, Philipp, W: Approximation theorems for independent and weakly dependent random vectors. Ann. Probab. 7(1), 29–54 (1979).

• Burgert, C, Rüschendorf, L: Consistent risk measures for portfolio vectors. Insurance Math. Econom. 38(2), 289–297 (2006). https://doi.org/10.1016/j.insmatheco.2005.08.008.

• Cambanis, S, Simons, G, Stout, W: Inequalities for E k(X,Y) when the marginals are fixed. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete. 36(4), 285–294 (1976). https://doi.org/10.1007/BF00532695.

• Capéraà, P, Van Cutsem, B: Méthodes et Modèles en Statistique Non Paramétrique. Les Presses de l’Université Laval, Sainte-Foy, QC; Dunod, Paris (1988). Exposé fondamental. [Basic exposition], With a foreword by Capéraà, Van Cutsem and Alain Baille.

• Delbaen, F: Coherent risk measures on general probability spaces. In: Advances in Finance and Stochastics, pp. 1–37. Springer, Berlin (2002).

• Dudley, RM: Distances of probability measures and random variables. Ann. Math. Statist. 39, 1563–1572 (1968). https://doi.org/10.1007/978-1-4419-5821-1_4.

• Dudley, RM: Probabilities and Metrics. Matematisk Institut, Aarhus Universitet, Aarhus (1976). Convergence of laws on metric spaces, with a view to statistical testing, Lecture Notes Series, No. 45.

• Dudley, RM: Real Analysis and Probability. Cambridge Studies in Advanced Mathematics, Vol. 74. Cambridge University Press, Cambridge (2002). https://doi.org/10.1017/CBO9780511755347. Revised reprint of the 1989 original.

• Faugeras, OP, Rüschendorf, L: Markov morphisms: a combined copula and mass transportation approach to multivariate quantiles. Math. Applicanda. 45, 3–45 (2017).

• Föllmer, H, Schied, A: Stochastic Finance. De Gruyter Studies in Mathematics, Vol. 27. Walter de Gruyter & Co., Berlin (2002). https://doi.org/10.1515/9783110198065. An introduction in discrete time.

• Goubault-Larrecq, J: Non-Hausdorff Topology and Domain Theory. New Mathematical Monographs, Vol. 22. Cambridge University Press, Cambridge (2013). https://doi.org/10.1017/CBO9781139524438. [On the cover: Selected topics in point-set topology].

• Jouini, E, Meddeb, M, Touzi, N: Vector-valued coherent risk measures. Finance Stoch. 8(4), 531–552 (2004). https://doi.org/10.1007/s00780-004-0127-6.

• Kellerer, HG: Duality theorems for marginal problems. Z. Wahrsch. Verw. Gebiete. 67(4), 399–432 (1984). https://doi.org/10.1007/BF00532047.

• Koenker, R: Quantile Regression. Econometric Society Monographs, Vol. 38. Cambridge University Press, Cambridge (2005). https://doi.org/10.1017/CBO9780511754098.

• Lehmann, EL: Some concepts of dependence. Ann. Math. Statist. 37, 1137–1153 (1966). https://doi.org/10.1214/aoms/1177699260.

• Marshall, AW, Olkin, I, Arnold, BC: Inequalities: Theory of Majorization and Its Applications. 2nd edn. Springer Series in Statistics. Springer (2011). https://doi.org/10.1007/978-0-387-68276-1.

• Müller, A: Integral probability metrics and their generating classes of functions. Adv. in Appl. Probab. 29(2), 429–443 (1997).

• Nachbin, L: Topology and Order. Translated from the Portuguese by Lulu Bechtolsheim. Van Nostrand Mathematical Studies, No. 4. D. Van Nostrand Co., Inc., Princeton, N.J.-Toronto, Ont.-London (1965).

• Nelsen, RB: An Introduction to Copulas. 2nd edn. Springer Series in Statistics. Springer, New York (2006).

• Rachev, ST, Rüschendorf, L: Approximation of sums by compound Poisson distributions with respect to stop-loss distances. Adv. in Appl. Probab. 22(2), 350–374 (1990).

• Rachev, ST: Probability Metrics and the Stability of Stochastic Models. Wiley Series in Probability and Mathematical Statistics: Applied Probability and Statistics. John Wiley & Sons, Ltd., Chichester (1991).

• Rachev, ST, Rüschendorf, L: Mass Transportation Problems. Vol. I. Probability and its Applications (New York), Vol. 1. Springer-Verlag, New York (1998). Theory.

• Rachev, ST, Klebanov, LB, Stoyanov, SV, Fabozzi, FJ: The Methods of Distances in the Theory of Probability and Statistics. Springer (2013). https://doi.org/10.1007/978-1-4614-4869-3.

• Rosenberger, J, Gasko, M: Understanding robust and exploratory data analysis. Wiley Classics Library. Wiley-Interscience, New York (2000). Chap. Comparing Location Estimators: Trimmed Means, Medians, and Trimean. Revised and updated reprint of the 1983 original.

• Rüschendorf, L: Monotonicity and unbiasedness of tests via a.s. constructions. Statistics. 17(2), 221–230 (1986). https://doi.org/10.1080/02331888608801931.

• Rüschendorf, L: Fréchet bounds and their applications. In: Dall’Aglio, G, Kotz, S, Salinetti, G (eds.) Advances in Probability Distributions with Given Marginals: Beyond the Copulas, pp. 151–187. Springer, Dordrecht (1991). https://doi.org/10.1007/978-94-011-3466-8.

• Rüschendorf, L: On the distributional transform, Sklar’s theorem, and the empirical copula process. J. Statist. Plann. Inference. 139(11), 3921–3927 (2009). https://doi.org/10.1016/j.jspi.2009.05.030.

• Rüschendorf, L: Mathematical Risk Analysis. Springer Series in Operations Research and Financial Engineering. Springer (2013). https://doi.org/10.1007/978-3-642-33590-7. Dependence, risk bounds, optimal allocations and portfolios.

• Sriperumbudur, BK, Fukumizu, K, Gretton, A, Schölkopf, B, Lanckriet, GRG: On the empirical estimation of integral probability metrics. Electron. J. Stat. 6, 1550–1599 (2012).

• Strassen, V: The existence of probability measures with given marginals. Ann. Math. Statist. 36, 423–439 (1965). https://doi.org/10.1214/aoms/1177700153.

• Tchen, AH: Inequalities for distributions with given marginals. Ann. Probab. 8(4), 814–827 (1980).

• Villani, C: Topics in Optimal Transportation. Graduate Studies in Mathematics, Vol. 58. American Mathematical Society (2003). https://doi.org/10.1007/b12016.

• Zolotarev, VM: Modern Theory of Summation of Random Variables. Modern Probability and Statistics. VSP, Utrecht (1997). https://doi.org/10.1515/9783110936537.

## Author information

### Corresponding author

Correspondence to Olivier P. Faugeras.

## Ethics declarations

### Competing interests

The authors declare that they have no competing interests. 