
Affine processes under parameter uncertainty

A Publisher Correction to this article was published on 26 August 2019

This article has been updated


We develop a one-dimensional notion of affine processes under parameter uncertainty, which we call nonlinear affine processes. This is done as follows: given a set Θ of parameters for the process, we construct a corresponding nonlinear expectation on the path space of continuous processes. By a general dynamic programming principle, we link this nonlinear expectation to a variational form of the Kolmogorov equation, where the generator of a single affine process is replaced by the supremum over all corresponding generators of affine processes with parameters in Θ. This nonlinear affine process yields a tractable model for Knightian uncertainty, especially for modelling interest rates under ambiguity.

We then develop an appropriate Itô formula, the respective term-structure equations, and study the nonlinear versions of the Vasiček and the Cox–Ingersoll–Ross (CIR) model. Thereafter, we introduce the nonlinear Vasiček–CIR model. This model is particularly suitable for modelling interest rates when one does not want to restrict the state space a priori and hence this approach solves the modelling issue arising with negative interest rates.

1 Introduction

The modelling of a dynamic and unpredictable phenomenon like stock markets or interest rate markets is often approached via choosing an appropriate stochastic model. In many cases, the choice of the model is a delicate and difficult question. In complex dynamic environments like financial markets it is rather the rule than the exception that unforeseen events lead to difficulties with the a priori chosen model and improvements of the model have to be developed and implemented.

A prominent example in this direction is the role of affine short-rate models in the last 20 years: around 2000, the property of the Vasiček model that interest rates can become negative was heavily criticized and the non-negative Cox–Ingersoll–Ross (CIR) model was preferred. The consequences of the financial crisis of 2007–2008, leading to negative interest rates in the Euro zone, rendered the CIR model no longer applicable and led to a resurgence of the Vasiček model.

This example illustrates the important question of model uncertainty, which is one of the most important topics in applied sciences and, in particular, plays a prominent role in finance, not only since the financial crisis. The apparent risk of losses due to model mis-specification, called model risk, fostered the development of strategies which are robust against model risk, typically leading to nonlinear pricing rules. These robust strategies play a prominent role in the literature, see Denis and Martini (2006); Cont (2006); Eberlein et al. (2014); Madan (2016); Acciaio et al. (2016); Muhle-Karbe and Nutz (2018); Bielecki et al. (2018), and the book Guyon and Henry-Labordère (2013), to name just a few references in this direction.

A key observation in these works is that the single probability measure used in the classical approaches to specify a model must be replaced by a family of probability measures (i.e., a full class of models). Such an approach is very natural from the statistical viewpoint: when a model has certain parameters to be estimated, the estimators carry statistical uncertainty and one considers confidence intervals instead, corresponding to a family of probability measures. The latter formulation of model risk is typically referred to as parameter uncertainty, see Avellaneda et al. (1995); Wilmott and Oztukel (1998), and Fouque and Ren (2014), and is a major motivation for our research.

Examples in this direction are the notions of g-Brownian motion and G-Brownian motion referring to a Brownian motion with drift or volatility uncertainty, see Peng (1997); Peng (2007a); Peng (2007b) and references therein. Most recently, this theory has been extended to more general approaches, so-called nonlinear Lévy processes, see Neufeld and Nutz (2017) and Denk et al. (2017) in this regard.

Here, we generalize this notion to affine processes under parameter uncertainty (called nonlinear affine processes). While a classical affine process corresponds to a single semimartingale law, we represent the affine process under parameter uncertainty by a family of semimartingale laws whose differential characteristics are bounded from above and below by affine functions of the current states. Nonlinear Lévy processes constitute the special case where the bounds do not depend on the state of the process. It seems important to stress that for affine processes the bounds on drift and volatility are allowed to depend on the state of the process (in an affine way, however). On the other hand, this state dependence leads to a number of additional difficulties and we therefore restrict ourselves to the simplest case, namely, the one-dimensional case without jumps.

It is our aim to provide the appropriate tools for incorporating parameter uncertainty in the prominent class of affine models. This naturally leads to a nonlinear version of affine processes and associated nonlinear expectations. After having established a dynamic programming principle, we establish the connection to the nonlinear Kolmogorov equation. This allows us to study a number of interesting further developments, such as a nonlinear version of the Itô formula and nonlinear affine term structure equations.

We also provide a number of examples: in addition to a nonlinear variant of the Black–Scholes model and nonlinear Vasiček and CIR models we also introduce a nonlinear Vasiček–CIR model. In the latter model one can incorporate negative interest rates in combination with a CIR-like behaviour, solving the problem raised in many practical applications when the state space needed to be restricted to positive interest rates (see Carver (2012)).

The paper is organized as follows: in Section 2, we introduce nonlinear affine processes. In Section 3 we prove that a dynamic programming principle holds. Section 4 provides the nonlinear Kolmogorov equation and, as examples, the nonlinear Vasiček model and the nonlinear CIR model. In Section 5, we provide a nonlinear Itô formula together with some examples and, in Section 6, we study (nonlinear) affine term structure models. Section 7 studies the application to model risk and Section 8 concludes.

2 Setup

We begin with a short review of continuous affine processes in one dimension. For a detailed exposition we refer to Duffie et al. (2003) and Filipović (2009). Consider the canonical state space, which is either \(\mathcal {X} = \mathbb {R}\) or \(\mathcal {X}=\mathbb {R}_{>0}\). A (time-homogeneous) Markov process X with values in the state space \(\mathcal {X}\) is called affine if the conditional characteristic function of X is exponentially affine. This means that there exist \(\mathbb {C}\)-valued functions ϕ(t,u) and ψ(t,u), respectively, such that

$$\mathbb{E}\left[e^{u X_{T}}\mid X_{t}\right] = e^{\phi (T-t,u)+\psi(T-t,u)X_{t} } $$

for all complex \(u \in \{ ix: x \in \mathbb {R}\}, 0 \le t \le T\). The key for our nonlinear formulation will be a characterization of X in terms of stochastic differential equations: more precisely, the affine process X is the unique strong solution of the stochastic differential equation

$$ dX_{t} = \left(b^{0}+b^{1} X_{t}\right)dt + \sqrt{a^{0}+a^{1} X_{t}}\, dW_{t}, \qquad X_{0}=x, $$

where the drift parameter \(b^{0}+b^{1}X_{t}\) and the diffusion parameter \(a^{0}+a^{1}X_{t}\) depend on the current value of X in an affine way. Here, the process W is a standard Brownian motion. It should be noted that, depending on the state space, not all parameter combinations are possible, but only those combinations which are admissible in the sense made precise in Theorem 10.2 in Filipović (2009). For our case, this implies that if on the one side \(\mathcal {X}=\mathbb {R}\), we necessarily have \(a^{1}=0\) and \(a^{0}>0\) and, on the other side, if \(\mathcal {X}=\mathbb {R}_{>0}\), we obtain \(a^{0}=0\), \(a^{1}>0\), and \(b^{0}>0\). In addition, the coefficients ϕ and ψ solve ODEs (classified as Riccati equations), which is the essence for the high degree of tractability of affine processes in the sense that explicit calculations are possible or efficient numerical methods are obtainable; see Duffie et al. (2003); Filipović (2009) for details and applications in this regard.
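To make the tractability claim concrete, the following sketch integrates the Riccati system numerically. The explicit form of the ODEs is not spelled out in the text; the code assumes the standard one obtained by inserting \(e^{\phi+\psi x}\) into the generator of the SDE above, namely \(\partial_{t}\phi = b^{0}\psi + \tfrac{1}{2}a^{0}\psi^{2}\), \(\partial_{t}\psi = b^{1}\psi + \tfrac{1}{2}a^{1}\psi^{2}\) with \(\phi(0,u)=0\), \(\psi(0,u)=u\). The function names `riccati_rhs`, `solve_riccati`, and `vasicek_closed_form` are hypothetical and chosen for illustration only.

```python
import cmath


def riccati_rhs(psi, b0, b1, a0, a1):
    """Right-hand sides of the (assumed) Riccati system for (phi, psi)."""
    dphi = b0 * psi + 0.5 * a0 * psi * psi
    dpsi = b1 * psi + 0.5 * a1 * psi * psi
    return dphi, dpsi


def solve_riccati(u, t, b0, b1, a0, a1, n_steps=2000):
    """Integrate phi(0)=0, psi(0)=u up to time t with the classical RK4 scheme."""
    h = t / n_steps
    phi, psi = 0j, complex(u)
    for _ in range(n_steps):
        # The right-hand side depends on psi only, so phi is a pure quadrature.
        k1 = riccati_rhs(psi, b0, b1, a0, a1)
        k2 = riccati_rhs(psi + 0.5 * h * k1[1], b0, b1, a0, a1)
        k3 = riccati_rhs(psi + 0.5 * h * k2[1], b0, b1, a0, a1)
        k4 = riccati_rhs(psi + h * k3[1], b0, b1, a0, a1)
        phi += h / 6.0 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        psi += h / 6.0 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return phi, psi


def vasicek_closed_form(u, t, b0, b1, a0):
    """Explicit solution in the Gaussian case a1 = 0 (requires b1 != 0)."""
    psi = u * cmath.exp(b1 * t)
    phi = (b0 * u * (cmath.exp(b1 * t) - 1) / b1
           + 0.25 * a0 * u * u * (cmath.exp(2 * b1 * t) - 1) / b1)
    return phi, psi


if __name__ == "__main__":
    num = solve_riccati(1j, 1.0, b0=0.5, b1=-1.0, a0=0.04, a1=0.0)
    exact = vasicek_closed_form(1j, 1.0, b0=0.5, b1=-1.0, a0=0.04)
    print(abs(num[0] - exact[0]), abs(num[1] - exact[1]))
```

In the Gaussian case the numerical and the explicit solution should agree up to discretization error, which gives a simple way to validate the integrator before using it with \(a^{1}>0\).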

2.1 Nonlinear affine processes

In this section, we introduce the necessary tools for defining affine processes under parameter uncertainty. To this end, fix a final time horizon T>0 and let Ω=C([0,T]) be the canonical space of continuous, one-dimensional paths. We endow Ω with the topology of uniform convergence and denote by \({\mathscr F}\) its Borel σ-field. Let X be the canonical process \(X_{t}(\omega)=\omega_{t}\), and let \(\mathbb {F}= ({\mathscr F}_{t})_{t \geq 0}\) with \({\mathscr {F}}_{t} = \sigma (X_{s}, 0 \leq s \leq t)\) be the (raw) filtration generated by X.

As we are interested in semimartingale laws on Ω, we begin by denoting by \(\mathfrak {P} (\Omega)\) the Polish space of all probability measures on Ω equipped with the topology of weak convergence. The process X will be called a (continuous) P-\(\mathbb {F}\)-semimartingale, for \(P \in \mathfrak {P}(\Omega)\), if there exist processes \(B=B^{P}\) and \(M=M^{P}\) such that \(X=X_{0}+B+M\), where B has continuous paths of (locally) finite variation P-a.s., M is a continuous P-\(\mathbb {F}\)-local martingale and \(B_{0}=M_{0}=0\).

It will be important in the following that, by Proposition 2.2 in Neufeld and Nutz (2014), X is a P-\(\mathbb {F}\)-semimartingale if and only if it is a P-semimartingale with respect to the right-continuous filtration \(\mathbb {F}_{+}=({\mathscr {F}}_{t+})_{t \ge 0}\) or with respect to the usual augmentation \(\mathbb {F}_{+}^{P}\); here \({\mathscr {F}}_{t+}=\cap _{s > t}{\mathscr {F}}_{s}\). Hence, in the following, we consider semimartingales with respect to the raw filtration \(\mathbb {F}\).

The P-\(\mathbb {F}\)-characteristics of a continuous semimartingale \(X=X_{0}+B^{P}+M^{P}\) in the above representation is the pair \((B^{P},C)\), where \(C=\langle M^{P}\rangle\). The non-negative process C does not depend on P, as the quadratic variation is a path property. For the following, we will focus on semimartingales where the semimartingale characteristics are absolutely continuous (a.c.), i.e., there exist predictable processes \(\beta^{P}\) and α≥0, such that

$$ B^{P} = {\int_{0}^{\cdot}} {\beta^{P}_{s}} ds, \quad C = {\int_{0}^{\cdot}} {\alpha_{s} ds}. $$

A probability measure \(P \in \mathfrak {P}(\Omega)\) is called a semimartingale law for X, if X is a P-\(\mathbb {F}\)-semimartingale. We denote by

$$\mathfrak{P}_{\text{sem}}^{\text{ac}}= \left\{P \in \mathfrak{P}(\Omega) \,|\, X\ \text{is a } P\text{-}\mathbb{F}\text{-semimartingale with a.c.\ characteristics}\right\} $$

the set of all semimartingale laws of X which have absolutely continuous characteristics. In the following, we denote by \((\beta^{P},\alpha)\), as in (2), the differential characteristics of X under \(P\in \mathfrak {P}_{\text {sem}}^{\text {ac}}\).

Our main goal is to allow for a specific version of model risk in the sense that there is uncertainty on the parameter vector \(\theta=(b^{0},b^{1},a^{0},a^{1})\) of the affine process. We assume that there is additional information on bounds on the parameter vector θ and denote these finite bounds by \(\underline b^{i}, \bar b^{i}, \underline a^{i}, \bar a^{i}, i=0,1\), respectively. This leads to the compact set

$$ \Theta = \left[\underline b^{0},\bar b^{0}\right] \times \left[\underline b^{1},\bar b^{1}\right] \times \left[\underline a^{0},\bar a^{0}\right] \times \left[\underline a^{1}, \bar a^{1}\right] \subset \mathbb{R}^{2} \times \mathbb{R}_{\ge 0}^{2}. $$

We are interested in the intervals generated by the associated affine functions. In this regard, let \(B:=\left [\underline b^{0},\bar b^{0}\right ] \times \left [\underline b^{1},\bar b^{1}\right ]\) and \(A:=\left [\underline a^{0},\bar a^{0}\right ] \times \left [\underline a^{1},\bar a^{1}\right ]\). Moreover, for \(a \in \mathbb {R}^{2}\) we write \(a=(a^{0},a^{1})\), and similarly for \(b\in \mathbb {R}^{2}\). Furthermore, let

$$ \begin{aligned} b^{*}(x) &:= \left\{b^{0}+b^{1}x: b \in B \right\}, \\ a^{*}(x) &:= \left\{a^{0}+a^{1}x^{+}: a \in A \right\} \end{aligned} $$

for \(x \in \mathbb {R}\) denote the associated set-valued functions. As the state space will, in general, be \(\mathbb {R}\), we have to ensure non-negativity of the quadratic variation, which is achieved using \((\cdot)^{+}:=\max\{\cdot,0\}\) in the definition of \(a^{*}\). Due to the nice structure of Θ, the sets are always intervals: indeed,

$$ \begin{aligned} b^{*}(x) &= \left[\underline b^{0}+ \left(\underline b^{1} {\mathbbm{1}}_{\{{x \ge 0}\}} + \bar b^{1} {\mathbbm{1}}_{\{{x <0}\}}\right) x, \bar b^{0}+ \left(\bar b^{1} {\mathbbm{1}}_{\{{x \ge 0}\}} + \underline b^{1} {\mathbbm{1}}_{\{{x <0}\}}\right) x \right], \\ a^{*}(x) &= [\underline a^{0}+\underline a^{1} x^{+}, \bar a^{0}+ \bar a^{1} x^{+}]. \end{aligned} $$
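The interval representation can be sketched in a few lines of code. The helper names `b_star` and `a_star` and the argument names are hypothetical; the formulas are the two displayed above, including the case distinction on the sign of x for the drift.

```python
def b_star(x, b0_lo, b0_hi, b1_lo, b1_hi):
    """Endpoints of the drift interval b*(x) for the box B."""
    if x >= 0:
        return (b0_lo + b1_lo * x, b0_hi + b1_hi * x)
    # For x < 0 the roles of the slope bounds are swapped.
    return (b0_lo + b1_hi * x, b0_hi + b1_lo * x)


def a_star(x, a0_lo, a0_hi, a1_lo, a1_hi):
    """Endpoints of the diffusion interval a*(x), using the positive part of x."""
    x_plus = max(x, 0.0)
    return (a0_lo + a1_lo * x_plus, a0_hi + a1_hi * x_plus)


if __name__ == "__main__":
    print(b_star(-2.0, 0.0, 1.0, -1.0, 0.5))
    print(a_star(-2.0, 0.1, 0.2, 0.5, 1.0))
```

Because \(b^{0}+b^{1}x\) is linear in \((b^{0},b^{1})\), the endpoints coincide with the minimum and maximum over the four corners of B, which is a convenient independent check of the case distinction.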

Clearly, it is possible to consider a more general Θ; however, this is not our focus here.

Definition 1

Let Θ be a set as in (3) with associated \(a^{*}\), \(b^{*}\) as in (4), let \(t\in[0,T]\), and let \(P\in \mathfrak {P}_{\text {sem}}^{\text {ac}}\) be a semimartingale law. We call P affine-dominated on (t,T] by Θ, if \((\beta^{P},\alpha)\) satisfy

$$ \beta^{P}_{s} \in b^{*}(X_{s}), \quad\ \text{and}\ \quad \alpha_{s} \in a^{*}(X_{s}), $$

for \(dP\otimes dt\)-almost all \((\omega,s)\in\Omega\times(t,T]\). If t=0, we call P affine-dominated by Θ.

An affine process can be uniquely characterized by its transition probabilities. We will make use of this fact for characterizing affine processes under model uncertainty. Moreover, we denote by \({\mathscr {O}}\) the considered state space, which will be either \(\mathbb {R}, \mathbb {R}_{\ge 0}\), or \(\mathbb {R}_{>0}\).

Definition 2

Let Θ be a set as in (3) with associated \(a^{*}\), \(b^{*}\) as in (4). A nonlinear affine process starting at \(x\in {\mathscr {O}}\) is the family of semimartingale laws \(P \in \mathfrak {P}_{\text {sem}}^{\text {ac}}\), such that

  1. (i)

    \(P(X_{0}=x)=1\), and

  2. (ii)

    P is affine-dominated by Θ.

As explained in the introduction, parameter uncertainty is represented by a family of models replacing the single model in the approaches without uncertainty: according to Definition 2, the affine process under parameter uncertainty is represented by a family of semimartingale laws instead of a single one. We denote by \({\mathcal {A}}(x,\Theta)\) the set of semimartingale laws \(P\in \mathfrak {P}_{\text {sem}}^{\text {ac}}\) satisfying \(P(X_{0}=x)=1\) and being affine-dominated by Θ. Intuitively, this corresponds to a nonlinear affine process starting in x.
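A member of this family need not be an affine process itself: the parameters may vary with time and state as long as they stay inside Θ. The following Euler-type sketch illustrates this on a time grid. It is only a discretized illustration under assumptions not made in the text: the function `simulate_dominated`, the `selector` callback, and the tuple layout of `theta_box` are hypothetical, and the discretization is not claimed to converge to a particular law.

```python
import math
import random


def simulate_dominated(x0, theta_box, selector, T=1.0, n_steps=500, seed=0):
    """Euler path whose drift and variance rates stay inside b*(X) and a*(X).

    theta_box = ((b0_lo, b0_hi), (b1_lo, b1_hi), (a0_lo, a0_hi), (a1_lo, a1_hi))
    selector(t, x) returns a parameter vector (b0, b1, a0, a1) inside theta_box.
    """
    rng = random.Random(seed)
    dt = T / n_steps
    x, path = x0, [x0]
    for k in range(n_steps):
        theta = selector(k * dt, x)
        for value, (lo, hi) in zip(theta, theta_box):
            if not lo <= value <= hi:
                raise ValueError("selector left the parameter box")
        b0, b1, a0, a1 = theta
        beta = b0 + b1 * x               # drift rate, an element of b*(x)
        alpha = a0 + a1 * max(x, 0.0)    # variance rate, an element of a*(x)
        x += beta * dt + math.sqrt(alpha * dt) * rng.gauss(0.0, 1.0)
        path.append(x)
    return path


if __name__ == "__main__":
    box = ((0.0, 1.0), (-1.0, 0.0), (0.01, 0.04), (0.0, 0.0))
    # Switch from the upper to the lower drift bound once the path exceeds 1.
    rule = lambda t, x: (1.0, -1.0, 0.04, 0.0) if x < 1.0 else (0.0, -1.0, 0.01, 0.0)
    print(simulate_dominated(0.5, box, rule)[-1])
```

The state-dependent switching rule in the usage example produces dynamics that are not affine for any single θ, yet every step respects the bounds of Definition 1.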

It is well known that the state space \({\mathscr {O}}\) needs to be chosen in correspondence with the choice of Θ: indeed, the squared Bessel process is an affine process with state space \(\mathbb {R}_{>0}\) (see, for example, Karatzas and Shreve (1988), Prop. 3.22 of Chap. 3) and the set \({\mathcal {A}}(x,\Theta)\) will be empty for x<0. To exclude additional difficulties in this direction, we call a family of nonlinear affine processes \(({\mathcal {A}}(x,\Theta))_{x \in {\mathscr {O}}}\) with state space \({\mathscr {O}}\) proper, if either \(\underline a^{0}>0\) holds, or \(\underline a^{0}=\bar a^{0}=0\) and \(\underline b^{0} \ge \nicefrac {\bar a^{1}} 2>0\).
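The properness condition is easy to check mechanically. The sketch below reads the first alternative as \(\underline a^{0}>0\), which is how the subsequent paragraph uses it; the function name `is_proper` and its argument names are hypothetical.

```python
def is_proper(a0_lo, a0_hi, a1_hi, b0_lo):
    """Check the properness condition for the bounds of the parameter box."""
    if a0_lo > 0:
        # Non-degenerate case: the lower bound of the constant variance part is positive.
        return True
    # CIR-type case: no constant variance part and a Feller-type drift condition.
    return a0_lo == 0 and a0_hi == 0 and b0_lo >= a1_hi / 2 > 0


if __name__ == "__main__":
    print(is_proper(a0_lo=0.01, a0_hi=0.04, a1_hi=0.0, b0_lo=-1.0))
    print(is_proper(a0_lo=0.0, a0_hi=0.0, a1_hi=0.5, b0_lo=0.25))
    print(is_proper(a0_lo=0.0, a0_hi=0.0, a1_hi=0.5, b0_lo=0.1))
```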

It is clear that in the case with \({\mathscr {O}}=\mathbb {R}\) the assumption \(\underline a^{0} > 0\) is sufficient for reaching the full state space. The case with non-negative state spaces is more delicate. We concentrate on the case \({\mathscr {O}}=\mathbb {R}_{>0}\). The following proposition gives a sufficient condition in this regard. Moreover, it shows that the nonlinear affine process does not reach zero, in the sense that the event of reaching 0 has zero probability under all \(P \in {\mathcal {A}}(x,\Theta)\).

Proposition 1

Let x>0, and assume that \(\underline a^{0}=\bar a^{0}=0\) and \(\underline b^{0} \ge \nicefrac {\bar a^{1}} 2>0\). Then, for any \(P\in {\mathcal {A}}(x,\Theta)\), it holds that

$$P(X_{t} >0, \ 0 \le t \le T) = 1. $$


Proof

Let \(P \in \mathcal {A}(x,\Theta)\), denote by \(\beta^{P}\) and α the associated processes from Eq. 2, and denote by \(M^{P}\) the P-local martingale part of the P-semimartingale X. Moreover, for any c≥0 define

$$\tau_{c}:=\inf\{t\geq 0\colon X_{t}=c\}. $$

We need to show for any time \({\mathcal {T}}>0\) that \(P[\tau _{0}\leq {\mathcal {T}}]=0\). To that end, fix an arbitrary \({\mathcal {T}}>0\). We adapt the method in Gikhman (2011) to our setting: let ε be such that 0<ε<x. Note that by continuity of the paths of X, we have that \(X \ge \varepsilon > 0\) on \([\![0,\tau_{\varepsilon}]\!]\). Moreover, by assumption,

$$ m := \frac{2\underline b^{0} - \bar a^{1}}{\bar a^{1}} > 0. $$

Itô’s formula yields for any t≥0 that

$$\begin{aligned} (X_{\tau_{\varepsilon} \wedge t})^{-m} & \,=\, (X_{0})^{-m} -\! \int_{0}^{\tau_{\varepsilon} \wedge t} m X_{s}^{-(m+1)} \beta^{P}_{s} \,ds \,+\, {\frac{1}{2}} \int_{0}^{\tau_{\varepsilon} \wedge t} m(m+1) X_{s}^{-(m+2)} \alpha_{s} \,ds \\ & \quad -\int_{0}^{\tau_{\varepsilon} \wedge t} m X_{s}^{-(m+1)} \,dM^{P}_{s}. \end{aligned} $$

Define the processes M and \(\tilde M\) by

$$\begin{aligned} M_{t} & =-\int_{0}^{t} m X_{s}^{-(m+1)} \,dM^{P}_{s}, \qquad t\geq 0,\\ \tilde M_{t} & = \int_{0}^{t} (X_{\tau_{\varepsilon} \wedge s})^{-(m+1)} dM^{P}_{s}, \qquad t\geq 0. \end{aligned} $$

Clearly, both M and \(\tilde M\) are local martingales and \(M=-m\tilde M\) on \([\![0,\tau_{\varepsilon}]\!]\). We now show that \(\tilde M\) is a true martingale. By the Burkholder–Davis–Gundy inequality, since \(X \ge \varepsilon\) on \([\![0,\tau_{\varepsilon}]\!]\), we obtain that

$$\begin{aligned} E\left[\sup_{0 \le s \le T} |\tilde M_{s}|\right] &\le C E\left[\left(\int_{0}^{T} X_{\tau_{\varepsilon}\wedge s}^{-2(m+1)} \alpha_{\tau_{\varepsilon}\wedge s} ds \right)^{\nicefrac 1 2 }\right] \\ &\le C E\left[\left(\bar a^{1} \int_{0}^{T} X_{\tau_{\varepsilon}\wedge s}^{-2(m+1)+1} ds \right)^{\nicefrac 1 2 }\right] \\ &\le C (\bar a^{1} \varepsilon^{-2(m+1)+1}T)^{\nicefrac{1}{2}}< \infty \end{aligned} $$

and hence \(\tilde M\) is indeed a martingale. Therefore, M is a true martingale on \([\![0,\tau_{\varepsilon}]\!]\). Next, since \(P\in {\mathcal {A}}(x,\Theta)\) and m>0, we obtain the estimate

$$\begin{aligned} & X_{\tau_{\varepsilon} \wedge t}^{-m}\\ & \le x^{-m} - \int_{0}^{\tau_{\varepsilon} \wedge t} m X_{s}^{-(m+1)} \left(\underline b^{0} + \underline b^{1} X_{s}\right) \, ds\\ &\quad + {\frac{1}{2}} \int_{0}^{\tau_{\varepsilon} \wedge t} m(m+1) X_{s}^{-(m+2)} \bar a^{1} X_{s} \, ds + M_{\tau_{\varepsilon} \wedge t}\\ & = x^{-m} - m \underline b^{1} \int_{0}^{\tau_{\varepsilon} \wedge t} X_{s}^{-m} \, ds + \int_{0}^{\tau_{\varepsilon} \wedge t} \left(\tfrac{m(m+1)\bar a^{1}}{2}-m\underline b^{0}\right) X_{s}^{-(m+1)} \, ds + M_{\tau_{\varepsilon} \wedge t}. \end{aligned} $$

Taking expectations and using that \(X \ge \varepsilon > 0\) on \([\![0,\tau_{\varepsilon}]\!]\) yields, in view of (7), for all t≥0 that

$$ \begin{aligned} E\left[(X_{\tau_{\varepsilon}\wedge t})^{-m}\right] & \le x^{-m} - m \underline b^{1} \int_{0}^{\tau_{\varepsilon} \wedge t} E\left[(X_{s})^{-m}\right] ds \\ & \le x^{-m} + m |\underline b^{1}| \int_{0}^{\tau_{\varepsilon} \wedge t} E\left[(X_{s})^{-m}\right] \,ds. \end{aligned} $$
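To see why the term involving \(X_{s}^{-(m+1)}\) no longer appears, note that the definition of m in (7) gives \(m+1 = 2\underline b^{0}/\bar a^{1}\), so that

$$ \frac{m(m+1)\bar a^{1}}{2}-m\underline b^{0} = m\left(\frac{(m+1)\bar a^{1}}{2}-\underline b^{0}\right) = m\left(\underline b^{0}-\underline b^{0}\right) = 0. $$

The stochastic integral M drops out as well, since it is a true martingale on \([\![0,\tau_{\varepsilon}]\!]\) and starts at zero.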

Moreover, as \(X \ge \varepsilon > 0\) on \([\![0,\tau_{\varepsilon}]\!]\), we obtain that

$${\begin{aligned} m |\underline b^{1}| \int_{0}^{\tau_{\varepsilon} \wedge t} E\left[(X_{s})^{-m}\right] \,ds = m |\underline b^{1}| \int_{0}^{t} {\mathbbm{1}}_{\{{s \le \tau_{\varepsilon}}\}} E\left[(X_{\tau_{\varepsilon}\wedge s})^{-m}\right] \,ds \leq m |\underline b^{1}|\int_{0}^{t} E\left[(X_{\tau_{\varepsilon} \wedge s })^{-m}\right] \,ds. \end{aligned}} $$

This together with (8) yields for any t≥0 that

$$E\left[\left(X_{\tau_{\varepsilon} \wedge t}\right)^{-m}\right] \le x^{-m} + m |\underline b^{1}|\int_{0}^{t} E\left[(X_{\tau_{\varepsilon}\wedge s})^{-m}\right] \,ds. $$

By means of Gronwall’s inequality, we obtain for all t≥0 that

$$ \begin{aligned} E\left[\left(X_{t \wedge \tau_{\varepsilon}}\right)^{-m}\right] & \le x^{-m}e^{m |\underline b^{1}| t}. \end{aligned} $$

In particular, we obtain by Chebyshev’s inequality for our fixed time \({\mathcal {T}}>0\) that

$$\begin{aligned} P(\tau_{\varepsilon} < {\mathcal{T}}) &= P(X_{\tau_{\varepsilon} \wedge {\mathcal{T}}} \le \varepsilon) = P\left(\left(X_{\tau_{\varepsilon} \wedge {\mathcal{T}}}\right)^{-m} \ge \varepsilon^{-m}\right) \le \varepsilon^{m} E\left[(X_{\tau_{\varepsilon} \wedge {\mathcal{T}}})^{-m}\right]. \end{aligned} $$

Inserting (9) and letting ε tend to zero yields the claim. The proof of Proposition 1 is thus complete. □
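The last step of the proof combines (9) with Chebyshev's inequality to obtain \(P(\tau_{\varepsilon}<{\mathcal T}) \le \varepsilon^{m}x^{-m}e^{m|\underline b^{1}|{\mathcal T}}\). The following sketch evaluates this bound numerically to show how it vanishes as ε tends to zero. The function name `hitting_bound` and its argument names are hypothetical, and the code requires m>0 strictly, as the proof does.

```python
import math


def hitting_bound(eps, x, b0_lo, b1_lo, a1_hi, horizon):
    """Upper bound eps^m * x^(-m) * exp(m*|b1_lo|*horizon) for P(tau_eps < horizon)."""
    m = (2.0 * b0_lo - a1_hi) / a1_hi
    if m <= 0:
        raise ValueError("the estimate needs m > 0, i.e. 2*b0_lo > a1_hi")
    return (eps / x) ** m * math.exp(m * abs(b1_lo) * horizon)


if __name__ == "__main__":
    for eps in (1e-1, 1e-2, 1e-3):
        print(eps, hitting_bound(eps, x=1.0, b0_lo=1.0, b1_lo=-0.5, a1_hi=1.0, horizon=1.0))
```

For large ε the bound can exceed one and is then uninformative; only its decay as ε tends to zero matters for the argument.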

3 Dynamic programming

One of the key insights for Markov processes is the deep link between Markov processes and their expectations, on the one hand, and partial differential equations, on the other, as expressed by the Kolmogorov equation. In this section, we generalize this relation to the case with parameter uncertainty, i.e., we develop the relation of the nonlinear affine process to a nonlinear version of the Kolmogorov equation. The path we detail in this section uses dynamic programming and the results obtained in Nutz and van Handel (2013) and El Karoui and Tan (2013a). The key to dynamic programming is a certain stability property under conditioning and pasting. As the nonlinear affine processes we considered up to now always start from time t=0, we introduce the appropriate conditional formulations first.

For the remainder of the section, fix Θ as in (3) with associated \(a^{*}\) and \(b^{*}\) as in (4). Denote by \({\mathcal {A}}(t,x,\Theta)\) the semimartingale laws \(P\in \mathfrak {P}_{\text {sem}}^{\text {ac}}\), such that

  1. (i)

    \(P(X_{t}=x)=1\), and

  2. (ii)

    P is affine-dominated on (t,T] by Θ.

The following result yields measurability of the nonlinear affine process starting at t in ω(t).

Lemma 1

The set \(\big \{(\omega,t,P) \in \Omega \times [0,T]\times \mathfrak {P}(\Omega)\,\big |\, P\in {\mathcal {A}}(t,\omega (t),\Theta) \big \}\) is Borel.


Proof

By Theorem 2.6 in Neufeld and Nutz (2014), the set \(\mathfrak {P}_{\text {sem}}^{\text {ac}}\) is Borel, which proves that

$$\begin{aligned} & \ \left\{(\omega,t,P) \in \Omega\times [0,T]\times \mathfrak{P}_{\text{sem}}^{\text{ac}}\,\big|\, P(X_{t} =\omega(t)) =1 \right\} \end{aligned} $$

is Borel. Moreover, Theorem 2.6 in Neufeld and Nutz (2014) also grants the existence of a Borel-measurable map

$$(P,\widetilde\omega,s) \mapsto (\beta_{s}^{P}(\widetilde\omega),\alpha_{s}(\widetilde \omega))$$

such that \((\beta^{P},\alpha)\) are the differential characteristics of X under P. Therefore, we obtain the Borel measurability of the set

$$\begin{aligned} G:=\Big\{&(\omega,t,P,\widetilde\omega,s) \in \Omega \times [0,T]\times \mathfrak{P}_{\text{sem}}^{\text{ac}}\times\Omega \times[0,T]\, \Big|\, s>t,\ P(X_{t} =\omega(t))=1, \\ &\left(\beta^{P}_{s}(\widetilde\omega),\alpha_{s}(\widetilde\omega)\right) \in b^{*}(X_{s}(\widetilde\omega))\times a^{*}(X_{s}(\widetilde\omega))\Big\}. \end{aligned} $$

Applying Fubini’s theorem yields the Borel measurability of the set

$${\begin{aligned} G':=\left\{(\omega,t,P,\widetilde\omega) \in \Omega\times [0,T]\times \mathfrak{P}_{\text{sem}}^{\text{ac}}\times\Omega\,\Big|\,\int_{0}^{T} \mathbf{1}_{G}(\omega,t,P,\widetilde\omega,s)\,\mathbf{1}_{(t,T]}(s)\,ds =T-t\right\}. \end{aligned}} $$

Now, observe that

$$\begin{aligned} &\left\{(\omega,t,P) \in \Omega\times [0,T]\times \mathfrak{P}(\Omega)\,\big|\, P\in {\mathcal{A}}(t,\omega(t),\Theta)\right\}\\ &\quad= \ \left\{(\omega,t,P) \in \Omega\times [0,T]\times \mathfrak{P}_{\text{sem}}^{\text{ac}}\,\big|\,E^{P}[\mathbf{1}_{G^{\prime}}(\omega,t,P,\cdot)]=1\right\}. \end{aligned} $$

The right side is Borel measurable due to a monotone class argument as in Neufeld and Nutz (2014), Lemma 3.1. □

Lemma 2

Fix \((x,t) \in \mathbb {R}\times [0,T], P \in {\mathcal {A}}(t,x,\Theta)\) and a stopping time τ taking values in [t,T].

  1. (i)

    There exists a family of conditional probabilities \((P_{\omega})_{\omega \in \Omega}\) of P with respect to \({\mathscr {F}}_{\tau }\) such that \(P_{\omega } \in {\mathcal {A}}(\tau (\omega),\omega _{\tau (\omega)},\Theta)\) for P-a.e. \(\omega \in \Omega\).

  2. (ii)

    Assume that there exists a family of probability measures \((Q_{\omega})_{\omega \in \Omega}\) such that \(Q_{\omega } \in {\mathcal {A}}(\tau (\omega),\omega _{\tau (\omega)},\Theta)\) for P-a.e. \(\omega \in \Omega\), and the map \(\omega \mapsto Q_{\omega}\) is \({\mathscr F}_{\tau }\)-measurable. Then, the probability measure \(P\otimes Q\) defined by

    $$P\otimes Q(\,\cdot\,) =\int_{\Omega} Q_{\omega}(\,\cdot\,)\,P(d\omega) $$

    is an element of \({\mathcal {A}}(t,x,\Theta)\).


Proof

This follows directly from Neufeld and Nutz (2017), Theorem 2.1; see also Guo et al. (2017), Lemma 4.6. □

Now, fix a Borel-measurable function \(\psi :{\mathscr {O}} \to \mathbb {R}\) and define the value function \(v:[0,T]\times {\mathscr {O}}\to \mathbb {R}\) by

$$v(t,x):=\sup_{P\in {\mathcal{A}}(t,x,\Theta)} E^{P}[\psi (X_{T})]. $$

Using the above results, we obtain the following dynamic programming principle.

Proposition 2

Consider a proper family of nonlinear affine processes with state space \({\mathscr {O}}\). For any \((t,x) \in [0,T]\times {\mathscr {O}} \) and any stopping time τ taking values in [t,T], we obtain

$$v(t,x)= \sup_{P\in {\mathcal{A}}(t,x,\Theta)}E^{P}\left[v(\tau,X_{\tau})\right]. $$


Proof

The result follows from Theorem 2.1 in El Karoui and Tan (2013b) (with \(\overline {\mathcal {P}}_{t,x}={\mathcal {A}}(t,x,\Theta)\) in the notation of El Karoui and Tan (2013b)) by noting that analyticity is implied by measurability as shown in Lemma 1 and the required stability assumptions have been shown in Lemma 2. □
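As a sanity check of Proposition 2, consider a simple special case that is not taken from the text: assume \(\underline b^{1}=\bar b^{1}=0\), \(\underline a^{1}=\bar a^{1}=0\), \(\underline a^{0}>0\), and \(\psi(x)=x\). Then, for every \(P\in {\mathcal {A}}(t,x,\Theta)\), the drift rate satisfies \(\beta^{P}_{s}\in[\underline b^{0},\bar b^{0}]\) and \(\alpha_{s}\le \bar a^{0}\), so that \(M^{P}\) is a true martingale and

$$ v(t,x)=\sup_{P\in {\mathcal{A}}(t,x,\Theta)} E^{P}\left[x+\int_{t}^{T}\beta^{P}_{s}\,ds\right] = x+\bar b^{0}(T-t), $$

where the supremum is attained by a Brownian motion with constant drift \(\bar b^{0}\). For a stopping time τ with values in [t,T], the same argument gives

$$ \sup_{P\in {\mathcal{A}}(t,x,\Theta)} E^{P}\left[v(\tau,X_{\tau})\right] = \sup_{P\in {\mathcal{A}}(t,x,\Theta)} E^{P}\left[x+\int_{t}^{\tau}\beta^{P}_{s}\,ds+\bar b^{0}(T-\tau)\right] = x+\bar b^{0}(T-t), $$

which agrees with \(v(t,x)\), as the dynamic programming principle predicts.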

3.1 Continuity of the value function

In the following, we show the continuity of the value function v(t,x). To this end, introduce the constant

$$ \mathcal{K}=|\underline{b}^{0}|+|\underline{b}^{1}|+|\bar{b}^{0}|+|\bar{b}^{1}|+\bar{a}^{0}+\bar{a}^{1}, $$

which is finite by Assumption (3). The following inequality is the cornerstone of the results in this section.

Lemma 3

Consider a proper family of nonlinear affine processes with state space \({\mathscr {O}}\) and let q≥1. There exists an \(0<\varepsilon=\varepsilon(q)<1\) such that for all \(0<h\le\varepsilon\), all \(t\in[0,T-h]\) and all \(x\in {\mathscr {O}}\), we have

$$\sup_{P \in {\mathcal{A}}(t,x,\Theta)}E^{P}\left[\sup_{0\leq s\leq h} |X_{t+s}-x|^{q}\right]\leq C\left(h^{q}+h^{q/2}\right) $$

for some constant C=C(x,q)>0 which may depend on x, but is independent of h and t.


Proof

Let \(q\in[1,\infty)\). Consider \(P \in {\mathcal {A}}(t,x,\Theta)\) and denote by \(X_{s}=x+B^{P}_{s}+M^{P}_{s}, s \ge t\), the semimartingale representation of X with predictable finite-variation part \(B^{P}\) and local martingale \(M^{P}\). In the following, we will repeatedly use the elementary inequality that

$$ \begin{aligned} (a_{1}+a_{2})^{q} \le 2^{q-1}(a_{1}^{q} + a_{2}^{q}) \end{aligned} $$

and denote \(c_{q}:=2^{q-1}\).

The Burkholder–Davis–Gundy (BDG) inequality (see Theorem 4.1 in Revuz and Yor (1999)) together with Jensen’s inequality and (11) yields for any \(h\in[0,T-t]\) that

$$ \begin{aligned} &E^{P}\left[\sup_{0\leq s \leq h}|X_{t+s}-x|^{q}\right] \leq c_{q} E^{P}\left[\sup_{0\leq s \leq h}\left|M^{P}_{t+s}\right|^{q}\right] + c_{q} E^{P}\left[\sup_{0\leq s \leq h}\left|B^{P}_{t+s}\right|^{q}\right]\\ &\qquad\leq c_{q} \widetilde C_{q} E^{P}\left[\left(\int_{t}^{t+h} \alpha_{u}\,du\right)^{\nicefrac{q}{2}}\right] + c_{q}E^{P}\left[\left(\int_{t}^{t+h} \left|\beta^{P}_{u}\right|\,du\right)^{q}\right]. \end{aligned} $$

Note that the constant \(\widetilde C_{q} \ge 1\) from the BDG inequality depends on q only.

Let \(\widetilde {\mathcal {K}}:=1+\mathcal K\), where \(\mathcal K\) is the constant defined in (10). Choose any 0<ε=ε(q)<1 small enough such that it satisfies

$$ \begin{aligned} 1-c_{q}^{3}\widetilde C_{q} \widetilde{\mathcal{K}}^{q} \left(\varepsilon^{q}+\varepsilon^{q/2}\right)>0. \end{aligned} $$

Let us verify that such a fixed ε satisfies the desired property: by the very definition of \(P \in {\mathcal {A}}(t,x,\Theta)\), we have on [t,t+h] that both α and |βP| are bounded from above by \(\widetilde {\mathcal {K}}+\widetilde {\mathcal {K}}\sup _{0\leq s \leq h}|X_{t+s}|\ge 1\) since they are affine dominated. This, together with Jensen’s inequality, yields that

$$ {\begin{aligned} E^{P}\left[\left(\int_{t}^{t+h} \alpha_{u}\,du\right)^{\nicefrac{q}{2}}\right] &\leq h^{q/2}E^{P}\left[\left(\widetilde{\mathcal{K}} + \widetilde{\mathcal{K}} \sup_{0\leq s \leq h}|X_{t+s}|\right)^{\nicefrac{q}{2}}\right]\\ &\leq h^{q/2}E^{P}\left[\left(\widetilde{\mathcal{K}} + \widetilde{\mathcal{K}} \sup_{0\leq s \leq h}|X_{t+s}|\right)^{q}\right]\\ &\leq h^{q/2}c_{q}\left(\widetilde{\mathcal{K}}^{q} + \widetilde{\mathcal{K}}^{q} E^{P}\left[\left(\sup_{0\leq s \leq h}|X_{t+s}|\right)^{q}\right]\right)\\ &\leq h^{q/2}c_{q}^{2}\left(\widetilde{\mathcal{K}}^{q} + \widetilde{\mathcal{K}}^{q} |x|^{q} + \widetilde{\mathcal{K}}^{q} E^{P}\left[\left(\sup_{0\leq s \leq h}|X_{t+s}-x|\right)^{q}\right]\right). \end{aligned}} $$

In a similar way, we obtain

$${\begin{aligned} E^{P}\left[\left(\int_{t}^{t+h} |\beta^{P}_{u}|\,du\right)^{q}\right] &\leq h^{q}E^{P}\left[\left(\widetilde{\mathcal{K}}+\widetilde{\mathcal{K}}\sup_{0\leq s \leq h}|X_{t+s}|\right)^{q}\right]\\ &\leq h^{q} c_{q}^{2}\left(\widetilde{\mathcal{K}}^{q} + \widetilde{\mathcal{K}}^{q} |x|^{q} + \widetilde{\mathcal{K}}^{q} E^{P}\left[\left(\sup_{0\leq s \leq h}|X_{t+s}-x|\right)^{q}\right]\right). \end{aligned}} $$

Inserting these inequalities into (12), and using \(h\le\varepsilon\) and \(\widetilde C_{q} \ge 1\), we obtain

$$ {\begin{aligned} &E^{P}\left[\sup_{0\leq s \leq h}|X_{t+s}-x|^{q}\right]\hspace{1.5cm}\\ &\quad \leq c_{q}^{3} \, \widetilde C_{q} \widetilde{\mathcal{K}}^{q}\left(h^{q/2} + h^{q}\right) E^{P}\left[\sup_{0\leq s \leq h}|X_{t+s}-x|^{q}\right] +c_{q}^{3} \, \widetilde C_{q} \widetilde{\mathcal{K}}^{q} (1+|x|^{q})\left(h^{q/2} + h^{q}\right) \\ &\quad \leq c_{q}^{3} \, \widetilde C_{q} \widetilde{\mathcal{K}}^{q}\left(\varepsilon^{q/2} + \varepsilon^{q}\right) E^{P}\left[\sup_{0\leq s \leq h}|X_{t+s}-x|^{q}\right] +c_{q}^{3} \, \widetilde C_{q} \widetilde{\mathcal{K}}^{q} (1+|x|^{q})\left(h^{q/2} + h^{q}\right). \end{aligned}} $$

Since \(h\le\varepsilon\) and we chose 0<ε<1 such that (13) holds, we obtain for the constant \(C:=\frac {c_{q}^{3} \, \widetilde C_{q} \widetilde {\mathcal {K}}^{q} (1+|x|^{q})}{1-c_{q}^{3} \, \widetilde C_{q} \widetilde {\mathcal {K}}^{q}(\varepsilon ^{q/2} + \varepsilon ^{q})}>0\), being independent of t,h,P, that

$$E^{P}\left[\sup_{0\leq s \leq h}|X_{t+s}-x|^{q}\right] \leq C\,\left(h^{q/2}+h^{q}\right). $$

As \(P \in {\mathcal {A}}(t,x,\Theta)\) was chosen arbitrarily, the claim is proven. □

Remark 1

The proof of Lemma 3 actually shows that for the corresponding 0<ε<1, we have for all \(0<h\le\varepsilon\), all \(t\in[0,T-h]\), and all \(x\in \mathbb {R}\), that the local martingale part \((M^{P}_{t+s})_{0\leq s\leq h}\) restricted to [t,t+h] is a true martingale for any \(P \in {\mathcal {A}}(t,x,\Theta)\).

Lemma 4

Consider a proper family of nonlinear affine processes with state space \({\mathscr {O}}\) and let \(\psi :{\mathscr {O}}\to \mathbb {R}\) be Lipschitz continuous with Lipschitz constant \(L_{\psi}\). Then, the value function

$$v(t,x):=\sup_{P \in {\mathcal{A}}(t,x,\Theta)}E^{P}\left[\psi(X_{T})\right] $$

is jointly continuous. In particular, v(t,·) is Lipschitz continuous with constant \(L_{\psi}\) and v(·,x) is locally 1/2-Hölder continuous.


Proof

For the Lipschitz continuity of v(t,·), observe that for any t

$$|v(t,x)-v(t,y)| \leq \sup_{P \in {\mathcal{A}}(t,x,\Theta)}E^{P}\left[|\psi(X_{T})-\psi(y-x+X_{T})|\right]\leq L_{\psi}|y-x|. $$

For the local 1/2-Hölder continuity, let \(t\in[0,T)\) and let \(0\le u\le T-t\) be small enough. Then, the dynamic programming principle derived in Proposition 2, the Lipschitz continuity of v(t,·), and Lemma 3 yield

$$\begin{aligned} |v(t,x)-v(t+u,x)| &=\left|\sup_{P \in {\mathcal{A}}(t,x,\Theta)}E^{P}\left[v(t+u,X_{t+u})-v(t+u,x)\right]\right|\\ &\leq L_{\psi} \sup_{P \in {\mathcal{A}}(t,x,\Theta)}E^{P}\left[|X_{t+u}-x|\right]\\ &\leq L_{\psi} C(x,1)\,\left(u^{1/2}+u\right) \end{aligned} $$

with constant C(x,1) from Lemma 3. Finally, consider a sequence \((t_{n},x_{n})\) converging to (t,x). Then,

$$\begin{aligned} | v(t_{n},x_{n}) - v(t,x) | & \le |v(t_{n},x_{n})-v(t_{n},x)| + | v(t_{n},x)-v(t,x)| \\ & \le L_{\psi} |x_{n} - x| + L_{\psi} C(x,1) \left(|t_{n}-t|^{\nicefrac{1}{2}} + |t_{n}-t| \right). \end{aligned} $$

Letting n go to infinity yields the result. □

4 The Kolmogorov equation

In this section, we provide the link between the nonlinear affine process and the associated nonlinear Kolmogorov equation. More precisely, we relate the nonlinear affine process to a (fully) nonlinear partial differential equation (PDE). This is achieved by a probabilistic construction involving an optimal control problem on the canonical space of continuous paths where the controls are laws of affine-dominated semimartingales. To this end, note that the affine process X given in Eq. 1 is uniquely characterized by its infinitesimal generator,

$$ \begin{aligned} {\mathscr{L}}^{\theta} f(x)= \left(b^{0} + b^{1}x\right) \partial_{x} f(x)+ {\frac{1}{2}} \left(a^{0} + a^{1} x^{+}\right)\partial_{xx}f(x). \end{aligned} $$

This is equivalent to \(f(X_{t})-f(x)-\int _{0}^{t} {\mathscr {L}}^{\theta } f(X_{s})ds, \ 0 \le t \le T\) solving the martingale problem, see, e.g., Theorem 21.7 in Kallenberg (2002).

Consider the state space \({\mathscr {O}}\) which will be either \(\mathbb {R}, \mathbb {R}_{\ge 0}\) or \(\mathbb {R}_{>0}\). Fix \(\psi :{\mathscr {O}} \to \mathbb {R}\) and consider the fully nonlinear PDE

$$ \left\{\begin{array}{rl}-\partial_{t} v(t,x)-G\left(x,\partial_{x}v(t,x),\partial_{xx}v(t,x)\right) =0 &\quad \text{on}[0,T)\times {\mathscr{O}}, \\ v(T,x)=\psi(x) \quad &\quad x \in {\mathscr{O}}, \end{array}\right. $$

where \(G:{\mathscr {O}}\times \mathbb {R} \times \mathbb {R} \to \mathbb {R}\) is defined by

$$ \begin{aligned} G(x,p,q):=\sup_{(b^{0},b^{1},a^{0},a^{1})\in \Theta}\left\{(b^{0}+b^{1}x)p+\frac{1}{2}(a^{0}+a^{1}x^{+}) q\right\}. \end{aligned} $$

The function −G satisfies the degenerate ellipticity condition and, as Θ is compact, it is also continuous. Observe that the PDE defined in (16) can be seen as a nonlinear affine PDE, since for \(\theta :=(b^{0},b^{1},a^{0},a^{1})\) it is of the form

$$\left\{\begin{array}{rl}-\partial_{t} v(t,x)-\sup_{\theta \in \Theta}{\mathscr{L}}^{\theta} v(t,x)=0 \quad &\text{on}[0,T)\times {\mathscr{O}}, \\ v(T,x)=\psi(x) & x \in {\mathscr{O}}. \end{array}\right. $$
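Since the expression inside the supremum in (17) is affine in each of the four parameters, the supremum over a box-shaped Θ can be evaluated term by term at the interval endpoints. The following Python sketch illustrates this under the assumption that Θ is a product of four intervals; the dictionary encoding of Θ, the numerical bounds, and the function names `sup_linear`, `hamiltonian` and `hamiltonian_bruteforce` are illustrative choices, not notation from the article.

```python
import itertools


def sup_linear(bounds, coeff):
    """Supremum of c * coeff over c in the interval bounds = (lower, upper)."""
    lower, upper = bounds
    return max(lower * coeff, upper * coeff)


def hamiltonian(x, p, q, theta):
    """G(x, p, q) for a box-shaped parameter set (assumed form of Theta)."""
    x_plus = max(x, 0.0)
    # Each summand depends on one parameter only, so the suprema decouple.
    return (sup_linear(theta["b0"], p)
            + sup_linear(theta["b1"], x * p)
            + sup_linear(theta["a0"], 0.5 * q)
            + sup_linear(theta["a1"], 0.5 * x_plus * q))


def hamiltonian_bruteforce(x, p, q, theta):
    """Same quantity, obtained by enumerating all 16 corners of the box."""
    x_plus = max(x, 0.0)
    corners = itertools.product(theta["b0"], theta["b1"], theta["a0"], theta["a1"])
    return max((b0 + b1 * x) * p + 0.5 * (a0 + a1 * x_plus) * q
               for b0, b1, a0, a1 in corners)


# Hypothetical parameter bounds, chosen only for illustration.
theta = {"b0": (0.0, 0.1), "b1": (-1.0, -0.5), "a0": (0.01, 0.04), "a1": (0.0, 0.02)}
g_value = hamiltonian(1.0, 1.0, 1.0, theta)
```

The brute-force variant is included only as a cross-check of the term-by-term evaluation.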

We write \(C_{b}^{2,3}([0,T)\times {\mathscr {O}})\) for the set of functions on \([0,T)\times {\mathscr {O}}\) having bounded continuous derivatives up to the second and third order in t and x, respectively. An upper semicontinuous function u on \([0,T)\times {\mathscr {O}}\) will be called a viscosity subsolution of (16) if u(T,·)≤ψ(·) and

$$-\partial_{t} \varphi(t,x)-G\left(x,D_{x} \varphi(t,x), D^{2}_{xx} \varphi(t,x)\right)\leq 0 $$

whenever \(\varphi \in C_{b}^{2,3}([0,T)\times {\mathscr {O}})\) is such that φ≥u on \([0,T)\times {\mathscr {O}}\) and φ(t,x)=u(t,x). The definition of a viscosity supersolution is obtained by reversing the inequalities and the semicontinuity. Finally, a continuous function is a viscosity solution if it is both a sub- and a supersolution.

We obtain a stochastic representation for the nonlinear affine PDE (16).

Theorem 1

Consider a proper family of nonlinear affine processes with state space \({\mathscr {O}}\) and let \(\psi :{\mathscr {O}} \to \mathbb {R}\) be Lipschitz continuous. Then,

$$v(t,x):= \sup_{P\in {\mathcal{A}}(t,x,\Theta)}E^{P}\left[\psi(X_{T})\right], \quad x \in {\mathscr{O}} $$

is a viscosity solution of the nonlinear PDE in (16).


The proof essentially follows the well-known standard arguments in stochastic control, see, e.g., the proof of Proposition 5.4 in Neufeld and Nutz (2017).

By Lemma 4, v(t,x) is continuous on \([0,T)\times \mathbb {R}\), and we have v(T,x)=ψ(x) by the definition of v. We show that v is a viscosity subsolution of the nonlinear affine PDE defined in (16); the supersolution property is proved similarly. We remark that in the subsequent lines within this proof, C>0 is a constant whose values may change from line to line.

Let \((t,x) \in [0,T)\times \mathbb {R}\) and let \(\varphi \in C^{2,3}_{b}\left ([0,T)\times \mathbb {R}\right)\) be such that φ≥v and φ(t,x)=v(t,x). By the dynamic programming principle obtained in Proposition 2, we have for any 0<u<T−t that

$$ \begin{aligned} 0&=\sup_{P\in {\mathcal{A}}(t,x,\Theta)}E^{P}\left[v(t+u,X_{t+u})-v(t,x)\right] \\ &\leq \sup_{P\in {\mathcal{A}}(t,x,\Theta)}E^{P}\left[\varphi(t+u,X_{t+u})-\varphi(t,x)\right]. \end{aligned} $$

Fix any \(P\in {\mathcal {A}}(t,x,\Theta)\), denote as above by \((\beta^{P},\alpha)\) the differential characteristics of the continuous semimartingale X under P, and denote by \(M^{P}\) the P-local martingale part of the P-semimartingale X. Then, Itô’s formula yields

$$ \begin{aligned} \varphi (t+u,&X_{t+u}) - \varphi(t,x) = \int_{0}^{u} \partial_{t} \varphi(t+s,X_{t+s}) \,ds + \int_{0}^{u} \partial_{x} \varphi(t+s,X_{t+s}) \,dM^{P}_{t+s} \\ & + \int_{0}^{u} \partial_{x} \varphi(t+s,X_{t+s}) \beta_{t+s}^{P} \,ds + \frac{1}{2} \int_{0}^{u} \partial_{xx} \varphi (t+s,X_{t+s}) \alpha_{t+s} \,ds. \end{aligned} $$

As \(\varphi \in C_{b}^{2,3}([0,T)\times \mathbb {R})\), \(\partial _{x} \varphi \) is uniformly bounded. Thus, by Remark 1, we see that for small enough 0<u<T−t the local martingale part in (19) is in fact a true martingale, starting at 0. In particular, its expectation vanishes. The next step is to estimate the expectation of the other terms. In this regard, note that

$$ \begin{aligned} &E^{P}\left[ \int_{0}^{u} \partial_{x} \varphi(t+s,X_{t+s}) \beta^{P}_{t+s}\, ds\right] \\ &\quad \leq \ \int_{0}^{u} E^{P} \left[ \left| \partial_{x} \varphi(t+s,X_{t+s}) - \partial_{x} \varphi (t,x)\right| \, \left|\beta^{P}_{t+s}\right| + \partial_{x} \varphi(t,x) \beta^{P}_{t+s} \right]\, ds. \end{aligned} $$

Since \(\varphi \in C^{2,3}_{b}\), \(\partial_{x}\varphi\) is Lipschitz. Hence, we obtain with the constant \({\mathcal {K}}\) from Eq. 10 together with Lemma 3 that for small enough u,

$$ \begin{aligned} &\int_{0}^{u} E^{P} \left[ \left| \partial_{x} \varphi(t+s,X_{t+s}) - \partial_{x} \varphi (t,x)\right| \cdot \left|\beta_{t+s}^{P}\right| \right]\, ds \\ &\quad \leq \ C\int_{0}^{u} E^{P} \left[\left(s+\sup_{0\leq v\leq u}|X_{t+v}-x|\right) \cdot \left|\beta^{P}_{t+s}\right| \right]\, ds \\ &\quad \leq \ C\int_{0}^{u} E^{P} \left[\left(s+\sup_{0\leq v\leq u}|X_{t+v}-x|\right)\, \left(\mathcal{K} +\mathcal{K} \sup_{0\leq v\leq u}\left|X_{t+v}\right|\right)\right]\, ds \\ &\quad \leq \ C\int_{0}^{u} E^{P} \left[ \left(s+\sup_{0\leq v\leq u}\left|X_{t+v}-x\right|\right) \left(\mathcal{K} +\mathcal{K} |x| +\mathcal{K} \sup_{0\leq v\leq u}\left|X_{t+v}-x\right|\right)\right] ds \\ &\quad \leq \ C\left(u^{3} +u^{5/2}+ u^{2} + u^{3/2}\right). \end{aligned} $$

Inserting (21) into (20) yields

$$ \begin{aligned} \lefteqn{ E^{P} \left[\int_{0}^{u} \partial_{x} \varphi(t+s,X_{t+s}) \beta^{P}_{t+s}\, ds\right]} \hspace{2cm} \\ & \leq \int_{0}^{u} E^{P}\left[\partial_{x} \varphi(t,x) \,\beta^{P}_{t+s}\right]\, ds + C\left(u^{3} +u^{5/2}+ u^{2} + u^{3/2}\right). \end{aligned} $$

The same argument applied to \(\partial_{xx}\varphi\) leads to

$$ \begin{aligned} \lefteqn{E^{P}\left[\int_{0}^{u} \partial_{xx} \varphi(t+s,X_{t+s})\, \alpha_{t+s}\, ds\right]} \hspace{2cm} \\ & \leq \int_{0}^{u} E^{P}\left[\partial_{xx} \varphi(t,x)\, \alpha_{t+s}\right]\, ds + C\left(u^{3} +u^{5/2}+ u^{2} + u^{3/2}\right). \end{aligned} $$

Moreover, by a similar calculation, we have

$$ \begin{aligned} &E^{P}\left[ \int_{0}^{u} \partial_{t} \varphi(t+s,X_{t+s})\, ds\right] \\ &\quad \leq \ \int_{0}^{u} \partial_{t} \varphi (t,x) \,ds + \int_{0}^{u} E^{P} \left[\left| \partial_{t} \varphi(t+s,X_{t+s}) - \partial_{t} \varphi (t,x)\right| \right] \, ds \\ &\quad \leq \ \int_{0}^{u} \partial_{t} \varphi (t,x) \,ds + C\int_{0}^{u} E^{P} \left[s+\sup_{0\leq v\leq u}\left|X_{t+v}-x\right| \right]\, ds \\ &\quad \leq \ \int_{0}^{u} \partial_{t} \varphi (t,x) \,ds + C\left(u^{2} + u^{3/2}\right). \end{aligned} $$

As above, we write \(\theta :=(b^{0},b^{1},a^{0},a^{1})\) for an element in Θ. Then, taking expectations in (19) and using (20)–(24) yields

$$ \begin{aligned} \lefteqn{E^{P}\left[\varphi (t+u,X_{t+u})- \varphi(t,x)\right] \leq C\left(u^{3} +u^{5/2}+ u^{2} + u^{3/2}\right)} \quad \\ &+ \int_{0}^{u} \left(\partial_{t} \varphi (t,x) + E^{P}\left[ \partial_{x} \varphi(t,x) \,\beta^{P}_{t+s} + \frac{1}{2}\partial_{xx} \varphi(t,x)\, \alpha_{t+s}\right] \right) \,ds \ \\ &\le C\left(u^{3} +u^{5/2}+ u^{2} + u^{3/2}\right) + u \partial_{t} \varphi (t,x) \\ &+ \int_{0}^{u} E^{P}\left[\sup_{\theta \in \Theta} \left\{(b^{0}+b^{1} X_{t+s})\,\partial_{x} \varphi(t,x) + \frac{1}{2} \left(a^{0}+a^{1} X_{t+s}^{+}\right)\,\partial_{xx} \varphi(t,x) \right\}\right]\,ds. \end{aligned} $$

Here, the supremum turns out to be \(G(X_{t+s},\partial_{x}\varphi(t,x),\partial_{xx}\varphi(t,x))\). Note that by the very definition of G,

$$\begin{aligned} G(X_{t+s},p,q) & \le G(x,p,q) + \sup_{\theta \in \Theta}\left\{ |b^{1}| \, |X_{t+s}-x| \, | p| + |a^{1}| \, |X_{t+s}-x| \, | q| \right\}. \end{aligned} $$

Therefore, by using that \(\varphi \in C^{2,3}_{b}\), the definition of the constant \({\mathcal {K}}\) in (10), and Lemma 3, we have

$$ \begin{aligned} &\int_{0}^{u} E^{P}\left[ G(X_{t+s},\partial_{x} \varphi(t,x),\partial_{xx}\varphi(t,x)) \right]\,ds\hspace{2cm} \\ & \quad\leq \ u G(x,\partial_{x} \varphi(t,x),\partial_{xx}\varphi(t,x)) + u C{\mathcal{K}} E^{P}\left[\sup_{0\leq s\leq u}|X_{t+s}-x|\right] \\ & \quad \leq \ u G(x,\partial_{x} \varphi(t,x),\partial_{xx}\varphi(t,x)) + C{\mathcal{K}} \left(u^{2} +u^{3/2}\right). \end{aligned} $$

Combining (25)–(26) yields

$$ \begin{aligned} E^{P}\left[\varphi (t+u,X_{t+u})- \varphi(t,x)\right] &\leq \ u \partial_{t} \varphi (t,x) + u G(x,\partial_{x} \varphi(t,x),\partial_{xx}\varphi(t,x)) \\ & \ + C\left(u^{3} +u^{5/2}+ u^{2} + u^{3/2}\right). \end{aligned} $$

for some constant C>0 which is independent of P. As the choice of \(P \in {\mathcal {A}}(t,x,\Theta)\) was arbitrary, we deduce from (18) that

$$ \begin{aligned} 0 &\leq \sup_{P\in {\mathcal{A}}(t,x,\Theta)}E^{P}\left[\varphi(t+u,X_{t+u})-\varphi(t,x)\right]\\ &\leq u \partial_{t} \varphi (t,x) + u G(x,\partial_{x} \varphi(t,x),\partial_{xx}\varphi(t,x)) + C\left(u^{3} +u^{5/2}+ u^{2} + u^{3/2}\right). \end{aligned} $$

Dividing (28) by −u and then letting u go to zero, we obtain that

$$- \partial_{t} \varphi (t,x) - G(x,\partial_{x} \varphi(t,x),\partial_{xx}\varphi(t,x)) \leq 0, $$

which proves that v is indeed a viscosity subsolution as desired. □

4.1 Uniqueness

Uniqueness in our framework is not covered by standard arguments as in Fleming and Soner (2006) or Crandall et al. (1992), since the diffusion coefficient does not satisfy a global Lipschitz condition. This is a well-known difficulty already discussed in Feller (1951). To overcome this, we have to distinguish the two cases where the state space is either \(\mathbb {R}\) or \(\mathbb {R}_{>0}\).

We begin with the general case, which covers the nonlinear Vasiček–CIR model discussed in detail in Section 6.1. In this case we do not decide a priori whether a Vasiček or a Cox–Ingersoll–Ross (CIR) model takes place and therefore consider the full space \(\mathbb {R}\) as state space. To achieve this, we assume throughout a non-vanishing volatility, i.e., \(\underline a^{0} >0\). When the process reaches zero from above, this avoids that a deterministic behaviour on \(\mathbb {R}_{<0}\) takes place and that singularities arise. On the other hand, we do not need any assumptions on \(a^{1}\).

Proposition 3

Assume that \(\underline a^{0}>0\) and let \({\mathscr {O}}=\mathbb {R}\). Then, for any given continuous and bounded function \(\psi :\mathbb {R} \to \mathbb {R}\), the nonlinear PDE introduced in (16)–(17) admits at most one viscosity solution v(t,x) on \([0,T]\times \mathbb {R}\) satisfying the terminal conditions

$$\begin{aligned} v(T,x)&=\psi(x), \quad x\in \mathbb{R}. \end{aligned} $$


Uniqueness in this case follows from the observation that the coefficients are Lipschitz once \(\underline a^{0}>0\). Indeed, following the standard procedure detailed in Section V.9 in Fleming and Soner (2006) allows one to extend the uniqueness results from Corollary V.8.1 therein to an unbounded domain as considered here. □

On the other hand, if we consider the case where \(\mathbb {R}_{>0}\) is the state space, we will necessarily require \(\underline a^{0} = \bar a^{0} = 0.\) Then, the (bounds on the) diffusion coefficient no longer satisfy a global Lipschitz property and the standard methodology cannot be applied. To the best of our knowledge, only Costantini et al. (2012) and Amadori (2007) treat this setting; we apply the techniques from the second article.

In this regard, let

$$ \begin{aligned} h_{\epsilon}(x) &= \left\{\begin{array}{ll} \max\left(\frac 1 {\log(x)}, 1 \right) & \epsilon=0 \\ \max\left(\frac 1 {x^{\epsilon}},1 \right) & \text{otherwise.} \end{array}\right. \end{aligned} $$

Proposition 4

Assume that \(\underline a^{0}=\bar a^{0}=0, \underline b^{0} \ge \nicefrac {\bar a^{1}} 2>0\), let \({\mathscr {O}}=\mathbb {R}_{>0}\), and consider a Lipschitz-continuous \(\psi :{\mathscr {O}} \to \mathbb {R}\).

  1. (i)

    Then, (16)–(17) has one and only one viscosity solution v on \([0,T]\times {\mathscr {O}}\) satisfying

    $$ \begin{aligned} \sup_{(t,x) \in [0,T]\times {\mathscr{O}}} \frac{|v(t,x)|}{1+x} < \infty. \end{aligned} $$
  2. (ii)

    Let \(\epsilon = \nicefrac {2 \underline b^{0}}{\bar a^{1}} -1 >0\) and suppose that there are \(\rho, \rho^{\prime}>0\) and \(\epsilon^{\prime}\in [0,\epsilon)\) such that \(\frac {|\psi (x)|}{1+ |x|}\) is bounded on (0,∞) and

    $$ \begin{aligned} x \mapsto \frac{|\psi(x)|}{h_{\epsilon^{\prime}}(x)} \end{aligned} $$

    is bounded from above, respectively, in the sets (ρ,∞) and (0,ρ′). Then, (16)–(17) has one and only one viscosity solution v(t,x) on \([0,T]\times {\mathscr {O}}\) satisfying

    $$ \begin{aligned} & \limsup_{x \to \infty} \sup_{t \in [0,T]} \frac{|v(t,x)|}{x^{2}} = 0, \quad {\lim}_{x \to 0} \sup_{t \in [0,T]} \frac{|v(t,x)|}{h_{\epsilon}(x)} = 0. \end{aligned} $$


Positivity follows readily from Proposition 1. This allows us to treat the nonlinear PDE in (16)–(17) on the state space \({\mathscr {O}}=\mathbb {R}_{>0}\). The key for uniqueness is that on compact subsets of \(\mathbb {R}_{>0}\) the square root is Lipschitz continuous.

To begin with, note that existence of a solution under the Lipschitz property of ψ follows from Theorem 1. Furthermore, Theorem 4 in Amadori (2007) yields the desired uniqueness in our case. We recall this result together with its assumptions in Appendix 2. The claim then follows from Theorem 2. □

Remark 2

Already, the results in the case without uncertainty in Costantini et al. (2012) show that these results do not generalize to \(\mathbb {R}_{\ge 0}\), because then the Lipschitz property on compact subsets which is crucially used in the proof will no longer be satisfied. Moreover, Example 1 in Amadori (2007) shows that the condition \(\underline b^{0} \ge \nicefrac {\bar a^{1}} 2>0\) is indeed necessary for uniqueness.

We now illustrate the established results with some first motivating examples. We begin with classical affine processes under parameter uncertainty, leading to nonlinear affine processes. Thereafter, we show how to extend beyond this case and introduce classes of nonlinear affine processes which do not have a classical counterpart.

4.2 The nonlinear Vasiček model

The first example of an affine model is the so-called Vasiček model, see, for example, Filipović (2009). It is a Gaussian Ornstein–Uhlenbeck process and is obtained as the strong solution of the SDE in Eq. 1 by considering \(a^{1}=0\). Introducing parameter uncertainty, we arrive at the nonlinear affine process with \(\left [\underline a^{1},\bar a^{1}\right ]=\{0\}\). We call this case the nonlinear Vasiček model.

While in the case with no parameter uncertainty, this model can be characterized efficiently by its Fourier transforms and the associated Riccati equation, this is no longer possible here (except for the special case where \(\underline b^{1} = \bar b^{1}\), see Remark 4 below) and one has to rely on numerical techniques. In one dimension, this does not pose a problem, see, for example, Heider (2010). To illustrate this, we solve the equation for the simplest payoff, \(f(x)=e^{x}\). The result is shown in Fig. 1.

Fig. 1

This figure shows the solution of the nonlinear Kolmogorov equation for the nonlinear Vasiček model with boundary condition \(f(x)=e^{x}\). The dashed line shows the solutions for the Vasiček model with parameter set \((\overline b_{0}, \underline b_{1}, \overline a_{0}) \) (on \(\mathbb {R}_{<0}\)) and \((\overline b_{0}, \overline b_{1}, \overline a_{0}) \) (on \(\mathbb {R}_{\ge 0}\)) and illustrates the nonlinearity in the solution due to the parameter uncertainty. Note that when x is positive, the curves overlap in the figure.
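As a rough illustration of how the nonlinear Kolmogorov equation can be solved numerically in one dimension, the following Python sketch applies an explicit finite-difference scheme to the nonlinear Vasiček case. It is not claimed to be the method behind Fig. 1, which the text does not specify. The parameter bounds, the grid, the truncation of the state space to [−x_max, x_max] with boundary values frozen at the payoff, and the name `solve_nonlinear_vasicek` are all assumptions made for illustration.

```python
import math

import numpy as np


def solve_nonlinear_vasicek(payoff, theta, T=0.5, x_max=2.0, n_x=401):
    """Explicit scheme for -v_t - G(x, v_x, v_xx) = 0 with v(T, .) = payoff.

    theta maps "b0", "b1", "a0" to (lower, upper) bounds; a1 = 0 is assumed.
    Returns the grid and the approximation of v(0, .).
    """
    x = np.linspace(-x_max, x_max, n_x)
    dx = x[1] - x[0]
    a0_hi = theta["a0"][1]
    # Number of time steps chosen so that dt * a0_hi / dx**2 <= 0.9.
    n_t = max(1, math.ceil(T * a0_hi / (0.9 * dx**2)))
    dt = T / n_t
    v = np.array(payoff(x), dtype=float)  # terminal condition (copied)
    xi = x[1:-1]                          # interior grid points
    for _ in range(n_t):
        p = (v[2:] - v[:-2]) / (2.0 * dx)               # central v_x
        q = (v[2:] - 2.0 * v[1:-1] + v[:-2]) / dx**2    # central v_xx
        # Supremum over the box, evaluated term by term at the endpoints.
        g = (np.maximum(theta["b0"][0] * p, theta["b0"][1] * p)
             + np.maximum(theta["b1"][0] * xi * p, theta["b1"][1] * xi * p)
             + 0.5 * np.maximum(theta["a0"][0] * q, theta["a0"][1] * q))
        v[1:-1] += dt * g  # step backwards in time; end points stay frozen
    return x, v


# Hypothetical bounds; theta_bar collapses every interval to its upper end.
theta = {"b0": (0.0, 0.05), "b1": (-0.5, -0.1), "a0": (0.02, 0.04)}
theta_bar = {key: (hi, hi) for key, (lo, hi) in theta.items()}
x, v_unc = solve_nonlinear_vasicek(np.exp, theta)
_, v_bar = solve_nonlinear_vasicek(np.exp, theta_bar)
```

With the bounds chosen here the central differences keep the scheme monotone, because the grid spacing is small relative to the ratio of the lower volatility bound to the maximal drift; for other parameters an upwind discretisation of the drift may be needed. Values near the ends of the grid are affected by the frozen boundary and should be disregarded.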

This example combines and encompasses the following two well-known nonlinear processes: g-Brownian motion and G-Brownian motion, see Peng (1997) and Peng (2007a), for example, and Neufeld and Nutz (2017) for the case with jumps. In the following, we will also elaborate on the case where explicit solutions can be obtained, see Proposition 5 and Remark 4.

4.3 The nonlinear CIR model

While the Gaussianity of the Vasiček model immediately implies that the process becomes negative with positive probability, this is inappropriate for various applications, e.g., in credit risk. There, the considered affine process models an intensity, which by definition has to be non-negative. Also, positive interest rates were a requirement before the recent crises in 2008–2010. The Cox–Ingersoll–Ross (CIR) model serves as an affine model with state space \(\mathbb {R}_{\ge 0}\) and therefore satisfies these needs. It is obtained by choosing \(a^{0}=0\) and \(a^{1}>0\) in the SDE in (1).

The CIR model under parameter uncertainty, which we call the nonlinear CIR model, is obtained by considering the state space \({\mathscr {O}}=\mathbb {R}_{>0}\) and assuming that \(\underline a^{0}=\bar a^{0}=0, \underline b^{0} \ge \nicefrac {\bar a^{1}} 2>0\). It is remarkable how far-reaching the positivity of X will be: indeed, a first look at the nonlinear Kolmogorov equation already reveals that for increasing and convex functions the supremum in Eq. 17 will be attained by the upper bounds of the coefficients.

The Fourier transform, which lacks monotonicity (as \(e^{iux}\) does), turns out to lose tractability in the nonlinear case considered here. The Laplace transform does not suffer from this problem and we show in the following that in important special cases it can be computed explicitly. However, inversion techniques can no longer be applied in the nonlinear setting and the Laplace transform merely serves as a prime example of an increasing convex function for which the nonlinear expectations can be computed explicitly in special cases (and for decreasing concave functions of course).

Remark 3

In principle, the nonlinear CIR model can be extended to the whole space, i.e., choosing \({\mathscr {O}}=\mathbb {R}\) is possible, since in Eq. 4, negative values pose no problem. For x negative, the dynamics of the model have no diffusive part and only move through the drift (which of course can still be stochastic). This could happen with a positive (upper) mean-reversion level and by starting from a negative value.

We begin with a general result for affine processes which classifies when the classical representation via Riccati equations still holds.

Recall from (5), that b(x) and a(x) are actually intervals. Define the upper bound of a(x) by

$$ \begin{aligned} \bar a(x) & := \bar a^{0} + \bar a^{1} x^{+}. \end{aligned} $$

Note that the upper bound of b(x) is given by

$$ \begin{aligned} \bar b(x) & := \bar b^{0} + \underline b^{1} x {\mathbbm{1}}_{\{{x <0}\}} + \bar b^{1} x {\mathbbm{1}}_{\{{x \ge 0}\}}. \end{aligned} $$

Letting \( \bar B^{1,x} = \underline b^{1} {\mathbbm{1}}_{\{{x <0}\}} + \bar b^{1} {\mathbbm{1}}_{\{{x \ge 0}\}}, \) we obtain that \(\bar b(x) = \bar b^{0} + \bar B^{1,x} x.\) Note that for x≠y we may no longer have that \(\bar b(x) = \bar b^{0} + \bar B^{1,y} x\).

Proposition 5

Consider a nonlinear affine process \({\mathcal {A}}(x,\Theta)\) and assume that for all \(P \in {\mathcal {A}}(x,\Theta)\)

$$ \begin{aligned} \beta^{P}_{t} \le \bar b^{0} + \bar B^{1,x} X_{t}, \end{aligned} $$

\(dP\otimes dt\)-almost everywhere for 0≤t≤T. Moreover, assume either that \(\underline a^{1} = \bar a^{1} = 0\) or that for all \(P\in {\mathcal {A}}(x,\Theta)\), \(X_{t}\ge 0\) \(dP\otimes dt\)-a.e. If there exists a \(\bar P\in {\mathcal {A}}(x,\Theta)\) and a \(\bar P\)-\(\mathbb {F}\)-Brownian motion W such that the canonical process under \(\bar P\) is the unique strong solution of

$$ \begin{aligned} {dX}_{t} = (\bar b^{0} + \bar B^{1,x} X_{t})dt + \sqrt{ \bar a(X_{t})} {dW}_{t}, \qquad X_{0}=x, \end{aligned} $$

then, for all u≥0 and 0≤t≤T,

$$\sup_{P \in {\mathcal{A}}(x,\Theta)}E^{P}\left[e^{{uX}_{t}}\right] = \exp\left(\phi(t,u) + \psi(t,u) x \right), $$

where ϕ, ψ solve the Riccati equations

$$ \begin{aligned} &\partial_{t} \phi(t,u) = \frac{1}{2} \bar a^{0} \psi(t,u)^{2} + \bar b^{0} \psi(t,u), \qquad \phi(0,u)= 0\\ \end{aligned} $$
$$ \begin{aligned} &\partial_{t} \psi(t,u) = \frac{1}{2} \bar a^{1} \psi(t,u)^{2} + \bar B^{1,x} \psi(t,u), \qquad \psi(0,u)= u. \end{aligned} $$

This result is obtained by showing that the supremum in the nonlinear expectation is attained by the maximal semimartingale law \(\bar P\) which corresponds to an affine process with parameters \(\bar a^{0}, \bar a^{1}, \bar b^{0}, \bar B^{1,x}\). Inspection of the proof shows that this property also holds when \(e^{ux}\) is replaced by any other increasing and convex function (a Call payoff, for example). An analogous formulation for u<0 of course also holds. Furthermore, it is interesting to see that the Riccati equations in Eqs. (37) and (38) can be replaced by respective versions thereof.

Note that the assumption \(X_{t}\ge 0\) \(dP\otimes dt\)-a.e. is implied by assuming x>0 together with \(\underline a^{0}=\bar a^{0}=0\) and \(\underline b^{0} \ge \nicefrac {\bar a^{1}} 2>0\) due to Proposition 1.


The claim follows by an application of Theorem 2.2 in Bergenthum and Rüschendorf (2007). This needs validity of the so-called propagation of order (PO) property and we give a detailed account of this in Appendix 1. Proposition 11 in particular yields that the PO property is satisfied for increasing and convex (decreasing and concave) functions when compared to an affine process.

Let \(P\in {\mathcal {A}}(x,\Theta)\). By assumption, there exists a \(\bar P \in {\mathcal {A}}(x,\Theta)\) and a \(\bar P\)-\(\mathbb {F}\)-Brownian motion W, such that X is the unique strong solution of the stochastic differential equation (36) under \(\bar P\). Since \(e^{ux}\) for u≥0 is increasing and convex and (35) holds, Theorem 2.2 in Bergenthum and Rüschendorf (2007) yields that

$$E^{P}\left[e^{{uX}_{t}}\right] \le E^{\bar P}\left[ e^{{uX}_{t}}\right]. $$

Since \(P\in {\mathcal {A}}(x,\Theta)\) was arbitrary and, under \(\bar P\), X is an affine process with parameters \( \bar a^{0}, \bar a^{1}, \bar b^{0}, \bar B^{1,x}\), the affine representation follows and the Riccati equations in (37) can be obtained directly from Theorem 10.1 in Filipović (2009). □
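The representation in Proposition 5 can be evaluated numerically by integrating the Riccati equations (37) and (38). The Python sketch below uses a hand-written classical Runge–Kutta scheme; the function names `solve_riccati` and `upper_exp_moment`, the dictionary encoding of the parameter bounds, the step count and the numerical values are assumptions made for illustration. The sketch does not check the hypotheses of Proposition 5 and does not guard against a possible explosion of ψ for large u.

```python
import math


def solve_riccati(t, u, a0_bar, a1_bar, b0_bar, B1_bar, n_steps=2000):
    """Integrate (37)-(38) from 0 to t with phi(0,u) = 0 and psi(0,u) = u."""

    def rhs(psi):
        # Right-hand sides of the two Riccati equations; both depend on psi only.
        d_phi = 0.5 * a0_bar * psi**2 + b0_bar * psi
        d_psi = 0.5 * a1_bar * psi**2 + B1_bar * psi
        return d_phi, d_psi

    h = t / n_steps
    phi, psi = 0.0, float(u)
    for _ in range(n_steps):
        k1 = rhs(psi)
        k2 = rhs(psi + 0.5 * h * k1[1])
        k3 = rhs(psi + 0.5 * h * k2[1])
        k4 = rhs(psi + h * k3[1])
        phi += h * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        psi += h * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
    return phi, psi


def upper_exp_moment(x, t, u, theta):
    """exp(phi(t,u) + psi(t,u) x) with the upper parameters of Proposition 5."""
    b1_lo, b1_hi = theta["b1"]
    B1_bar = b1_lo if x < 0 else b1_hi  # the coefficient \bar B^{1,x}
    phi, psi = solve_riccati(t, u, theta["a0"][1], theta["a1"][1],
                             theta["b0"][1], B1_bar)
    return math.exp(phi + psi * x)


# Hypothetical nonlinear CIR bounds with lower b0 >= upper a1 / 2 > 0.
theta_cir = {"b0": (0.03, 0.05), "b1": (-0.8, -0.5),
             "a0": (0.0, 0.0), "a1": (0.02, 0.04)}
value = upper_exp_moment(0.5, 1.0, 1.0, theta_cir)
```

For vanishing \(\bar a^{1}\) the second equation is linear, so that \(\psi(t,u)=u e^{\bar B^{1,x} t}\), which gives a simple way to check the integrator.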

Remark 4

It can be easily checked that the conditions for Proposition 5 hold in the following two cases:

  1. (1)

    The nonlinear Vasiček model with state space \({\mathscr {O}}=\mathbb {R}\) and \(\bar b^{1} = \underline b^{1}\),

  2. (2)

    the nonlinear CIR model on the state space \({\mathscr {O}}=\mathbb {R}_{>0}\).

By the above result, we can use the classical Fourier-inversion technique for these affine processes for the pricing of increasing and convex payoffs (like Call options) or decreasing and concave ones, see Section 10.3 in Filipović (2009) for examples and details in this direction. In Example 4, we will sketch an application in a Heston model with parameter uncertainty in the stochastic volatility.

5 An Itô formula for nonlinear affine processes

In this section, we will construct new processes from nonlinear affine processes by simple transformations. The main tool for this will be a suitable formulation of the Itô-formula in our setting.

Consider a twice continuously differentiable function \(F \in C^{2}(\mathbb {R})\). If we start from a nonlinear affine process \({\mathcal {A}}(x,\Theta)\) and consider \(\tilde X:=F(X)\), then for any \(P\in {\mathcal {A}}(x,\Theta)\) the process \(\tilde X\) is a P-semimartingale and we denote its (differential) semimartingale characteristics by \(\tilde \alpha \) and \(\tilde \beta ^{P}\) (starting from α and \(\beta^{P}\) from Equality (2)). In this section, we answer the question whether the nonlinear process \(\tilde X\) itself, i.e., the associated semimartingale laws, can be studied independently of X. This corresponds one-to-one to the question whether there exists an independent formulation of the nonlinear process \(\tilde X\). The following proposition gives a positive answer to this question.

We define the interval-valued functions \(a^{F}\) and \(b^{F}\) by

$$ \begin{aligned} a^{F}(x)& := \left[\left(F^{\prime}(x)\right)^{2} \left(\underline a^{0} + \underline a^{1} x^{+}\right), (F^{\prime}(x))^{2} \left(\bar a^{0}+\bar a^{1} x^{+}\right)\right] \end{aligned} $$


$$ {\begin{aligned} b^{F}(x) &:=\left[ \inf_{(\beta,\alpha) \in b^{*}(x)\times a^{*}(x)} \left(F^{\prime}(x) \beta + {\frac{1}{2}} F^{\prime\prime}(x) \alpha\right), \sup_{(\beta,\alpha) \in b^{*}(x)\times a^{*}(x)} \left(F^{\prime}(x) \beta + {\frac{1}{2}} F^{\prime\prime}(x) \alpha\right) \right]. \end{aligned}} $$

The nonlinear process \(\tilde X\) inherits certain bounds from X which are characterized in the following proposition.

Proposition 6

Let \(\mathcal {A}(x,\Theta)\) be a nonlinear affine process and \(F\in C^{2}\). Then, for every \(P\in {\mathcal {A}}(x,\Theta)\), \(\tilde X=F(X)\) is a P-semimartingale with differential characteristics \(\tilde \alpha \) and \(\tilde \beta ^{P}\) satisfying

$$\begin{array}{*{20}l} \tilde \alpha_{s} & \in & a^{F}(X_{s}), \end{array} $$
$$\begin{array}{*{20}l} \tilde \beta^{P}_{s} & \in & b^{F}(X_{s}). \end{array} $$


Let t∈[0,T]. By definition, \(P \in \mathcal {A}(x,\Theta)\) implies that

$$X_{t}= X_{0} + \int_{0}^{t}\beta_{s}^{P} ds + M_{t}^{P}, $$

where \(\beta _{s}^{P} \in b^{*}(X_{s})\), \(\alpha_{s}\in a^{*}(X_{s})\), and \(M^{P}\) is the continuous local martingale part in the P-semimartingale decomposition of X. As previously, we denote \(\left \langle M^{P} \right \rangle =\int _{0}^{\cdot } \alpha _{s} ds\). Since \(F\in C^{2}(\mathbb {R})\), the Itô formula yields that

$$F(X_{t})= F(X_{0}) + \int_{0}^{t} \left(F^{\prime}(X_{s}) \beta_{s}^{P} + \frac{1}{2} F^{\prime \prime}(X_{s})\alpha_{s}\right) ds + \int_{0}^{t} F^{\prime}(X_{s}) dM^{P}_{s} $$

and, hence,

$$ \begin{aligned} \tilde{\beta}^{{P}}_{s} &= F^{\prime}(X_{s}) \beta^{P}_{s} + \frac{1}{2}F^{\prime \prime}(X_{s})\alpha_{s} \\ \tilde{\alpha}_{s} &= \left(F^{\prime}(X_{s})\right)^{2} \alpha_{s}. \end{aligned} $$

Now, the properties \(\tilde \alpha _{s} \in a^{F}(X_{s})\) and \(\tilde \beta ^{P}_{s} \in b^{F}(X_{s})\) can be checked directly. □

Remark 5

Intuitively, the above result allows us to construct the nonlinear process \(\tilde X=F(X)\) when X is nonlinear affine. The new bounds for the (differential) semimartingale characteristics are given by \(b^{F}(X)\) and \(a^{F}(X)\), respectively. However, the drift and volatility of \(\tilde X\) are now related to each other, which often gives a substantially smaller class in comparison to all semimartingale laws whose drift and volatility stay in \(b^{F}(X_{t})\) and \(a^{F}(X_{t})\).

In general, (nonlinear) affine processes are stable under affine transformations. The following example shows that we may even consider the nonlinear transformation \(F(x)=x^{2}\), at least in some special cases.

Example 1

Let \({\mathcal {A}}(x,\Theta)\) be a nonlinear Vasiček model satisfying \(\bar b^{0}=\underline b^{0} =0\), and \(\tilde X = F(X)=X^{2}\). We apply Proposition 6: first, note that since \(F^{\prime\prime}=2>0\),

$${b}^{F}(x) = \left[2x^{2}\underline{b}^{1}+ \underline{a}^{0}, 2x^{2}\bar{b}^{1}+ \bar{a}^{0}\right], $$

and \({a}^{F}(x)= \left [4 x^{2} \underline {a}^{0}, 4x^{2} \bar {a}^{0} \right ]\). Hence, \(b^{F}\) and \(a^{F}\) can even be written as functions of \(\tilde X=X^{2}\). This would not be the case if \(\bar b^{0}=\underline b^{0} =0\) did not hold, since \(b^{F}\) would then depend on x itself (which is not a function of \(x^{2}\)). With this observation, we may directly study the semimartingale characteristics of \(\tilde X\). Replacing \(x^{2}\) by \(\tilde X\) in \(b^{F}\) and \(a^{F}\), we indeed observe an affine structure and it is tempting to conjecture that we have obtained a nonlinear CIR model. In general, this is not the case: for simplicity, choose \(\underline a^{0} = \underline b^{1} = 0\) and \(\bar a^{0}=\bar b^{1} = 1\) and x=1. Then \(b^{F}(1)=[0,3]\) and \(a^{F}(1)=[0,4]\). For a nonlinear CIR model, any choice of \((\tilde \beta, \tilde \alpha)\) in \(b^{F}\times a^{F}\) should be possible. Now choose, say, \(\tilde \alpha =4\) (corresponding to a maximal volatility of α=1 in the original model). Then not all choices of \(\tilde \beta \in [0,3]\) are reached by the original model: indeed, one immediately obtains from (43) that \(\tilde \beta \) needs to lie in [1,3].
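The numbers in Example 1 can be reproduced with a few lines of Python. The sketch below assumes that \(b^{*}(x)\) and \(a^{*}(x)\) are the intervals generated by a box-shaped Θ; the dictionary encoding and the function names `drift_interval`, `diffusion_interval`, `transformed_bounds` and `reachable_drift` are hypothetical and serve only as an illustration.

```python
def drift_interval(x, theta):
    """b*(x): range of b0 + b1 * x over the parameter box."""
    lo = theta["b0"][0] + min(theta["b1"][0] * x, theta["b1"][1] * x)
    hi = theta["b0"][1] + max(theta["b1"][0] * x, theta["b1"][1] * x)
    return lo, hi


def diffusion_interval(x, theta):
    """a*(x): range of a0 + a1 * x^+ over the parameter box."""
    x_plus = max(x, 0.0)
    return (theta["a0"][0] + theta["a1"][0] * x_plus,
            theta["a0"][1] + theta["a1"][1] * x_plus)


def transformed_bounds(x, dF, d2F, theta):
    """Intervals a^F(x) and b^F(x) as in the definitions before Proposition 6."""
    b_lo, b_hi = drift_interval(x, theta)
    a_lo, a_hi = diffusion_interval(x, theta)
    f1, f2 = dF(x), d2F(x)
    a_F = (f1**2 * a_lo, f1**2 * a_hi)
    # F'(x) beta + F''(x) alpha / 2 is linear in (beta, alpha), so the
    # extrema over the rectangle b*(x) x a*(x) sit at its corners.
    b_F = (min(f1 * b_lo, f1 * b_hi) + 0.5 * min(f2 * a_lo, f2 * a_hi),
           max(f1 * b_lo, f1 * b_hi) + 0.5 * max(f2 * a_lo, f2 * a_hi))
    return a_F, b_F


def reachable_drift(x, dF, d2F, theta, alpha):
    """Drifts of F(X) that remain possible once alpha is fixed."""
    b_lo, b_hi = drift_interval(x, theta)
    f1, f2 = dF(x), d2F(x)
    ends = (f1 * b_lo + 0.5 * f2 * alpha, f1 * b_hi + 0.5 * f2 * alpha)
    return min(ends), max(ends)


# Parameters of Example 1 with F(x) = x^2 and x = 1.
theta = {"b0": (0.0, 0.0), "b1": (0.0, 1.0), "a0": (0.0, 1.0), "a1": (0.0, 0.0)}
a_F, b_F = transformed_bounds(1.0, lambda x: 2.0 * x, lambda x: 2.0, theta)
coupled = reachable_drift(1.0, lambda x: 2.0 * x, lambda x: 2.0, theta, alpha=1.0)
```

Here `a_F` and `b_F` correspond to \(a^{F}(1)\) and \(b^{F}(1)\), while `coupled` is the smaller drift interval that remains once the volatility is fixed at its maximum.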

In the choice where only one parameter (either α or β) carries uncertainty, this problem of course vanishes. This is the case for the existing transformations of g– and G–Brownian motion in the literature and we provide further examples in this direction below.

The above example also illustrates that nonlinear transformations of processes under ambiguity should be handled with care. The following example shows how to obtain a geometric kind of dynamics, which allows us to obtain the nonlinear Black–Scholes model as considered in Epstein and Ji (2013, Example 3) and Vorbrink (2014). Both works consider the case where there is only volatility uncertainty.

Example 2

Let \({\mathcal {A}}(x,\Theta)\) be a nonlinear affine process and consider \(F(x)=e^{x}\). Again, we apply Proposition 6. First, note that with \(\tilde x = e^{x}\),

$$a^{F}(x) = (e^{x})^{2} a^{*}(x). $$

Moreover, since \(a^{*}(x) = \left [\underline a^{0}+\underline a^{1} x^{+}, \bar a^{0}+ \bar a^{1} x^{+}\right ]\), we obtain

$$ \begin{aligned} a^{F}(x) = (\tilde x)^{2}\left[\underline a^{0}+\underline a^{1} (\log \tilde x)^{+}, \bar a^{0}+ \bar a^{1} (\log \tilde x)^{+}\right] = \tilde a(\tilde x), \end{aligned} $$

which determines \(\tilde a\). Similarly, one obtains \(\tilde b\) from (40) noting that

$${\begin{aligned} {b}^{F}(x) = \left\{\!\!\!\begin{array}{ll} &\left[e^{x}\left(\underline b^{0} + \underline b^{1} x\right)+ \frac{1}{2}e^{x}\left(\underline a^{0} + \underline a^{1} x^{+}\right), e^{x} \left(\bar b^{0} + \bar b^{1} x\right)+ \frac{1}{2}e^{x} \left(\bar a^{0} + \bar a^{1} x^{+}\right)\right], x \ge 0 \\ &\left[e^{x}\left(\underline b^{0} + {\bar b^{1}} x\right)+ \frac{1}{2}e^{x} \underline a^{0}, e^{x} \left(\bar b^{0} + {\underline b^{1}} x\right)+ \frac{1}{2}e^{x} \bar a^{0} \right], x < 0. \end{array}\right. \end{aligned}} $$

The state space of \(e^{X}\) is of course \(\mathbb {R}_{>0}\).

Example 3

(The nonlinear Black–Scholes model) Allowing for drift and volatility uncertainty in the log-price of a stock, one arrives at a nonlinear Black–Scholes model. We consider a Brownian motion with drift and volatility uncertainty, which is in our language a nonlinear Vasiček model with \(\underline b^{1}=\bar b^{1} = 0\). Furthermore, we assume that the stock price is given by \(S=\exp(X)\), i.e., \(F(x)=e^{x}\). Then, the calculations from the previous example immediately yield that the stock price is given by the nonlinear process \(\tilde X\), where \(a^{F}(x)=\tilde a(F(x))\) and \(b^{F}(x)=\tilde b(F(x))\) with

$$\tilde a(x) = x^{2} \left[ \underline a^{0}, \bar a^{0}\right] $$


$$\tilde b(x) = \left[ x \underline b^{0} + \tfrac 1 2 x \underline a^{0}, x \bar b^{0} + \tfrac 1 2 x \bar a^{0}\right]. $$

Option pricing for monotone convex (concave) payoffs can immediately be done by Proposition 5, see Example 3 in Epstein and Ji (2013) for explicit formulae for Call options (with no uncertainty of the drift). The article (Vorbrink 2014) excludes drift uncertainty by arguing that under risk-neutral pricing the drift is known.
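The interval-valued coefficients of Examples 2 and 3 can be evaluated numerically. The following is a minimal sketch, not part of the original article: the function name `transformed_coefficients` and the convention of passing each parameter as a `(lower, upper)` pair are assumptions made for illustration. It implements the displayed formulas for \(a^{F}(x)\) and \(b^{F}(x)\) with \(F(x)=e^{x}\).

```python
import math

def transformed_coefficients(x, a0, a1, b0, b1):
    """Return the intervals a^F(x) and b^F(x) for F(x) = exp(x).

    Each of a0, a1, b0, b1 is a hypothetical (lower, upper) pair describing
    the parameter interval; x is the state of the log-process X.
    """
    ex = math.exp(x)
    x_plus = max(x, 0.0)
    # Bounds of a^*(x) = [a0_lo + a1_lo x^+, a0_hi + a1_hi x^+].
    a_lo = a0[0] + a1[0] * x_plus
    a_hi = a0[1] + a1[1] * x_plus
    # Bounds of b^0 + b^1 x: for x < 0 the roles of the b^1 bounds swap.
    if x >= 0:
        b_lo, b_hi = b0[0] + b1[0] * x, b0[1] + b1[1] * x
    else:
        b_lo, b_hi = b0[0] + b1[1] * x, b0[1] + b1[0] * x
    a_F = (ex ** 2 * a_lo, ex ** 2 * a_hi)
    b_F = (ex * b_lo + 0.5 * ex * a_lo, ex * b_hi + 0.5 * ex * a_hi)
    return a_F, b_F

# Nonlinear Black-Scholes case of Example 3: b^1 = 0 and a^1 = 0.
a_F, b_F = transformed_coefficients(math.log(2.0), (0.01, 0.04), (0.0, 0.0),
                                    (-0.1, 0.2), (0.0, 0.0))
```

With \(\underline b^{1}=\bar b^{1}=0\) and \(\underline a^{1}=\bar a^{1}=0\), the output agrees with \(\tilde a(\tilde x)\) and \(\tilde b(\tilde x)\) from Example 3 evaluated at \(\tilde x = e^{x}\).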

Example 4

(The Heston model with uncertainty in the volatility parameters)

The model put forward in Heston (1993) is one of the most popular models for stochastic volatility, which also is heavily used in foreign exchange markets. Model and calibration risk is an important issue, see, for example, Guillaume and Schoutens (2012). Here, we give a short outline of how a nonlinear version could be constructed, allowing for parameter uncertainty in volatility only (and not in the drift of the stock price or in the correlation of volatility and stock price). In this regard, we extend Ω in the classical way to construct an additional (independent) Brownian motion \(\tilde V\) which allows us to construct two correlated Brownian motions V and W. The correlation is fixed and denoted by ρ. Each \(P\in \mathfrak {P}(\Omega)\) is extended by leaving \(\tilde V\) untouched, such that (V,W) will be a two-dimensional Brownian motion where V and W have correlation ρ and we denote this new semimartingale law again by P.

Consider a nonlinear CIR process \({\mathcal {A}}(x,\Theta)\) with state space \({\mathscr {O}}=\mathbb {R}_{>0}\) as introduced in Section 4.3. The stock price S is given by the strong solution of the SDE

$${dS}_{t} = S_{t} X_{t} {dV}_{t}, \qquad 0 \le t \le T, $$

where X is the canonical process on Ω (and hence a nonlinear process). The volatility X thus stems from a nonlinear CIR model, which means, intuitively, that we have a CIR model with parameter uncertainty with upper and lower bounds \(\bar b^{0}, \bar b^{1}, \bar a^{1}\) and \(\underline b^{0}, \underline b^{1}, \underline a^{1}\), respectively. For simplicity, we choose a vanishing risk-free rate of interest.

We now show how to compute a Call price in this nonlinear Heston model. The Call price \(C(T,K)\) for maturity \(T\) and strike \(K>0\) is given by the supremum of the expectations \(E^{P}[(S_{T}-K)^{+}]\) over all (extended) semimartingale laws P from \({\mathcal {A}}(x,\Theta)\). Since the payoff function \((s-K)^{+}\) is increasing and convex, the arguments of Proposition 5 apply and

$$C(T,K) = E^{\bar P}\left[\left(S_{T}-K\right)^{+}\right], $$

where \(\bar P\) is the worst-case semimartingale law which achieves the supremum. Again from the proof of Proposition 5, we find that under \(\bar P\), X is a (classical) CIR-process with parameters \(\bar b^{0}, \bar b^{1}, \bar a^{1}\). The Call price formula can be found in Heston (1993), see also Section 10.3.3. in Filipović (2009) for a derivation using Fourier inversion techniques.

6 Affine term structure models

One of the most important applications of affine models is in term structure models. In this regard, we provide in the following a term-structure equation for nonlinear affine models implying prices for derivatives or bond prices.

Consider a payoff \(f(X_{T})\) taking place at time T>0. In the classical setting, arbitrage-free prices are given by expectations of the discounted payoff under a risk-neutral measure. According to the superhedging duality in Biagini et al. (2017) (Theorem 5.1), in the case we consider here, where there is a family of such measures, upper bounds of these price processes (and hence the smallest superhedging price) given \(X_{t}=x\) are given by

$$ \begin{aligned} F(T-t,x) = \sup_{P \in {\mathcal{A}}(t,x,\Theta)}E^{P}\left[ e^{-\int_{t}^{T} X_{s} ds} f(X_{T})| X_{t}=x\right], \quad 0 \le t \le T. \end{aligned} $$

The following result states the nonlinear term-structure equation for the payoff \(f(X_{T})\).

Proposition 7

Assume that f is Lipschitz-continuous. Then, F(t,x) is a viscosity solution of

$$ \begin{aligned} \partial_{t} F(t,x) - \sup_{\theta \in \Theta} {\mathscr{L}}^{\theta} F(t,x) + x F(t,x) &=0, \end{aligned} $$

with boundary condition F(0,x)=f(x). If, in addition,

  1. (i)

    \(\underline a^{0} >0, {\mathscr {O}}=\mathbb {R}\), and f is bounded, then F(t,x) is the unique solution of (46), or

  2. (ii)

    if \(\underline a^{0}=\bar a^{0}=0, \underline b^{0} \ge \nicefrac {\bar a^{1}} 2>0\) and \({\mathscr {O}}=\mathbb {R}_{>0}\), then F(t,x) is the only viscosity solution, such that

    $$ \begin{aligned} \sup_{(t,x) \in [0,T]\times \mathbb{R}_{>0}} \frac{|F(t,x)|}{1+x} < \infty. \end{aligned} $$

For a proof of this result, one can argue the same way as in the proof of Theorem 1. More precisely, dynamic programming yields for any stopping time τ taking values in [t,T] that

$$F(T-t,x) = \sup_{P \in {\mathcal{A}}(t,x,\Theta)}E^{P}\left[ e^{-\int_{t}^{\tau} X_{s} ds} F(T-\tau,X_{\tau})\right]. $$

Then, following the arguments in Theorem 1 leads to the desired result. Alternatively, one could also enlarge the state space to transform this control problem which is in Lagrange form to one in the Mayer form like in Proposition 2 and Theorem 1 (see, for example, Remark 3.10 in Bouchard and Touzi (2011)).

The term-structure equation now allows one to obtain the bond prices by considering the payoff \(f(X_{T})=1\). We illustrate how an extension of the state space can be used to achieve a result similar to Proposition 5 leading to closed-form bond prices in special cases.

In our approach, upper bond prices under the nonlinear affine term structure model \({\mathcal {A}}(t,x,\Theta)\), \(x \in {\mathscr {O}}\), are given by

$$ \begin{aligned} \bar p(t,T,x) = \sup_{P \in {\mathcal{A}}(t,x,\Theta)}E^{P}\left[ e^{-\int_{t}^{T} X_{s} ds} | X_{t}=x\right], \quad 0 \le t \le T, \end{aligned} $$

conditional on \(X_{t}=x\). For arbitrary \(\omega \in \Omega\), one obtains the bond price as \(\bar p(t,T)(\omega) = \bar p(t,T,\omega _{t})\). The respective lower bond price \(\underline p(t,T,x)\) is obtained by replacing the supremum with an infimum. The following proposition shows that in important special cases these prices can be obtained in closed form. Again, for \(x \in {\mathscr {O}}\) recall from (33) and (34) the upper bound of a(x) and b(x) defined by

$$\begin{aligned} \bar a(x) := \bar a^{0} + \bar a^{1} x^{+}, \quad \quad \bar b(x) := \bar b^{0} + \underline b^{1} x {\mathbbm{1}}_{\{{x <0}\}} + \bar b^{1} x {\mathbbm{1}}_{\{{x \ge 0}\}}, \end{aligned} $$

and recall \( \bar B^{1,x} = \underline b^{1} {\mathbbm{1}}_{\{{x <0}\}} + \bar b^{1} {\mathbbm{1}}_{\{{x \ge 0}\}}\), so we obtain that \(\bar b(x) = \bar b^{0} + \bar B^{1,x} x\). Moreover, recall from (36) the affine process with coefficients \(\bar a^{0}, \bar a^{1}, \bar b^{0}, \bar B^{1,x}\).

Proposition 8

Consider a nonlinear affine process \({\mathcal {A}}(x,\Theta)\), assume for all \(P \in {\mathcal {A}}(x,\Theta)\) that

$$ \begin{aligned} \beta^{P}_{t} \le \bar{b}^{0} + \bar B^{1,x} X_{t} \end{aligned} $$

\(dP\otimes dt\)-almost everywhere for \(0\le t\le T\) and assume that either \(\underline a^{1} = \bar a^{1} = 0\) or that for all \(P\in {\mathcal {A}}(x,\Theta)\), \(X_{t}\ge 0\) \(dP\otimes dt\)-a.s. If there exists \(\bar P\in {\mathcal {A}}(x,\Theta)\) and a \(\bar P\)-\(\mathbb {F}\)-Brownian motion W such that the canonical process under \(\bar P\) is the unique strong solution of (36), then, for all \(u\ge 0\) and \(0\le t\le T\),

$$\bar p(t,T,x) = \exp\left(\phi(t,u) + \psi(t,u) x \right), $$

where ϕ, ψ solve the Riccati equations

$$\begin{array}{*{20}l} \partial_{t} \phi(t,u) &= \frac{1}{2} \bar a^{0} \psi(t,u)^{2} + \bar b^{0} \psi(t,u), \ \,\qquad \phi(0,u)= 0, \end{array} $$
$$\begin{array}{*{20}l} \partial_{t} \psi(t,u) &= \frac{1}{2} \bar a^{1} \psi(t,u)^{2} + \bar B^{1,x} \psi(t,u)-1, \qquad \psi(0,u)= u. \end{array} $$


This result also follows by using semimartingale comparison. In this regard, let \(P \in {\mathcal {A}}(x,\Theta)\) and consider the two-dimensional process \(Y=(Y^{1},Y^{2})\), where \(Y^{1} = -\int _{0}^{\cdot } X_{s} ds\) and \(Y^{2}=X\). Then, there is no parameter uncertainty with respect to the dynamics of \(Y^{1}\) since its differential semimartingale characteristics are obtained from \(dY^{1}_{t} =- X_{t} dt\).

By assumption, there exists a \(\bar P\in {\mathcal {A}}(x,\Theta)\) and a \(\bar P\)-\(\mathbb {F}\)-Brownian motion W, such that X is the unique strong solution of the stochastic differential equation (36). Denote by \(\beta, \alpha\) the differential semimartingale characteristics under P of the two-dimensional semimartingale Y and by \(\bar \beta, \bar \alpha \) those of Y under \(\bar P\). It is easily verified that \(\beta _{t} \le \bar \beta _{t}\) and also that \(\alpha _{t} \le _{\text {psd}} \bar \alpha _{t}\) in the positive semidefinite order (see footnote 3).

Since \((y^{1},y^{2}) \mapsto e^{u^{1} y^{1}}\) for \(u\ge 0\) is increasing and convex, Theorem 2.2 in Bergenthum and Rüschendorf (2007) (the propagation of order (PO) property is shown in Appendix 1) yields that

$$E^{P}\left[e^{-\int_{0}^{t} X_{s} ds}\right] \le E^{\bar P}\left[e^{-\int_{0}^{t} X_{s} ds}\right]. $$

Since \(P\in {\mathcal {A}}(x,\Theta)\) was chosen arbitrarily and under \(\bar P\), X is an affine process with parameters \(\bar a^{0}, \bar a^{1}, \bar b^{0}, \bar B^{1,x}\), the affine representation follows, and the Riccati equations in (50)–(51) can be obtained directly from Theorem 10.4 in Filipović (2009). □
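The Riccati equations (50)–(51) are straightforward to integrate numerically. The sketch below is an illustration under stated assumptions rather than the authors' implementation: it uses a classical fourth-order Runge–Kutta step, takes \(u=0\) for the bond price, and treats the four coefficients as plain numbers standing in for \(\bar a^{0}, \bar a^{1}, \bar b^{0}, \bar B^{1,x}\). The function names are hypothetical. As a sanity check, the case \(a^{1}=0\) is compared with the well-known closed-form Vasiček bond price.

```python
import math

def riccati_rk4(a0, a1, b0, B1, u, t_end, steps=2000):
    """Integrate d(phi)/dt = 0.5*a0*psi^2 + b0*psi and
    d(psi)/dt = 0.5*a1*psi^2 + B1*psi - 1 with phi(0)=0, psi(0)=u."""
    def rhs(psi):
        # Neither right-hand side depends on phi, only on psi.
        return (0.5 * a0 * psi ** 2 + b0 * psi,
                0.5 * a1 * psi ** 2 + B1 * psi - 1.0)
    h = t_end / steps
    phi, psi = 0.0, u
    for _ in range(steps):
        k1 = rhs(psi)
        k2 = rhs(psi + 0.5 * h * k1[1])
        k3 = rhs(psi + 0.5 * h * k2[1])
        k4 = rhs(psi + h * k3[1])
        phi += h / 6.0 * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0])
        psi += h / 6.0 * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1])
    return phi, psi

def affine_bond_price(x, tau, a0, a1, b0, B1):
    """exp(phi + psi*x) with u = 0 (assumption of this sketch)."""
    phi, psi = riccati_rk4(a0, a1, b0, B1, 0.0, tau)
    return math.exp(phi + psi * x)

def vasicek_bond_price(x, tau, a0, b0, b1):
    """Closed-form Vasicek price for dX = (b0 + b1 X) dt + sqrt(a0) dW, b1 < 0."""
    kappa, theta = -b1, b0 / (-b1)
    B = (1.0 - math.exp(-kappa * tau)) / kappa
    A = (theta - a0 / (2.0 * kappa ** 2)) * (B - tau) - a0 * B ** 2 / (4.0 * kappa)
    return math.exp(A - B * x)

numeric = affine_bond_price(0.02, 5.0, 0.0004, 0.0, 0.01, -0.5)
closed = vasicek_bond_price(0.02, 5.0, 0.0004, 0.01, -0.5)
```

In the setting of Proposition 8, one would call `affine_bond_price` with the upper coefficients and with `B1` chosen according to the sign of x.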

Remark 6

Again, as in Remark 4, it can be easily checked that the conditions for Proposition 8 hold in the following two cases:

  1. (1)

    The nonlinear Vasiček model with state space \({\mathscr {O}}=\mathbb {R}\) and \(\bar b^{1} = \underline b^{1}\),

  2. (2)

    the nonlinear CIR model on the state space \({\mathscr {O}}=\mathbb {R}_{>0}\).

It is remarkable that in the general nonlinear Vasiček model with parameter uncertainty on the speed of mean reversion, the classical exponential affine bond pricing formula ceases to hold. Thus, in this model, the interval of parameters cannot directly be backed out from an interpolation of bid and ask prices with a standard Vasiček model. This, however, is the case in the CIR model and in the Vasiček model where \(b^{1}\) is known.

The previous results and examples directly allow the treatment of term-structure models based on the nonlinear Vasiček model and on the nonlinear CIR model (see Sections 4.2 and 4.3). An important difficulty for the modeller in practical situations is that the choice of a model has strong implications: for example, the state space can allow for negative values or can strictly exclude them, a well-documented difficulty of the affine models after the European crisis, see Carver (2012). The following example shows that in a nonlinear setting one is able to mix these two types of models and the modeller no longer has to decide a priori if she allows or excludes negative values.

6.1 The nonlinear Vasiček–CIR model

If one is not able to restrict the state space a priori, one can consider the following nonlinear affine model: assume that both parameters \(a^{0}\) and \(a^{1}\) are subject to parameter uncertainty (or at least one of them with the other parameter not vanishing). Intuitively, this means that the model may switch between a Vasiček-like or CIR-like behaviour. In particular, when one is not able to restrict the state space a priori to \(\mathbb {R}_{\ge 0}\), this nonlinear model allows one to incorporate both model approaches in a robust (i.e., nonlinear) sense.

In the interest rate markets in the early 2000s, market participants believed in positive interest rates and thus favoured the CIR-model. The credit crises led to decreasing interest rates and the Vasiček model came back, as it allows for negative interest rates. This effect is well known and its implications for banking are quite important, see, for example, Carver (2012); Patel et al. (2018); Orlando et al. (2016), and Russo and Fabozzi (2017). In the near future, however, when interest rates rise, one could be interested in deviating from the Vasiček model again. With the nonlinear Vasiček–CIR model, such a switch is no longer necessary and one is able to behave consistently through such seemingly different time periods.

More precisely, assume that \(\underline a^{0} >0\) and consider the state space \({\mathscr {O}}=\mathbb {R}\). Uniqueness for the nonlinear PDE (16) follows readily from Proposition 3. In this general case, there are no explicit solutions, as, e.g., in Proposition 8 above, and we must rely on numerical techniques.

Figure 2 illustrates the differences in option pricing between the nonlinear Vasiček–CIR model and both the Vasiček and the CIR model. While in the nonlinear CIR model, this price can still be computed explicitly, this no longer holds for the other two models. The parameters of the plot are chosen to best illustrate this. The plot shows the (upper) price of a Call option under a nonlinear affine model with the parameters stated in the plot. While the nonlinear Vasiček model clearly puts weight on \(\mathbb {R}_{<0}\), the CIR model shifts the mass towards the right, leading to higher option values for x>0.3.

Fig. 2. This figure shows the upper Call price for the nonlinear Vasiček, the nonlinear CIR, and the nonlinear Vasiček–CIR model. The first two models are obtained from the latter by simply letting \(\bar a^{1}=0\) (\(\bar a^{0} = 0\), respectively). The Call price has a strike of 0.5 and the parameters are given in the table above

The graph further highlights the modeller's difficulty in committing to a single model: starting from positive values (i.e., here x>0.3), she would possibly prefer the (classical/nonlinear) CIR model, while for falling values prices become prohibitively low, illustrating the need to switch to the Vasiček model. The nonlinear Vasiček–CIR model clearly does not suffer from this problem (it is clear by definition that the parameters of the nonlinear Vasiček–CIR model can be chosen to reproduce the option prices of the two other models).

7 Model risk

In financial applications, model risk is an important factor for risk management. In the remarkable work (Cont 2006), a systematic framework for the management of model risk has been proposed which we recall briefly and thereafter apply to the nonlinear affine models. The importance of this topic is illustrated by the intensive research in this area, see, for example, Bannör and Scherer (2013); Guillaume and Schoutens (2012); Breuer and Csiszár (2016); Barrieu and Scandolo (2015), and da Fonseca and Grasselli (2011), among many others.

In this approach, the market contains a number of benchmark instruments which are liquidly traded instruments and the observation consists of the bid and ask prices thereof. Moreover, there is a set of arbitrage-free pricing models \({\mathcal {Q}}\) which is consistent with the observations of the benchmark instruments.

In our framework, both can be described through a nonlinear affine model: the nonlinear affine model specifies a set \({\mathcal {Q}}\) of pricing measures, as in the nonlinear affine term-structure approach studied in Section 6. Consistent bid and ask prices can be obtained by suprema and infima over these pricing measures, exactly as it was done for \(\underline p(t,T)\) and \(\bar p(t,T)\) in the previous section.

A coherent measure of model uncertainty for a payoff function \(\psi :\mathbb {R} \to \mathbb {R}\) with \(\psi(X_{T})\) denoting the payoff at time T can be computed from the upper and lower price bounds

$$ \begin{aligned} \bar \pi(\psi) = \sup_{Q \in {\mathcal{Q}}}E^{Q}[\psi(X_{T})], \qquad \underline \pi(\psi) = \inf_{Q \in {\mathcal{Q}}}E^{Q}[\psi(X_{T})]. \end{aligned} $$

The measure of model uncertainty on the derivative ψ is given by

$$\mu_{{\mathcal{Q}}}(\psi) = \bar \pi(\psi) - \underline \pi(\psi). $$

Some examples are provided in Cont (2006), including the nonlinear Black–Scholes model. We illustrate the application of nonlinear affine models in this framework with a short study of model risk in the nonlinear Vasiček model.
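To make the measure \(\mu_{{\mathcal{Q}}}(\psi)\) concrete, the following sketch evaluates it for a Call payoff over a deliberately simplified model set. This is an assumption of the example, not the construction used in the article: \({\mathcal{Q}}\) is replaced by the finitely many classical Vasiček models whose constant parameters sit at the corners of the parameter box, so the result is only an inner approximation of the spread under the full nonlinear model. All function names are hypothetical, and \(b^{1}\neq 0\) is assumed.

```python
import math
from itertools import product

def normal_call(m, s, K):
    """E[(X - K)^+] for X ~ N(m, s^2) with s > 0."""
    d = (m - K) / s
    pdf = math.exp(-0.5 * d * d) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(d / math.sqrt(2.0)))
    return (m - K) * cdf + s * pdf

def vasicek_moments(x, T, b0, b1, a0):
    """Mean and standard deviation of X_T for dX = (b0 + b1 X) dt + sqrt(a0) dW."""
    e = math.exp(b1 * T)
    mean = x * e + b0 * (e - 1.0) / b1
    var = a0 * (e * e - 1.0) / (2.0 * b1)
    return mean, math.sqrt(var)

def corner_model_uncertainty(x, T, K, b0, b1, a0):
    """Spread between the largest and smallest undiscounted Call value
    over the corner models; b0, b1, a0 are (lower, upper) pairs."""
    prices = [normal_call(*vasicek_moments(x, T, p0, p1, q0), K)
              for p0, p1, q0 in product(b0, b1, a0)]
    return max(prices) - min(prices), min(prices), max(prices)

mu, lower, upper = corner_model_uncertainty(
    x=0.0, T=1.0, K=0.1, b0=(0.0, 0.1), b1=(-0.6, -0.4), a0=(0.01, 0.04))
```

When every interval collapses to a point, all corner models coincide and the spread is zero, in line with the interpretation of \(\mu_{{\mathcal{Q}}}\) as a measure of model uncertainty.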

7.1 The nonlinear Vasiček model

As an example, we consider the nonlinear Vasiček model introduced in Section 4.2. Recall that this model is characterized by the assumption that \(\left [\underline a^{1},\bar a^{1}\right ]=\{0\}\). In the nonlinear case, nonlinear expectations are given by the solution of the nonlinear Kolmogorov equation (15). Proposition 5 allows us to trace this solution back to existing solutions for affine models if the payoffs are increasing and convex (decreasing and concave), see Example 4. For more general payoffs we rely on numerical methods, which we illustrate now.

7.1.1 Options

The model risk for options is illustrated in Fig. 3, where we price a Call and a Butterfly. To construct the set Θ, we take estimated parameter values together with their 95% confidence intervals from the literature. While for the Call option the model risk increases monotonically with the initial value, the maximal model risk for the Butterfly is attained when the initial value x lies directly at the maximal payoff.

Fig. 3. This figure shows the solution of the nonlinear Kolmogorov equation for the nonlinear Vasiček model with boundary condition \(f(x)=(x-K)^{+}\) and \(f(x)=(x-K_{1})^{+}-2(x-K_{2})^{+}+(x-K_{3})^{+}\), with \(K=0.1\), \(K_{1}=-0.2\), \(K_{2}=0.3\), and \(K_{3}=0.8\). For each of the figures, the dashed lines show the solution of the Vasiček model with no uncertainty. The upper and lower solid lines show the upper and lower price bounds. The parameters for the computation are given in the table below the figures
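A numerical solution of this kind can be sketched with an explicit, monotone finite-difference scheme. The code below is an illustration under several assumptions that are not taken from the article: the grid, the frozen (Dirichlet) boundary values, the upwind discretisation, and the parameter intervals are all chosen for demonstration only. It uses the fact that, because Θ is a product of intervals and the generator is linear in each parameter, the supremum can be taken separately for the drift and the diffusion term. The lower bound is obtained as minus the upper bound of the negated payoff.

```python
def nonlinear_vasicek_fd(payoff, T, b0, b1, a0, upper=True,
                         x_min=-2.0, x_max=2.0, n=81, steps=200):
    """Explicit scheme for dv/dt = sup_theta[(b0 + b1 x) v_x + 0.5 a0 v_xx],
    v(0, x) = payoff(x). b0, b1, a0 are (lower, upper) pairs."""
    sign = 1.0 if upper else -1.0
    dx = (x_max - x_min) / (n - 1)
    dt = T / steps
    xs = [x_min + i * dx for i in range(n)]
    b_max = (max(abs(b0[0]), abs(b0[1]))
             + max(abs(b1[0]), abs(b1[1])) * max(abs(x_min), abs(x_max)))
    if dt * (b_max / dx + a0[1] / dx ** 2) > 1.0:
        raise ValueError("time step too large for a monotone explicit scheme")
    v = [sign * payoff(x) for x in xs]
    for _ in range(steps):
        new = v[:]  # boundary values stay frozen at the payoff (crude choice)
        for i in range(1, n - 1):
            x = xs[i]
            b_lo = b0[0] + min(b1[0] * x, b1[1] * x)
            b_hi = b0[1] + max(b1[0] * x, b1[1] * x)
            fwd = (v[i + 1] - v[i]) / dx
            bwd = (v[i] - v[i - 1]) / dx
            vxx = (v[i + 1] - 2.0 * v[i] + v[i - 1]) / dx ** 2
            # Upwinded drift term is piecewise linear in b: check endpoints and 0.
            cands = [b_lo, b_hi] + ([0.0] if b_lo < 0.0 < b_hi else [])
            drift = max(max(b, 0.0) * fwd + min(b, 0.0) * bwd for b in cands)
            diffusion = 0.5 * (a0[1] if vxx >= 0.0 else a0[0]) * vxx
            new[i] = v[i] + dt * (drift + diffusion)
        v = new
    return xs, [sign * val for val in v]

call = lambda x: max(x - 0.1, 0.0)
box = dict(b0=(0.0, 0.1), b1=(-0.6, -0.4), a0=(0.01, 0.04))
xs, up = nonlinear_vasicek_fd(call, 1.0, upper=True, **box)
_, lo = nonlinear_vasicek_fd(call, 1.0, upper=False, **box)
model_risk = [u - l for u, l in zip(up, lo)]
```

The difference `model_risk` corresponds to the gap between the upper and lower solid lines in the figure; values near the ends of the grid are affected by the frozen boundary and should not be trusted.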

8 Conclusion

In this paper, we introduced affine processes under uncertainty. This extends the existing class of nonlinear Lévy processes to Markov processes where the interval for the parameter uncertainty may depend on the current state (in an affine way, however). We obtained a dynamic programming principle implying a nonlinear Kolmogorov equation which can be used to price options in a fast and efficient way. Many existing models can be embedded into a setting with parameter uncertainty which we illustrate with a number of examples. However, the nonlinear framework also allows for new model variations which did not exist in the classical approach and we illustrate this with a term-structure Vasiček–CIR model, where the modeller does not need to decide a priori if the state space should include negative rates or not, a strong restriction in existing models. The generalization to higher dimensions or to the case with jumps is left for future research. Here, we concentrated on the conceptual introduction of state-dependent parameter uncertainty and chose the simplest but still highly interesting example for the illustration of our ideas.

9 Appendix 1: Semimartingale comparison

In this section, we recall the main result in Bergenthum and Rüschendorf (2007) on comparison of semimartingales with Markov processes and show that the crucial propagation of order (PO) property is satisfied for a large class of Markov processes, in particular for affine processes. To the best of our knowledge, existing results in the literature require Lipschitz assumptions on the coefficient, which will not hold in our case.

As function class \({\mathcal {F}}\) we will consider increasing and convex functions. For a real-valued Markov process \(S^{*}\) and a terminal time T we define the propagation operator

$${\mathcal{G}}^{g}(t,x) = E\left[g(S^{*}_{T})|S^{*}_{t}=x\right]. $$

Assumption 1

For some function class \({\mathcal {F}}\) and some Markov process \(S^{*}\), we say that PO\((S^{*},{\mathcal {F}})\) holds if \({\mathcal {G}}^{g}(t,\cdot) \in {\mathcal {F}}\) for all \(0\le t\le T\) and for all \(g \in {\mathcal {F}}\).

Propagation of monotonicity and convexity follows in a very elegant way through total positivity of the transition densities of continuous Markov processes.

Proposition 3.1 in Kijima (2002) immediately yields the following result.

Proposition 9

Assume that \(S^{*}\) is a strong Markov process having continuous sample paths and that g is increasing (decreasing). Then, \({\mathcal {G}}^{g}(0,\cdot)\) is increasing (decreasing).

For the propagation of convexity we need an additional step, because the considered processes in Kijima (2002) are in fact martingales. The proof crucially uses the variation-diminishing property of totally positive functions.

Proposition 10

Assume that \(S^{*}\) is a strong Markov process with state space S having continuous sample paths and that there exist \(\pi _{0},\ \pi _{1} \in \mathbb {R}\) with \(\pi_{1}\neq 0\), such that

$$ \begin{aligned} E\left[S^{*}_{T}|S^{*}_{0}=x\right]=\pi_{0} + x \pi_{1}, \qquad x \in S. \end{aligned} $$

Then, for convex (concave) functions g it holds that \({\mathcal {G}}^{g}(0,\cdot)\) is convex (concave).


We modify the first step in Proposition 3.2 in Kijima (2002). In this regard, note that (53) yields that

$$ \begin{aligned} x &= \frac{E\left[S^{*}_{T}|S^{*}_{0}=x\right] - \pi_{0}}{\pi_{1}}. \end{aligned} $$

Moreover, we follow the notation in Kijima (2002) and denote by \(q_{T}(x,y)\) the transition density of the Markov process \(S^{*}\), i.e.,

$$ \begin{aligned} {\mathcal{G}}^{g}(0,x) = E\left[g\left(S^{*}_{T}\right)|S^{*}_{0}=x\right] = \int_{S} g(y) \, q_{T}(x,y) dy. \end{aligned} $$

Let \(\tilde \alpha _{1},~\tilde \alpha _{2} \in \mathbb {R}\). From (54) and (55), we obtain that

$$\begin{aligned} {\mathcal{G}}^{g}(0,x) - \tilde \alpha_{1} x - \tilde \alpha_{2} & =\int_{\mathbb{R}} q_{T}(x,y) \left\{ g(y)- \tilde \alpha_{1} x - \tilde \alpha_{2}\right\} dy \\ &= \int_{\mathbb{R}} q_{T}(x,y) \left\{ g(y)- \frac{\tilde \alpha_{1}}{\pi_{1}} y + \frac{\tilde \alpha_{1}\pi_{0}}{\pi_{1}} -\tilde \alpha_{2}\right\} dy. \end{aligned} $$

Since \( \frac {\tilde \alpha _{1}}{\pi _{1}} y - \frac {\tilde \alpha _{1}\pi _{0}}{\pi _{1}} +\tilde \alpha _{2}=: \alpha _{1} y + \alpha _{2} \) is an affine function of y, the integrand equals \(q_{T}(x,y)\{g(y)-\alpha_{1} y-\alpha_{2}\}\) and the claim follows by repeating the arguments in Proposition 3.2 in Kijima (2002). □

Consider in addition a different Markov process S, possibly on a different probability space, and denote by

$$\begin{aligned} {\mathcal{F}}_{icx}:=\{ f:\mathbb{R} \to \mathbb{R}, \text{increasing and convex} \}. \end{aligned} $$

A combination of Propositions 9 and 10 yields the following result.

Proposition 11

Let \(S^{*}\) be a strong and homogeneous Markov process with continuous sample paths and assume that

$$\begin{aligned} E\left[S^{*}_{t}|S^{*}_{0}=x\right]=\pi_{0}(t) + x \pi_{1}(t), \qquad 0 \le t \le T \end{aligned} $$

holds with \(\pi_{1}(t)\neq 0\) for all \(t\in[0,T]\). Then, PO\((S^{*},{\mathcal {F}}_{icx})\) holds.


Let \(g \in {\mathcal {F}}_{icx}\). First, Proposition 9 yields that \({\mathcal {G}}^{g}(0,\cdot)\) is increasing. As the choice of T was arbitrary, we obtain that also \({\mathcal {G}}^{g} (t,\cdot)\) is increasing by repeating the argument of Proposition 9 and using homogeneity of \(S^{*}\).

Second, Proposition 10 yields that \({\mathcal {G}}^{g}(0,\cdot)\) is convex. Denote

$${\mathcal{H}}^{g}(t,x) = E[g(S^{*}_{t})|S^{*}_{0}=x]. $$

Then, \({\mathcal {G}}^{g}(t,x)={\mathcal {H}}^{g}(T-t,x)\) since the Markov process is homogeneous. Now, Proposition 10 yields that \({\mathcal {H}}^{g}(t,\cdot)\) is convex since the choice of T was arbitrary and the claim follows. □

We remark that for a continuous, one-dimensional affine process X with \(\left [\underline a^{1},\bar a^{1}\right ]=\{0\}\) or \({\mathscr {O}}=\mathbb {R}_{>0}\), affinity of the expectation in (53) is satisfied which follows directly from the exponential-affine structure of the Laplace transform \( E\left [e^{{uX}_{t}}|X_{0}=x\right ]=\exp (\phi (t,u)+\psi (t,u) x)\). Indeed, an application of Theorem 13.2 in Jacod and Protter (2004) yields that

$$E[X_{t}|X_{0}=x] = \partial_{u} \phi(t,0)+ x \partial_{u} \psi(t,0), $$

where we use the explicit expressions for ϕ and ψ from section 10.3.2 in Filipović (2009).
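As a worked illustration (assuming the classical Vasiček case with constant parameters \(a^{1}=0\) and \(b^{1}\neq 0\), which is a choice made here for concreteness), \(X_{t}\) is Gaussian and the Laplace transform is exponential-affine with

$$\psi(t,u) = u e^{b^{1} t}, \qquad \phi(t,u) = u\, b^{0}\, \frac{e^{b^{1} t}-1}{b^{1}} + \frac{u^{2}}{2}\, a^{0}\, \frac{e^{2 b^{1} t}-1}{2 b^{1}}. $$

Differentiating in u at \(u=0\) gives

$$E[X_{t}|X_{0}=x] = b^{0}\, \frac{e^{b^{1} t}-1}{b^{1}} + x\, e^{b^{1} t}, $$

so that \(\pi_{0}(t) = b^{0}(e^{b^{1} t}-1)/b^{1}\) and \(\pi_{1}(t) = e^{b^{1} t} \neq 0\), as required in Proposition 11.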

10 Appendix 2: Comparison results

In this section, we recall the comparison results from Amadori (2007) in our notation. Again, the crucial point for these results is that Lipschitz assumptions on the full domain do not hold. Note that we only consider the one-dimensional, time-homogeneous case here, which simplifies the matter significantly. While minimization is the core topic of Amadori (2007), the financial applications mainly treat maximization, such that we concentrate on the maximization. The stated results follow from the original results by replacing ψ with −ψ.

Fix the state space \({\mathscr {O}}=\mathbb {R}_{>0}\) and consider the controlled diffusion \(X=X^{\theta}\)

$$ \begin{aligned} {dX}_{s} = b(X_{s},\theta_{s}) ds + \sqrt{a(X_{s},\theta_{s})} {dW}_{s}, \quad s>t \end{aligned} $$

with initial condition \(X_{t}=x \in {\mathscr {O}}\). Our application will be in Proposition 4, which considers the nonlinear CIR model. Then, since \(\underline a^{0}=\bar a^{0}=0\), we consider \(\Theta = \left [\underline b^{0},\bar b^{0}\right ] \times \left [\underline b^{1},\bar b^{1}\right ] \times \{0\} \times \left [\underline a^{1}, \bar a^{1}\right ] \) and an adapted process (θs) taking values in Θ. Moreover, for each \(x\in {\mathscr {O}}, \theta \in \Theta, b(x,\theta)=b^{0}+b^{1}x\) and \(\sqrt {a(x,\theta)}=\sqrt {a^{1} x}\) which is not Lipschitz at 0.

Now, fix a continuous function \(\psi :{\mathscr {O}} \to \mathbb {R}\) and define the value function \(v:[0,T]\times {\mathscr {O}}\to \mathbb {R}\) by

$$v(t,x):=\sup_{(\theta_{s})} E_{x,t}[\psi (X_{T})], $$

where the supremum ranges over all adapted processes (θs) taking values in Θ, X is the controlled diffusion satisfying (57), and \(E_{x,t}\) refers to the conditional expectation conditioning on \(X_{t}=x\). The associated Hamilton–Jacobi–Bellman equation is given in (16).

First, Assumption 1 in Amadori (2007) holds. Indeed, note that the functions f and r therein are equal to zero in our case, and that the functions b and \(\sqrt {a}\) are Lipschitz-continuous on \([\epsilon,\infty)\) for all \(\epsilon>0\) and all \(\theta\in\Theta\). Moreover, let \(\parallel \cdot \parallel _{C^{0,1}([\epsilon,\infty))}\) denote the Lipschitz coefficient (see footnote 4) on \([\epsilon,\infty)\), then clearly

$$\sup \{ \parallel \sqrt{a(\cdot,\theta)} \parallel_{C^{0,1}([\epsilon,\infty))}: \theta \in \Theta \} < \infty,$$

and, similarly, for b, so that all conditions of Assumption 1 hold.

Second, Assumption 4 is implied by the Feller condition \(\underline b^{0} \ge \nicefrac {\bar a^{1}} 2>0\). Indeed, Assumption 4 requires that

$$\limsup_{x \to 0} \sup_{\theta \in \Theta}\left(\frac 1 x - \frac{2b(x,\theta)}{a(x,\theta)} \right) < \infty. $$

Since \(\underline b^{0} \ge \nicefrac {\bar a^{1}} 2> 0\) implies that

$$\begin{aligned} \frac 1 x - \frac{2b(x,\theta)}{a(x,\theta)} &= \frac 1 x \left(1- \frac{2b^{0}}{a^{1} }\right) -\frac{2 b^{1}}{a^{1}} \le -\frac{2 b^{1}}{a^{1}}, \end{aligned} $$

Assumption 4 also holds. In our notation, the uniqueness result following immediately from the comparison principle given in Theorem 4 in Amadori (2007) reads as follows. We refer to Section 4 for the definitions of viscosity solutions, super- and subsolutions.

Theorem 2

Assume that \(\underline b^{0} \ge \nicefrac {\bar a^{1}} 2>0\). Let v be a continuous locally bounded viscosity solution of (16).

  1. (i)


    $$ \begin{aligned} \sup_{{\mathscr{O}} \times [0,T]} \frac{|v(t,x)|}{1 + x} < \infty. \end{aligned} $$

    Let u be a locally bounded viscosity solution satisfying (58), then u=v.

  2. (ii)


    $$ \begin{aligned} & \limsup_{x \to \infty} \sup_{t \in [0,T]} \frac{|v(t,x)|}{x^{2}} = 0, \quad {\lim}_{x \to 0} \sup_{t \in [0,T]} \frac{|v(t,x)|}{h_{\epsilon}(x)} = 0. \end{aligned} $$

    Let u be a locally bounded viscosity solution satisfying (59), then u=v.


Recall that a viscosity solution is a supersolution and a subsolution. Since u and v are both viscosity solutions, u=v holds on \({\mathscr {O}}\times \{T\}\). For (i), note that the conditions hold with γ=1 both for u and v, such that applying Theorem 4 in Amadori (2007) twice (once as a supersolution and once as a subsolution) yields u=v on \({\mathscr {O}} \times [0,T]\). The claim (ii) follows similarly. □
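The Feller condition \(\underline b^{0} \ge \nicefrac {\bar a^{1}} 2>0\) and the bound used to verify Assumption 4 can be checked numerically for a given parameter box. The following is a small sketch with hypothetical function names; it assumes \(\underline a^{1}>0\), so that the quotient \(2b/a\) is well defined, and it only tests the corner points of Θ on a finite grid of x values close to 0.

```python
from itertools import product

def feller_condition_holds(b0_lower, a1_upper):
    """Check b0_lower >= a1_upper / 2 > 0."""
    return b0_lower >= a1_upper / 2.0 > 0.0

def assumption4_term(x, b0, b1, a1):
    """Evaluate 1/x - 2 b(x, theta) / a(x, theta) with b = b0 + b1 x, a = a1 x."""
    return 1.0 / x - 2.0 * (b0 + b1 * x) / (a1 * x)

def bound_holds(b0, b1, a1, xs, tol=1e-6):
    """Check term <= -2 b1 / a1 at the corners of the parameter box.

    b0, b1, a1 are (lower, upper) pairs; xs is a grid of positive states.
    """
    for p0, p1, q1 in product(b0, b1, a1):
        bound = -2.0 * p1 / q1
        if any(assumption4_term(x, p0, p1, q1) > bound + tol for x in xs):
            return False
    return True

grid = [10.0 ** (-k) for k in range(1, 5)]
ok = (feller_condition_holds(0.05, 0.08)
      and bound_holds((0.05, 0.1), (-1.0, -0.5), (0.02, 0.08), grid))
```

If the Feller condition fails, for instance with \(\underline b^{0}=0.01\) and \(\bar a^{1}=0.08\), the term \(\frac 1 x (1-2b^{0}/a^{1})\) blows up as \(x \to 0\) and the check returns `False`.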

Change history

  • 26 August 2019

    An error occurred during the publication of an article in Probability, Uncertainty and Quantitative Risk. The article was published in volume 4 with a duplicate citation number.


  1. The weak topology is the topology induced by the bounded continuous functions on Ω. Then, \(\mathfrak {P}(\Omega)\) is a separable metric space and we denote the associated Borel σ-field by \({\mathscr {B}}(\mathfrak {P}(\Omega))\).

  2. This is because C can be constructed as a single process not depending on P; that is, two measures under which X has different diffusion are necessarily singular, see Proposition 6.6 in Neufeld and Nutz (2014), for the construction of C.

  3. Here, \(A \le_{\text{psd}} B\) means \(x^{\top}(B-A)x\ge 0\) for all \(x \in \mathbb {R}^{2}\).

  4. This is the Hölder coefficient for exponent α with α=1, see section 5.1 in Evans (2012).


  • Acciaio, B, Beiglböck, M, Penkner, F, Schachermayer, W: A model-free version of the fundamental theorem of asset pricing and the super-replication theorem. Math. Finance. 26(2), 233–251 (2016).

  • Amadori, AL: Uniqueness and comparison properties of the viscosity solution to some singular HJB equations. Nonlinear Differ. Equ. Appl. NoDEA. 14(3-4), 391–409 (2007).

  • Avellaneda, M, Levy, A, Parás, A: Pricing and hedging derivative securities in markets with uncertain volatilities. Appl. Math. Finance. 2(2), 73–88 (1995).

  • Bannör, KF, Scherer, M: Capturing parameter risk with convex risk measures. Eur. Actuar. J. 3(1), 97–132 (2013).

  • Barrieu, P, Scandolo, G: Assessing financial model risk. Eur. J. Oper. Res. 242(2), 546–556 (2015).

  • Bergenthum, J, Rüschendorf, L: Comparison of semimartingales and Lévy processes. Ann. Probab. 35(1), 228–254 (2007).

  • Biagini, S, Bouchard, B, Kardaras, C, Nutz, M: Robust fundamental theorem for continuous processes. Math. Finance. 27(4), 963–987 (2017).

  • Bielecki, TR, Cialenco, I, Rutkowski, M: Arbitrage-free pricing of derivatives in nonlinear market models. Probab. Uncertain. Quant. Risk. 3(1), 2 (2018).

  • Bouchard, B, Touzi, N: Weak dynamic programming principle for viscosity solutions. SIAM J. Control. Optim. 49(3), 948–962 (2011).

  • Breuer, T, Csiszár, I: Measuring distribution model risk. Math. Financ. 26(2), 395–411 (2016).

  • Carver, L: Negative rates: Dealers struggle to price 0% floors. Risk Mag. (2012).

  • Cont, R: Model uncertainty and its impact on the pricing of derivative instruments. Math. Financ. 16, 519–542 (2006).

  • Costantini, C, Papi, M, D’Ippoliti, F: Singular risk-neutral valuation equations. Financ. Stochast. 16(2), 249–274 (2012).

  • Crandall, MG, Ishii, H, Lions, P-L: User’s guide to viscosity solutions of second order partial differential equations. Bull. Amer. Math. Soc. 27(1), 1–67 (1992).

  • da Fonseca, J, Grasselli, M: Riding on the smiles. Quant. Financ. 11(11), 1609–1632 (2011).

  • Denis, L, Martini, C: A theoretical framework for the pricing of contingent claims in the presence of model uncertainty. Ann. Appl. Probab. 16(2), 827–852 (2006).

  • Denk, R, Kupper, M, Nendel, M: A semigroup approach to nonlinear Lévy processes (2017). arXiv:1710.08130v1.

  • Duffie, D, Filipović, D, Schachermayer, W: Affine processes and applications in finance. Ann. Appl. Probab. 13, 984–1053 (2003).

  • Eberlein, E, Madan, DB, Pistorius, M, Yor, M: Bid and ask prices as non-linear continuous time G-expectations based on distortions. Math. Financ. Econ. 8(3), 265–289 (2014).

  • El Karoui, N, Tan, X: Capacities, measurable selection and dynamic programming part I: Abstract framework (2013a). arXiv:1310.3363v1.

  • El Karoui, N, Tan, X: Capacities, measurable selection and dynamic programming part II: Application in stochastic control problems (2013b). arXiv:1310.3363v1.

  • Epstein, LG, Ji, S: Ambiguous volatility and asset pricing in continuous time. Rev. Financ. Stud. 26(7), 1740–1786 (2013).

  • Evans, LC: Partial differential equations. Grad. Stud. Math. Am. Math. Soc. 19 (2012).

  • Feller, W: Two singular diffusion problems. Ann. Math. 54, 173–182 (1951).

  • Filipović, D: Term Structure Models: A Graduate Course. Springer Verlag, Berlin Heidelberg New York (2009).

  • Fleming, WH, Soner, HM: Controlled Markov Processes and Viscosity Solutions. 2nd edn. Springer, New York (2006).

  • Fouque, J-P, Ren, B: Approximation for option prices under uncertain volatility. SIAM J. Financ. Math. 5(1), 360–383 (2014).

  • Gikhman, I: A short remark on Feller's square root condition (2011). Available on SSRN.

  • Guillaume, F, Schoutens, W: Calibration risk: Illustrating the impact of calibration risk under the Heston model. Rev. Deriv. Res. 15(1), 57–79 (2012).

  • Guo, G, Tan, X, Touzi, N: Tightness and duality of martingale transport on the Skorokhod space. Stochast. Process. Appl. 127(3), 927–956 (2017).

  • Guyon, J, Henry-Labordère, P: Nonlinear option pricing. Chapman and Hall/CRC Financial Mathematics Series (2013).

  • Heider, P: Numerical methods for non-linear Black-Scholes equations. Appl. Math. Financ. 17(1), 59–81 (2010).

  • Heston, S: A closed-form solution for options with stochastic volatility and applications to bond and currency options. Rev. Financ. Stud. 6, 327–343 (1993).

  • Jacod, J, Protter, P: Probability essentials. Springer Verlag Berlin Heidelberg GmbH, Heidelberg (2004).

  • Kallenberg, O: Foundations of modern probability, Probability and its Applications (New York). second edn. Springer-Verlag, New York (2002).

  • Karatzas, I, Shreve, SE: Brownian Motion and Stochastic Calculus. Springer Verlag, Berlin Heidelberg New York (1988).

  • Kijima, M: Monotonicity and convexity of option prices revisited. Math. Financ. 12(4), 411–425 (2002).

  • Madan, DB: Benchmarking in two price financial markets. Ann. Financ. 12(2), 201–219 (2016).

  • Muhle-Karbe, J, Nutz, M: A risk-neutral equilibrium leading to uncertain volatility pricing. Financ. Stochast. 22(2), 281–295 (2018).

  • Neufeld, A, Nutz, M: Measurability of semimartingale characteristics with respect to the probability law. Stochast. Process. Appl. 124(11), 3819–3845 (2014).

  • Neufeld, A, Nutz, M: Nonlinear Lévy processes and their characteristics. Trans. Am. Math. Soc. 369(1), 69–95 (2017).

  • Nutz, M, van Handel, R: Constructing sublinear expectations on path space. Stochast. Process. Appl. 123(8), 3100–3121 (2013).

  • Orlando, G, Mininni, RM, Bufalo, M: A revised approach to CIR short-term interest rates model (2016). Available at SSRN.

  • Patel, J, Russo, V, Fabozzi, FJ: Using the right implied volatility quotes in times of low interest rates: An empirical analysis across different currencies. Financ. Res. Lett. 25, 196–201 (2018).

  • Peng, S: Backward SDE and related g-expectation. In: Backward stochastic differential equations, Vol. 364 of Pitman Res. Notes Math. Ser, pp. 141–159. Longman Scientific & Technical (1997).

  • Peng, S: G-Brownian motion and dynamic risk measure under volatility uncertainty. Lect. Notes (2007a).

  • Peng, S: G-expectation, G-Brownian motion and related stochastic calculus of Itô type. Stochast. Anal. Appl. 2, 541–567 (2007b).

  • Revuz, D, Yor, M: Continuous martingales and Brownian motion. Springer Verlag, Berlin (1999).

  • Russo, V, Fabozzi, FJ: Calibrating short interest rate models in negative rate environments. J. Deriv. 24(4), 80–92 (2017).

  • Vorbrink, J: Financial markets with volatility uncertainty. J. Math. Econ. 53, 64–78 (2014).

  • Wilmott, P, Oztukel, A: Uncertain parameters, an empirical stochastic volatility model and confidence limits. Int. J. Theor. Appl. Financ. 1(1), 175–189 (1998).


We want to thank Tom Bielecki, Sam Cohen, Benedikt Geuchen, Mete Soner, Josef Teichmann, the participants of the ICERM workshop on Robust Methods in Probability and Finance, the participants of the FWZ seminar, and two anonymous referees for helpful comments. The numerical experiments were implemented by Jan Blechschmidt, TU Chemnitz. Financial support from the Carl-Zeiss-Stiftung and the DFG, as well as ETH RiskLab and the NAP Grant from NTU is gratefully acknowledged. We also thank the Freiburg Institute of Advanced Studies (FRIAS) for its hospitality and financial support.


Tolulope Fadina gratefully acknowledges funding from the Carl-Zeiss-Stiftung. Ariel Neufeld gratefully acknowledges funding from the NAP Grant and from RiskLab. Thorsten Schmidt and Jan Blechschmidt gratefully acknowledge funding from the Deutsche Forschungsgemeinschaft. We also thank the Freiburg Institute of Advanced Studies (FRIAS) for its hospitality and financial support.

Availability of data and materials

Not applicable.

Author information

Authors and Affiliations



All authors contributed equally to the paper. The authors are listed in alphabetical order. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Thorsten Schmidt.

Ethics declarations

Competing interests

Not applicable.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

About this article

Cite this article

Fadina, T., Neufeld, A. & Schmidt, T. Affine processes under parameter uncertainty. Probab Uncertain Quant Risk 4, 5 (2019).

Keywords

  • Affine processes
  • Knightian uncertainty
  • Riccati equation
  • Vasiček model
  • Cox–Ingersoll–Ross model
  • Nonlinear Vasiček/CIR model
  • Heston model
  • Itô formula
  • Kolmogorov equation
  • Fully nonlinear PDE
  • Semimartingale