Backward-forward linear-quadratic mean-field games with major and minor agents
Probability, Uncertainty and Quantitative Risk, volume 1, Article number: 8 (2016)
Abstract
This paper studies backward-forward linear-quadratic-Gaussian (LQG) games with major and minor agents (players). The state of the major agent follows a linear backward stochastic differential equation (BSDE), and the states of the minor agents are governed by linear forward stochastic differential equations (SDEs). The major agent is dominating in that its state enters those of the minor agents. On the other hand, all minor agents are individually negligible, but their state-average affects the cost functional of the major agent. The mean-field game in this backward-major and forward-minor setup is formulated to analyze the decentralized strategies. We first derive the consistency condition via an auxiliary mean-field SDE and a 3×2 mixed backward-forward stochastic differential equation (BFSDE) system. Next, we discuss the wellposedness of this BFSDE system by virtue of the monotonicity method. Consequently, we obtain the decentralized strategies for the major and minor agents, which are proved to satisfy the ε-Nash equilibrium property.
Introduction
Recently, the dynamic optimization of (linear) large-population systems has attracted extensive research attention. Its most significant feature is the existence of numerous insignificant agents, denoted by \(\{\mathcal {A}_{i}\}_{i=1}^{N},\) whose dynamics and (or) cost functionals are coupled via their state-average. To design low-complexity strategies for large-population systems, one efficient method is the mean-field game (MFG), which enables us to derive decentralized strategies. Interested readers may refer to Lasry and Lions (2007), Guéant et al. (2010) for the motivation and methodology, and Andersson and Djehiche (2011), Bardi (2012), Bensoussan et al. (2016), Buckdahn et al. (2009a, b, 2010, 2011), Carmona and Delarue (2013), Huang et al. (2006, 2007, 2012), Li and Zhang (2008) for recent progress in MFG theory. Our work considers the following large-population system involving a major agent \(\mathcal {A}_{0}\) and minor agents \(\{\mathcal {A}_{i}\}_{i=1}^{N}\):
and
where \(x^{(N)}(t)=\frac {1}{N}\sum \limits _{i=1}^{N}x_{i}(t)\) is the state-average of all minor agents. Moreover, \(\mathcal {A}_{0}\) and \(\{\mathcal {A}_{i}\}_{1\leq i\leq N}\) can be further coupled via their cost functionals \(J_{0}, J_{i}\) as follows:
Formal assumptions on coefficients of states and costs will be given later. As addressed in (Carmona and Delarue 2013) and (Nourian and Caines 2013), the standard procedure of MFG (without \(\mathcal {A}_{0}\)) mainly consists of the following steps:
(Step i) Approximate the state-average limit \({\lim }_{N\longrightarrow +\infty } x^{(N)}\) by a frozen process \(\bar {x}\), and formulate an auxiliary stochastic control problem for \(\mathcal {A}_{i}\) which is parameterized by \(\bar {x}\).
(Step ii) Solve the above auxiliary stochastic control problem to obtain the decentralized optimal state \(\bar {x}_{i}\) (which should depend on the undetermined process \(\bar {x}\), hence denoted by \(\bar {x}_{i}(\bar {x})\)).
(Step iii) Determine \(\bar {x}\) by the fixed-point argument: \({\lim }_{N\longrightarrow +\infty } \frac {1}{N}\sum _{i=1}^{N}\bar {x}_{i} (\bar {x})=\bar {x}\).
As to the MFG with major-minor agent \((\mathcal {A}_{0}, \mathcal {A}_{i})\), Step (ii) can be further divided into:
(Step ii-a) First, solve the decentralized control problem for \(\mathcal {A}_{0}\) by replacing \(x^{(N)}\) with \(\bar {x}.\) The related decentralized optimal state is denoted by \(\bar {x}_{0}(\bar {x})\) and the optimal control by \(\bar {u}_{0}(\bar {x}).\)
(Step ii-b) Second, given \(\bar {x}_{0}(\bar {x})\) and \(\bar {u}_{0}(\bar {x})\) of \(\mathcal {A}_{0}\), solve the auxiliary stochastic control problem for \(\mathcal {A}_{i}\). The related decentralized states \(\bar {x}_{i}\) for \(\mathcal {A}_{i}\) should depend on \((\bar {x}, \bar {x}_{0}(\bar {x}))\), hence they are denoted by \(\bar {x}_{i}(\bar {x}, \bar {x}_{0}(\bar {x}))\).
(Step iii) is thus revised to fixed-point argument: \({\lim }_{N\longrightarrow +\infty } \frac {1}{N}\sum _{i=1}^{N}\bar {x}_{i} (\bar {x}, \bar {x}_{0}(\bar {x}))=\bar {x}.\)
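The three-step scheme can be illustrated on a deliberately simplified toy problem. The sketch below is not the model of this paper: it assumes minor agents only (no major agent), scalar dynamics \(dx_{i}=(Ax_{i}+Bu_{i})dt+\sigma dW_{i}\), running cost \(Q(x_{i}-\bar{x})^{2}+Ru_{i}^{2}\) with no terminal cost, and hypothetical parameter values. The function name `mfg_fixed_point` and the explicit Euler discretization are illustrative choices. In this toy, the averaged adjoint vanishes at the fixed point, so the consistent mean should be close to \(x\,e^{At}\), which gives a simple check on the iteration.

```python
import numpy as np


def mfg_fixed_point(A=0.2, B=1.0, Q=1.0, R=1.0, T=1.0, x_init=1.0,
                    n_steps=1000, max_iter=200, tol=1e-12):
    """Picard iteration for the frozen mean x_bar in a toy minor-only LQ game.

    All parameter values are hypothetical; they are small enough that the
    map x_bar -> mean of the optimally controlled state is a contraction.
    """
    dt = T / n_steps
    g = B ** 2 / R

    # Riccati equation dP/dt = -(2*A*P - g*P**2 + Q), P(T) = 0,
    # swept backward in time with explicit Euler steps.
    P = np.zeros(n_steps + 1)
    for n in range(n_steps, 0, -1):
        P[n - 1] = P[n] + dt * (2.0 * A * P[n] - g * P[n] ** 2 + Q)
    a = A - g * P  # closed-loop drift coefficient under u = -(B/R) * (P*x + f)

    # Step (i): freeze a guess for the state-average limit.
    x_bar = np.full(n_steps + 1, x_init)
    for _ in range(max_iter):
        # Step (ii): offset f solves df/dt = -a*f + Q*x_bar, f(T) = 0 (backward sweep).
        f = np.zeros(n_steps + 1)
        for n in range(n_steps, 0, -1):
            f[n - 1] = f[n] - dt * (-a[n] * f[n] + Q * x_bar[n])
        # Mean of the closed-loop state: dm/dt = a*m - g*f, m(0) = x_init.
        m = np.empty(n_steps + 1)
        m[0] = x_init
        for n in range(n_steps):
            m[n + 1] = m[n] + dt * (a[n] * m[n] - g * f[n])
        # Step (iii): consistency requires m == x_bar; iterate until the gap is small.
        gap = float(np.max(np.abs(m - x_bar)))
        x_bar = m
        if gap < tol:
            break
    t = np.linspace(0.0, T, n_steps + 1)
    return t, x_bar
```

Calling `t, x_bar = mfg_fixed_point()` returns the time grid and the approximate consistent mean. In the major-minor setting, Step (ii) would additionally carry the major agent's state, which is random, so the deterministic sweeps above would no longer suffice.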
The MFG with major-minor agents has been extensively studied: for example, Huang (2010) discussed MFG with a major agent and heterogeneous minor agents parameterized by finite K classes; Nguyen and Huang (2012) further considered MFG with heterogeneous minor agents parameterized by a continuum index set; Nourian and Caines (2013) studied MFG for nonlinear large-population systems involving major-minor agents; Buckdahn et al. (2014) discussed the MFG with major-minor agents in a weak formulation where the “feedback control against feedback control” strategies are studied.
The modeling novelty of this paper is to consider a major-minor agent system with a backward major agent; namely, the state of \(\mathcal {A}_{0}\) satisfies a backward stochastic differential equation (BSDE):
Unlike a forward SDE with a given initial condition \(x_{0}\), a BSDE has its terminal condition \(\xi\) specified a priori, and its solution is an adapted process pair \((x_{0},z_{0})\). Linear BSDEs were first introduced by Bismut (1978), and the general nonlinear BSDE was first studied in Pardoux and Peng (1990). BSDEs have been applied broadly in many fields such as mathematical economics and finance, decision making and management science. One example is the representation of stochastic differential recursive utility by a class of BSDEs (Duffie and Epstein (1992), El Karoui et al. (1997), Wang and Wu (2009), etc.). A BSDE coupled with an SDE through their terminal conditions forms a forward-backward stochastic differential equation (FBSDE). FBSDEs have also been well studied, and interested readers may refer to Antonelli (1993), Cvitanić and Ma (1996), Hu and Peng (1995), Ma et al. (1994, 2015), Ma and Yong (1999), Peng and Wu (1999), Wu (2013), Yong (1997, 2010), Yong and Zhou (1999), Yu (2012) and the references therein for more details.
The modeling of the major agent by a BSDE and the minor agents by forward SDEs is well motivated and can be illustrated by the following example. In a natural resource exploitation industry, there exist a large number of small exploitation firms \(\{\mathcal {A}_{i}\}_{i=1}^{N}\) which are more aggressive in their business activities. Accordingly, their cost functionals are based on forward SDEs with given initial conditions. Here, these initial conditions can be interpreted as their initial investments or deposits for exploitation licenses. On the other hand, the major agent \(\mathcal {A}_{0}\) acts as a dominating administration party such as a local government or regulation bureau. As the administrator, \(\mathcal {A}_{0}\) is more conservative, hence its state can be modeled by a linear BSDE for which the terminal condition is specified. Such a terminal condition can be interpreted as a future target or objective, such as tax revenue from the exploitation industry or an environmental protection index related to the natural resource.
The modeling of backward-major and forward-minors yields a large-population system with a backward-forward stochastic differential equation (BFSDE), which is structurally different from an FBSDE in the following aspects. First, the forward and backward equations are coupled in their initial instead of terminal conditions. Second, unlike the FBSDE, there is no feasible decoupling structure by the standard Riccati equations, as addressed in Lim and Zhou (2001). This is mainly because some implicit constraints on the initial conditions must be satisfied in any possible decoupling.
The introduction of the BFSDE also brings some technical differences to its MFG studies. First, as addressed in (Step i), the state-average limit of the minor agents is frozen. Then, by (Step ii-a), the optimal state of the major agent should follow a BFSDE system. This is because the major state follows a BSDE, so its adjoint process should be a forward SDE. These two equations are further coupled in their initial conditions. Therefore, we get a BFSDE instead of the classical FBSDE arising in the standard forward-major, forward-minor MFG. Next, as suggested by (Step ii-b), a given minor agent solves an optimal control problem with an augmented state: its own state, the state-average limit, and the optimal state of the major agent from (Step ii-a), which is a BFSDE. The minor agent’s optimal control should involve some feedback of this augmented state. In this way, the minor agent’s optimal state is represented through a coupled system of its own state, the major agent’s state, the state-average limit, and one inhomogeneous equation (which is another BSDE because the state-average limit depends on the major agent’s state and is thus a random process in general). Last, as specified in (Step iii), averaging all individual minor agents’ states should reproduce the state-average limit frozen in (Step i). Consequently, a more complicated consistency condition system has to be derived in our backward-major, forward-minor setup.
Based on the above step scheme, the mean-field LQG games for the backward-major and forward-minor system proceed rather differently from the standard MFG analysis for forward major-minor systems. In particular, the decentralized strategies for the major and minor agents are based on a new consistency condition (see our analysis in Section “The limiting optimal control and NCE equation system”). Accordingly, a stochastic process related to the state of the major player is introduced to approximate the state-average. An auxiliary mean-field SDE and a 3×2 FBSDE system are introduced and analyzed. Here, the 3×2 FBSDE, which is also called a triple FBSDE, comprises three forward and three backward equations. Applying the monotonicity method in Peng and Wu (1999) and Yu (2012), we obtain the wellposedness of this FBSDE. In addition, the decoupling of the backward-forward SDE using the Riccati equation also differs from that of the standard forward-backward SDE. The ε-Nash equilibrium property of the decentralized control strategy with \(\epsilon =O(1/\sqrt N)\) is also derived.
The rest of this paper is organized as follows. Section “Preliminaries and problem formulation” formulates the large-population LQG games of backward-forward systems. In Section “The limiting optimal control and NCE equation system”, the limiting optimal controls of the track systems and the consistency conditions are derived. Section “ε-Nash equilibrium analysis” is devoted to the related ε-Nash equilibrium property. Section “Conclusion and future work” serves as a conclusion to our study.
Preliminaries and problem formulation
Throughout this paper, we denote by \(\mathbb {R}^{m}\) the m-dimensional Euclidean space. Consider a finite time horizon [0,T] for a fixed T>0. Suppose \((\Omega, \mathcal F, \{\mathcal F_{t}\}_{0\leq t\leq T}, P)\) is a complete filtered probability space on which a standard (d+m×N)-dimensional Brownian motion \(\{W_{0}(t),W_{i}(t),\ 1\leq i\leq N\}_{0\leq t\leq T}\) is defined. We define \(\mathcal F^{w_{0}}_{t}:=\sigma \{W_{0}(s), 0\leq s\leq t\}, \mathcal F^{w_{i}}_{t}:=\sigma \{W_{i}(s), 0\leq s\leq t\}, \mathcal {F}^{i}_{t}:=\sigma \{W_{0}(s),W_{i}(s);0\leq s\leq t\}\). Here, \(\{\mathcal F^{w_{0}}_{t}\}_{0\leq t\leq T}\) represents the information of the major player, while \(\{\mathcal F^{w_{i}}_{t}\}_{0\leq t\leq T}\) represents the individual information of the \(i^{th}\) minor player. For a given filtration \(\{\mathcal G_{t}\}_{0\leq t\leq T},\) let \(L^{2}_{\mathcal {G}_{t}}(0, T; \mathbb {R}^{m})\) denote the space of all \(\mathcal {G}_{t}\)-progressively measurable processes with values in \(\mathbb {R}^{m}\) satisfying \(\mathbb {E}{\int _{0}^{T}}|x(t)|^{2}dt<+\infty\); let \(L^{2}(0, T; \mathbb {R}^{m})\) denote the space of all deterministic functions defined on [0,T] in \(\mathbb {R}^{m}\) satisfying \({\int _{0}^{T}}|x(t)|^{2}dt<+\infty\); and let \(C(0,T;\mathbb {R}^{m})\) denote the space of all continuous functions defined on [0,T] in \(\mathbb {R}^{m}\). For simplicity, in what follows we focus on 1-dimensional processes, that is, d=m=1.
Consider a large-population system with (1+N) individual agents, denoted by \(\mathcal {A}_{0}\) and \(\{\mathcal {A}_{i}\}_{1 \leq i \leq N},\) where \(\mathcal {A}_{0}\) stands for the major player, while \(\mathcal {A}_{i}\) stands for the \(i^{th}\) minor player. For the sake of illustration, we restate the states of the major-minor agents as follows, and give the necessary assumptions on the coefficients. The dynamics of \(\mathcal {A}_{0}\) are given by a BSDE as follows:
where \(\xi \in \mathcal {F}^{w_{0}}_{T}\) satisfies \(\mathbb E|\xi |^{2}<+\infty.\) The state of the minor player \(\mathcal {A}_{i}\) satisfies the SDE
where \(x^{(N)}(t)=\frac {1}{N}\sum \limits _{i=1}^{N}x_{i}(t)\) is the state-average of the minor players and \(x_{i0}\) is the initial value of \(\mathcal {A}_{i}\). Here, \(A_{0},B_{0},C_{0},A,B,D,\alpha,\sigma\) are scalar constants. Assume that \(\mathcal F_{t}\) is the augmentation of \(\sigma \{W_{0}(s),W_{i}(s),x_{i0};\ 0\leq s\leq t,\ 1\leq i\leq N\}\) by all the P-null sets of \(\mathcal {F}\), which is the full information accessible to the large-population system up to time t. Let \(U_{i}\), \(i=0,1,2,\ldots,N\), be subsets of \(\mathbb {R}\). The admissible control strategies are \(u_{0}\in \mathcal {U}_{0},u_{i}\in \mathcal {U}_{i}\), where
and
Let \(u=(u_{0},u_{1},\cdots,u_{N})\) denote the set of control strategies of all (1+N) agents; \(u_{-0}=(u_{1},u_{2},\cdots,u_{N})\) the control strategies except that of \(\mathcal {A}_{0}\); and \(u_{-i}=(u_{0},u_{1},\cdots,u_{i-1},u_{i+1},\cdots,u_{N})\) the control strategies except that of the \(i^{th}\) agent \(\mathcal {A}_{i},1\leq i\leq N\). The cost functional for \(\mathcal {A}_{0}\) is given by
where \(Q_{0}\geq 0, \tilde {Q}\geq 0, R_{0}>0, H_{0}\geq 0\). The individual cost functional for \(\mathcal {A}_{i}, 1\leq i\leq N\), is
where \(Q\geq 0, R>0, H\geq 0\).
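The displayed equations (1)–(4) do not appear in this version of the text. The block below is a reconstruction, not a quotation. It assumes the standard linear-quadratic form suggested by the coefficient list \(A_{0},B_{0},C_{0},A,B,D,\alpha,\sigma\), by the weights \(Q_{0},\tilde{Q},R_{0},H_{0},Q,R,H\), by the term \(H_{0}x_{0}^{2}(0)\) mentioned in Remark 2.1, and by the drift terms that appear in the operator \(\mathbb{A}\) in the proof of Theorem 3.1. The factor \(\tfrac12\), the numbering of the minor cost as (4), and the assignment of \(Q_{0}\) to the deviation term and \(\tilde{Q}\) to the absolute term are assumptions.

```latex
% (1) major agent: linear BSDE driven by W_0 (assumed form)
dx_{0}(t) = \big[A_{0}x_{0}(t) + B_{0}u_{0}(t) + C_{0}z_{0}(t)\big]\,dt + z_{0}(t)\,dW_{0}(t),
\qquad x_{0}(T) = \xi ,

% (2) minor agent i: linear forward SDE (assumed form)
dx_{i}(t) = \big[A x_{i}(t) + B u_{i}(t) + D x^{(N)}(t) + \alpha x_{0}(t)\big]\,dt + \sigma\,dW_{i}(t),
\qquad x_{i}(0) = x_{i0} ,

% (3) cost of the major agent (assumed form)
J_{0}(u_{0},u_{-0}) = \tfrac12\,\mathbb{E}\Big[\int_{0}^{T}\Big(Q_{0}\big(x_{0}(t)-x^{(N)}(t)\big)^{2}
  + \tilde{Q}\,x_{0}^{2}(t) + R_{0}u_{0}^{2}(t)\Big)dt + H_{0}x_{0}^{2}(0)\Big],

% (4) cost of minor agent i (assumed form)
J_{i}(u_{i},u_{-i}) = \tfrac12\,\mathbb{E}\Big[\int_{0}^{T}\Big(Q\big(x_{i}(t)-x^{(N)}(t)\big)^{2}
  + R\,u_{i}^{2}(t)\Big)dt + H x_{i}^{2}(T)\Big].
```

Under this reading, the major cost penalizes the initial value \(x_{0}(0)\) because the terminal value is fixed by \(\xi\), while each minor cost penalizes the terminal value \(x_{i}(T)\) as in the usual forward setting.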
Remark 2.1
Unlike Huang (2010), Nguyen and Huang (2012), and Nourian and Caines (2013), the dynamics of the major agent in our work are given by a BSDE with a terminal condition specified a priori. The term \(H_{0}{x_{0}^{2}}(0)\) is thus introduced in (3) to represent some recursive evaluation. One of its practical implications is the initial hedging deposit in the pension fund industry. For the sake of simplicity, behaviors of the major agent (e.g., the government, as presented in the example above) affect the state of the minor agents (which can be understood as numerous individual and negligible firms or producers). Moreover, the major and minor agents are further coupled via the state-average.
Remark 2.2
The cost functional (3) takes a linear combination weighted by \(Q_{0}\) and \(\tilde {Q}.\) In this respect, (3) enables us to represent a trade-off between the absolute quadratic cost \({x^{2}_{0}}(t)\) and the relative quadratic deviation \((x_{0}(t)-x^{(N)}(t))^{2}\). This combination can be interpreted as a balance between the minimization of the major agent's own cost and benchmark index tracking of the minor agents’ average. Moreover, such tracking can be framed into the relative performance setting. Similar work can be found in Espinosa and Touzi (2015), where the relative performance is formulated by a convex combination \(\lambda \left (x_{i}(t)-x^{(N)}(t)\right)^{2}+(1-\lambda) {x^{2}_{0}}(t), \lambda \in [0,1]\).
We introduce the following assumption: (H1) \(\{x_{i0}\}_{i=1}^{N}\) are independent and identically distributed (i.i.d.) with \(\mathbb {E}x_{i0}=x\), \(\mathbb {E}|x_{i0}|^{2}<+\infty,\) and also independent of \(\{W_{0},W_{i},\ 1\leq i\leq N\}\).
It follows that (1) admits a unique solution for all \(u_{0} \in \mathcal {U}_{0}\), (see Pardoux and Peng (1990)). It is also well known that under (H1), (2) admits a unique solution for all \(u_{i} \in \mathcal {U}_{i}, 1\leq i\leq N\). Now, we formulate the large population dynamic optimization problem.
Problem (I). Find a control strategies set \(\bar {u}=(\bar {u}_{0},\bar {u}_{1},\cdots,\bar {u}_{N})\) which satisfies
where \(\bar {u}_{-0}\) represents \((\bar {u}_{1},\bar {u}_{2},\cdots, \bar {u}_{N})\) and \(\bar {u}_{-i}\) represents \((\bar {u}_{0},\bar {u}_{1},\cdots,\bar {u}_{i-1}, \bar {u}_{i+1},\cdots, \bar {u}_{N})\), for 1≤i≤N.
The limiting optimal control and NCE equation system
Combining the major’s state with forcing equation (BSDE with null terminal condition), we naturally have the following formulation of limit representation. To obtain the feedback control and the desired results, we assume \(U_{i}=\mathbb {R}\) for i=0,1,2,…,N.
Suppose \(x^{(N)}(\cdot)\) is approximated by \(\bar {x}(\cdot)\) as \(N\rightarrow +\infty\). Introduce the following auxiliary dynamics of the major and minor players, still denoted by \(x_{0}(\cdot),x_{i}(\cdot)\), respectively:
and
Note that the coefficients \((\bar {A}(\cdot),\bar {B}(\cdot),\bar {C}(\cdot),\tilde {A}(\cdot),\tilde {B}(\cdot),\tilde {C}(\cdot))\in L^{2}(0,T;\mathbb {R}^{6})\) are still to be determined. The associated limiting cost functionals become
and
Thus, we formulate the limiting LQG game (II) as follows.
Problem (II). For the \(i^{th}\) agent \(\mathcal {A}_{i}\), \(i=0,1,2,\cdots,N\), find \(\bar {u}_{i}\in \mathcal {U}_{i}\) satisfying
\(\bar {u}_{i}\) satisfying (9) is called an optimal control for (II).
Remark 3.1
Since \(\bar {x}(t)\) is regarded as the approximating process of the state average \(x^{(N)}(t)\), we replace \(x^{(N)}(t)\) by \(\bar {x}(t)\) in Problem (II). In what follows, (II) is called the limiting problem of (I) as \(N\rightarrow +\infty\). As mentioned at the beginning of this section, we deal with this limiting problem first. Then, we focus on the ε-Nash equilibrium relating (I) and (II), which is the main difference from the usual Nash equilibrium problem.
Remark 3.2
By noting that each minor player’s state \(x_{i}(t)\) in (2) depends on the major player’s state \(x_{0}(t)\) explicitly, we claim that the limiting process \(\bar {x}(t)\) also depends on \(x_{0}(t)\) explicitly. The third process \(k(t)\) is also meaningful: it is a stochastic process that arises in decoupling the Hamilton system, as will be shown later.
Remark 3.3
Since the state-average of the minor players appears only in the cost functional of the major player, the first equation in (5) actually has the same form as (1). However, for completeness, we still write it out.
To get the optimal control of Problem (II), we should obtain the optimal control of \(\mathcal {A}_{0}\) first. We have the following lemma.
Lemma 3.1
Corresponding to the forward-backward system (5) and (7), the optimal control of \(\mathcal {A}_{0}\) for (II) is given by
where the adjoint process \(p_{0}(\cdot)\) and the corresponding optimal trajectory \((\hat {x}_{0}(\cdot),\hat {z}_{0}(\cdot))\) satisfy the following Hamilton system
where \(\theta (\cdot),\bar {\theta }(\cdot)\in L^{2}_{\mathcal {F}^{w_{0}}}(0, T; \mathbb {R})\).
Proof
For the variation of control \(\delta u_{0}(\cdot)\in L^{2}_{\mathcal {F}^{w_{0}}}(0, T; \mathbb {R})\), which is an arbitrary control process such that \(u_{0}(\cdot)=\bar u_{0}(\cdot)+\delta \cdot \delta u_{0}(\cdot)\in L^{2}_{\mathcal {F}^{w_{0}}}(0, T; \mathbb {R})\), introduce the following variational equations:
Applying Itô’s formula to \(p_{0}(t)\delta x_{0}(t)+p(t) \delta \bar {x}(t)+q(t)\delta k(t)\) and noting the associated first-order variation of cost functional:
we obtain the optimal control (10). Combining all state equations and adjoint equations, and applying \(\bar {u}_{0}(\cdot)\) to \(\mathcal {A}_{0}\), we get the Hamilton system (11). □
After obtaining the optimal control of major player \(\mathcal {A}_{0}\), in what follows we aim to get the optimal control \(\bar {u}_{i}\) of minor player \(\mathcal {A}_{i}\), with corresponding optimal trajectory \(\hat {x}_{i}(\cdot)\).
Lemma 3.2
Under (H1), the optimal control of \(\mathcal {A}_{i}\) for (II) is
where the adjoint process \(p_{i}(\cdot)\) and the corresponding optimal trajectory \(\hat {x}_{i}(\cdot)\) satisfy the BSDE
and SDE
Here \(\theta _{0}(\cdot),\theta _{i}(\cdot)\in L^{2}_{\mathcal {F}^{i}}(0, T; \mathbb {R})\); \(\hat {x}_{0}(\cdot)\), and \(\bar {x}(\cdot)\) are given by (11). The proof is similar to that of Lemma 3.1 and omitted. For the coupled BFSDE (14) and (15), we are going to decouple it and try to derive the Nash certainty equivalence (NCE) system satisfied by the decentralized control policy. Then we have the following lemma.
Lemma 3.3
Suppose P(·) is the unique solution of the following Riccati equation
then we obtain the following Hamilton system:
which is a 3×2 FBSDE.
Proof
Suppose
where \(P_{i}(\cdot),f_{i}(\cdot)\) are to be determined. Here, \(P_{i}(\cdot)\) is differentiable and \(f_{i}(\cdot)\) is an Itô process. The terminal condition \(p_{i}(T)= H\hat {x}_{i}(T)\) implies that
Applying Itô’s formula to \(P_{i}(t)\hat {x}_{i}(t)+f_{i}(t)\), we have
Comparing the coefficients with (14), we get \(\theta _{i}(t)=\sigma P_{i}(t)\),
and
Noting that the Riccati Eq. (18) is symmetric, it is well known that (18) admits a unique nonnegative bounded solution \(P_{i}(\cdot)\) (see Ma and Yong (1999)). Further, we get that \(P_{1}(\cdot)=P_{2}(\cdot)=\cdots =P_{N}(\cdot):=P(\cdot)\). Thus, (18) coincides with (16). Besides, for given \(\bar {x}(\cdot),\hat {x}_{0}(\cdot)\in L^{2}_{\mathcal F^{w_{0}}}(0,T; \mathbb {R})\), the linear BSDE (19) admits a unique solution \(f_{i}(\cdot)\in L^{2}_{\mathcal {F}^{w_{0}}}(0, T; \mathbb {R})\). We denote \(f_{i}(\cdot):=f(\cdot)\), \(i=1,2,\cdots,N\).
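The Riccati equation itself is not displayed in this text. As a numerical illustration, the sketch below assumes the scalar form \(\dot{P}+2AP-B^{2}R^{-1}P^{2}+Q=0,\ P(T)=H\). This is the standard equation for a minor agent with drift \(Ax_{i}+Bu_{i}+\cdots\), control weight \(R\), running state weight \(Q\) and terminal weight \(H\), and it is consistent with the feedback coefficient \(A-B^{2}R^{-1}P(t)\) that appears in the proof of Theorem 3.1. The function name `solve_riccati`, the Runge-Kutta discretization, and all parameter values are hypothetical.

```python
import numpy as np


def solve_riccati(A, B, Q, R, H, T, n_steps=2000):
    """Integrate dP/dt = -(2*A*P - (B**2/R)*P**2 + Q) backward from P(T) = H.

    Assumed scalar Riccati form; classical RK4 with step -dt.
    Returns P on the uniform grid t_n = n*T/n_steps, n = 0..n_steps.
    """
    g = B ** 2 / R

    def rhs(p):
        return -(2.0 * A * p - g * p ** 2 + Q)

    dt = T / n_steps
    P = np.empty(n_steps + 1)
    P[-1] = H
    for n in range(n_steps, 0, -1):
        p = P[n]
        # RK4 stages for a step of size -dt (integration runs from T down to 0).
        k1 = rhs(p)
        k2 = rhs(p - 0.5 * dt * k1)
        k3 = rhs(p - 0.5 * dt * k2)
        k4 = rhs(p - dt * k3)
        P[n - 1] = p - dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return P
```

For \(Q\geq 0\), \(R>0\), \(H\geq 0\) the computed solution stays nonnegative and bounded, in line with the statement above. On a long horizon, \(P(0)\) approaches the positive root of \(2AP-B^{2}R^{-1}P^{2}+Q=0\), which gives a convenient sanity check.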
Therefore, the decentralized feedback strategy for \(\mathcal {A}_{i},1\leq i\leq N\) is written as
where \(x_{i}(\cdot)\) is the state of the minor player \(\mathcal {A}_{i}\). Plugging (20) into (2) implies the centralized closed-loop state:
Taking the summation, dividing by N, and letting N→+∞, we get
Comparing the coefficients with the second equation of (5), we have
Then we obtain
Noting the third equation of (5), it follows that
Then (17) is obtained, which completes the proof. □
Remark 3.4
The proof of Lemma 3.3 implies that \(k(\cdot)=f(\cdot)\). Thus \(k(\cdot)\), first introduced in (5), has a specific meaning: it is the forcing term that arises when decoupling (14) and (15).
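Since the displayed steps of the decoupling are missing from this text, the following sketch retraces them under stated assumptions. It assumes the limiting minor dynamics \(d\hat{x}_{i}=[A\hat{x}_{i}+B\bar{u}_{i}+D\bar{x}+\alpha\hat{x}_{0}]dt+\sigma dW_{i}\), the feedback \(\bar{u}_{i}=-BR^{-1}p_{i}\), and the adjoint equation written in the first line below. These forms are reconstructions chosen to agree with the terminal condition \(p_{i}(T)=H\hat{x}_{i}(T)\), the relation \(\theta_{i}=\sigma P_{i}\), and the drift terms of \(\bar{x}\) and \(k\) in the operator \(\mathbb{A}\) of Theorem 3.1; they are not quoted from the original.

```latex
% assumed adjoint equation of minor agent i
dp_{i} = -\big[A p_{i} + Q(\hat{x}_{i}-\bar{x})\big]dt + \theta_{0}\,dW_{0} + \theta_{i}\,dW_{i},
\qquad p_{i}(T) = H\hat{x}_{i}(T).

% ansatz and Ito's formula (closed loop: \bar{u}_i = -B R^{-1}(P\hat{x}_i + f))
p_{i} = P\hat{x}_{i} + f, \qquad
dp_{i} = \Big[\dot{P}\hat{x}_{i} + P\big((A-B^{2}R^{-1}P)\hat{x}_{i} - B^{2}R^{-1}f
          + D\bar{x} + \alpha\hat{x}_{0}\big)\Big]dt + df + \sigma P\,dW_{i}.

% matching the \hat{x}_i terms gives the Riccati equation
\dot{P} + 2AP - B^{2}R^{-1}P^{2} + Q = 0, \qquad P(T) = H.

% the remaining terms give the linear BSDE for the forcing term
df = -\big[(A-B^{2}R^{-1}P)f + (DP-Q)\bar{x} + \alpha P\hat{x}_{0}\big]dt + \theta_{0}\,dW_{0},
\qquad f(T) = 0.

% averaging the closed-loop states over i and letting N -> infinity
d\bar{x} = \big[(A+D-B^{2}R^{-1}P)\bar{x} - B^{2}R^{-1}f + \alpha\hat{x}_{0}\big]dt,
\qquad \bar{x}(0) = x.
```

With \(k=f\), the last two equations have the same structure as the second and third equations of (5), which is how the undetermined coefficients \(\bar{A},\bar{B},\bar{C},\tilde{A},\tilde{B},\tilde{C}\) would be identified under these assumptions.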
To get the wellposedness of (17), we give the following assumption. (H2) \(B_{0}\neq 0,\ H_{0}>0,\ \tilde {Q}>0.\)
Theorem 3.1
Under (H2), FBSDE (17) is uniquely solvable.
Proof
Uniqueness.
It is easily checked that (16) admits a unique nonnegative bounded solution (see Ma and Yong (1999)). For notational convenience, in (17) we denote by \(b(\phi),\sigma(\phi)\) the coefficients of the drift and diffusion terms, respectively, for \(\phi =p_{0},\bar {x},q\), and by \(f(\psi)\) the generator for \(\psi =\hat {x}_{0},p,k\).
Define \(\Delta :=(p_{0},\bar {x},q,\hat {x}_{0},p,k,\hat {z}_{0},\bar {\theta },\theta _{0})\), similar to the notation in (Peng and Wu 1999), we denote by
which implies \(\mathbb {A}(t,\Delta)=\big (A_{0}\hat {x}_{0}-B_{0}^{2}R_{0}^{-1}p_{0}+C_{0}\hat {z}_{0},\ -\left (A+D-B^{2}R^{-1}P(t)\right)p+Q_{0}(\hat {x}_{0}-\bar {x})-\left (Q-DP(t)\right)q,\ \left (-A+B^{2}R^{-1}P(t)\right)k+\left (Q-DP(t)\right)\bar {x}-\alpha P(t)\hat {x}_{0},\ -A_{0}p_{0}-Q_{0}(\hat {x}_{0}-\bar {x})-\tilde {Q}\hat {x}_{0}-\alpha p+\alpha P(t)q,\ \left (A+D-B^{2}R^{-1}P(t)\right)\bar {x}-B^{2}R^{-1}k+\alpha \hat {x}_{0},\ \left (A-B^{2}R^{-1}P(t)\right)q+B^{2}R^{-1}p,\ -C_{0}p_{0},\ 0,\ 0\big).\)
Then for any \(\Delta ^{i}=({p_{0}^{i}},\bar {x}^{i},q^{i},\hat {x}_{0}^{i},p^{i},k^{i},\hat {z}_{0}^{i},\bar {\theta }^{i},{\theta ^{i}_{0}}),i=1,2,\) we have
In the following, we are first going to show that (17) admits at most one adapted solution. Suppose Δ and \(\Delta ^{'}=(p'_{0},\bar {x}',q',\hat {x}'_{0},p',k',\hat {z}^{'}_{0},\bar {\theta }^{'},\theta ^{'}_{0})\) are two solutions of (17). Setting \(\hat {\Delta }=(\hat {p}_{0},\hat {\bar {x}},\hat {q},\hat {\hat {x}}_{0},\hat {p},\hat {k},\hat {\hat {z}}_{0},\hat {\bar {\theta }},\hat {\theta }_{0}) =(p_{0}-p'_{0},\bar {x}-\bar {x}',q-q',\hat {x}_{0}-\hat {x}'_{0},p-p',k-k',\hat {z}_{0}-\hat {z}'_{0},\bar {\theta }-\bar {\theta }^{'},\theta _{0}-\theta ^{'}_{0})\) and applying Itô’s formula to \(\langle \hat {p}_{0},\hat {\hat {x}}_{0}\rangle +\langle \hat {\bar {x}},\hat {p}\rangle +\langle \hat {q},\hat {k}\rangle \), we have
It follows that
By (H2), we get \(\beta _{1}>0\) and \(\beta _{2}>0\). Then \(\hat {p}_{0}(s)\equiv 0\) and \(\hat {\hat {x}}_{0}(s)\equiv 0\). Further, \(\hat {\hat {z}}_{0}(s)\equiv 0\). Applying standard estimates to \(\hat {\bar {x}}(s)\) and \(\hat {k}(s)\), and using Gronwall’s inequality, we obtain \(\hat {\bar {x}}(s)\equiv 0\), \(\hat {k}(s)\equiv 0\) and \(\hat {\theta }_{0}(s)\equiv 0\). Similarly, we have \(\hat {q}(s)\equiv 0\), \(\hat {p}(s)\equiv 0\), and \(\hat {\bar {\theta }}(s)\equiv 0\). Therefore, (17) admits at most one adapted solution.
Existence. In order to prove the existence of the solution, we first consider the following family of FBSDEs parameterized by γ∈[0,1]:
where \((\varphi ^{1},\varphi ^{2},\varphi ^{3},\lambda,\kappa ^{1},\kappa ^{2},\kappa ^{3})\in L^{2}_{\mathcal F^{w_{0}}}(0,T;\mathbb {R}^{7})\), \(a\in L^{2}(\Omega,\mathcal {F}^{w_{0}}_{0},P;\mathbb {R})\). Clearly, when γ=1, the existence of (23) implies that of (17). When γ=0, it is easy to obtain that (23) admits a unique solution (actually, the 2-dim FBSDE is very similar to the Hamiltonian system of (Lim and Zhou 2001)).
If, a priori, for each \(\left (\varphi ^{1},\varphi ^{2},\varphi ^{3},\lambda,\kappa ^{1},\kappa ^{2},\kappa ^{3}\right)\in L^{2}_{\mathcal {F}^{w_{0}}}(0,T;\mathbb {R}^{7})\) and a certain number γ _{0}∈[0,1) there exists a unique tuple \((p_{0}^{\gamma _{0}},\bar {x}^{\gamma _{0}},q^{\gamma _{0}},\hat {x}_{0}^{\gamma _{0}},p^{\gamma _{0}},k^{\gamma _{0}}, \hat {z}_{0}^{\gamma _{0}},\bar {\theta }^{\gamma _{0}},\theta _{0}^{\gamma _{0}})\) of (23), then for each
there exists a unique tuple \(U_{s}\!=(\!P_{0}(s),\bar {X}(s),Q(s),\hat {X}_{0}(s),P(s),K(s),\hat {Z}_{0}(s),\bar {\Theta }(s), \Theta _{0}(s))\in L^{2}_{\mathcal {F}^{w_{0}}_{s}}(0,T; \mathbb {R}^{9})\) satisfying the following FBSDEs
In the following, we aim to prove that the mapping defined by
is a contraction.
Introduce \(u'=(p'_{0},\bar {x}',q',\hat {x}^{'}_{0},p',k',\hat {z}^{'}_{0},\bar {\theta }',\theta ^{'}_{0})\in L^{2}_{\mathcal F^{w_{0}}}(0,T;\mathbb {R}^{9})\), \(U'\times \hat {X}'_{0}(0)=I_{\gamma _{0}+\delta }(u'\times \hat {x}^{'}_{0}(0))\) and set
Applying Itô’s formula to \(\langle \hat {P}_{0},\hat {X}_{0}\rangle +\langle \hat {\bar {X}},\hat {P}\rangle +\langle \hat {Q},\hat {K}\rangle \), we have
On the other hand, since P _{0} and \(P^{\prime }_{0}\) are solutions of SDEs with Itô’s type, applying the usual technique, the estimate for the difference \(\hat {P}_{0}=P_{0}-P'_{0}\) is obtained by
Similarly, estimates for the difference \(\hat {\bar {X}}=\bar {X}-\bar {X}'\) and \(\hat {Q}=Q-Q'\) are given by
and
respectively, for ∀0≤r≤T. In the same way, for the difference of the solutions \((\hat {\hat {X}}_{0},\hat {\hat {Z}}_{0})=(\hat {X}_{0}-\hat {X}^{'}_{0},\hat {Z}_{0}-\hat {Z}^{'}_{0}), (\hat {P},\hat {\bar {\Theta }})=(P-P',\bar {\Theta }-\bar {\Theta }')\) and \((\hat {K},\hat {\Theta }_{0})=(K-K',\Theta _{0}-\Theta ^{'}_{0})\), applying the usual technique to the BSDEs, we have
and
for all \(0\leq r\leq T\). Here the constant \(C_{1}\) depends on the coefficients of (1)–(2), \(P(\cdot)\), \(\beta _{1}\), \(\beta _{2}\), and \(T\). Moreover, \(\gamma _{0}H_{0}+(1-\gamma _{0})\geq \mu\), where \(\mu =\min (1,H_{0})>0\).
Under (H2), combining (25), (27)–(28), (30)–(31), and applying Gronwall’s inequality, we obtain
where \(C_{2}\) depends on \(C_{1}\), \(\mu\), and \(T\). Choosing \(\delta _{0}=\frac {1}{2C_{2}}\), we get that for each fixed \(\delta \in [0,\delta _{0}]\), the mapping \(I_{\gamma _{0}+\delta }\) is a contraction in the sense that
Then it follows that there exists a unique fixed point
which is the solution of (23) for \(\gamma =\gamma _{0}+\delta\). Since \(\delta _{0}\) depends only on \((C_{1},\mu,T)\), we can repeat this process N times with \(1\leq N\delta _{0}<1+\delta _{0}\).
Then it follows that, in particular, as γ=1 corresponding to \({\varphi ^{i}_{t}}\equiv 0,\lambda _{t}\equiv 0,{\kappa ^{i}_{t}}\equiv 0,a=0\ (i=1,2,3)\), (23) admits a unique solution, which implies the wellposedness of (17) (also (11)). The proof is complete. □
Remark 3.5
In what follows, (17) is called the Nash certainty equivalence (NCE) equation system (see Huang (2010) and Huang et al. (2006, 2007, 2012)). By Theorem 3.1, we know that there exists a unique 9-tuple solution \((p_{0},\bar {x},q,\hat {x}_{0},p,k,\hat {z}_{0},\bar {\theta },\theta _{0})\) which can be obtained off-line. Thus, it is equivalent to the fixed-point principle. To the best of our knowledge, this is the first paper to focus on the well-posedness of coupled FBSDEs in large-population problems.
ε-Nash equilibrium analysis
In the above sections, we obtained the optimal control \(\bar {u}_{i}(\cdot), 0\le i\le N\) of Problem (II) through the consistency condition system. Now, we turn to verifying the ε-Nash equilibrium of Problem (I). To start, we present the definition of ε-Nash equilibrium.
Definition 4.1
A set of controls \(u_{k}\in \mathcal {U}_{k},\ 0\leq k\leq N,\) for (N+1) agents is said to satisfy an ε-Nash equilibrium with respect to the costs \(J_{k}\), \(0\leq k\leq N\), if there exists ε≥0 such that for any fixed \(0\leq i\leq N\), we have
when any alternative control \(u^{\prime }_{i}\in \mathcal {U}_{i}\) is applied by \(\mathcal {A}_{i}\).
If ε=0, then Definition 4.1 is reduced to the usual Nash equilibrium. Now, we state the main result of this paper and its proof will be given later.
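The displayed inequality of Definition 4.1 is missing from this text. In the usual formulation of ε-Nash equilibria in the mean-field game literature, which we assume is the one intended here, no agent can lower its cost by more than ε through a unilateral deviation:

```latex
% assumed form of the inequality in Definition 4.1
J_{i}(u_{i},u_{-i}) \;\leq\; J_{i}(u'_{i},u_{-i}) + \varepsilon ,
\qquad \text{for every alternative } u'_{i}\in\mathcal{U}_{i}.
```

Setting ε=0 recovers the exact Nash inequality \(J_{i}(u_{i},u_{-i})\leq J_{i}(u'_{i},u_{-i})\), in agreement with the remark above.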
Theorem 4.1
Under (H1)–(H2), \((\tilde {u}_{0},\tilde {u}_{1},\tilde {u}_{2},\cdots,\tilde {u}_{N})\) satisfies the ε-Nash equilibrium of (I). Here, \(\tilde {u}_{0}\) is given by
where \(p_{0}(\cdot)\) is obtained off-line by (17); while for \(1\le i\le N, \tilde {u}_{i}\) is
where \(\tilde {x}_{i}(\cdot)\), the state trajectory for \(\mathcal {A}_{i}\), satisfies (21).
The proof of the above theorem needs several lemmas which are presented later. Denote by \((\tilde {x}_{0}(\cdot),\tilde {z}_{0}(\cdot))\) the centralized state trajectory and by \((\hat {x}_{0}(\cdot),\hat {z}_{0}(\cdot))\) the decentralized one. Applying \(\tilde {u}_{0}(\cdot)\) to \(\mathcal {A}_{0}\) and using the notations above, it is easy to see that \((\tilde {x}_{0}(\cdot),\tilde {z}_{0}(\cdot))\equiv (\hat {x}_{0}(\cdot),\hat {z}_{0}(\cdot))\). Further, \((\bar {x}(\cdot),k(\cdot))_{\tilde {x}_{0}}=(\bar {x}(\cdot),k(\cdot))_{\hat {x}_{0}}\). Hereafter, for any \(h_{j}(\cdot)\in L^{2}_{\mathcal {F}}(0,T;\mathbb {R}),j=1,2,3\), denote by \((h_{1}(\cdot),h_{2}(\cdot))_{h_{3}}\) the stochastic process pair \((h_{1}(\cdot),h_{2}(\cdot))\) which is determined by \(h_{3}(\cdot)\). The cost functionals for (I) and (II) are given by
and
respectively. For \(\mathcal {A}_{i},1\leq i\leq N\), we have the following closed-loop system
with the cost functional
where \(\tilde {x}^{(N)}(t)=\frac {1}{N}\sum \limits ^{N}_{i=1}\tilde {x}_{i}(t)\). The auxiliary system (limiting problem) is given by
with the cost functional
where \((\bar {x}(t)_{\hat {x}_{0}},k(t)_{\hat {x}_{0}})\) satisfies (17). We have
Lemma 4.1
Proof
By (37), we have
where \( x^{(N)}_{0}:=\frac {1}{N}\sum \limits _{i=1}^{N}x_{i0}\). Noting that
by (17) and Gronwall’s inequality, we obtain (41).
It is easy to get that \(\sup \limits _{0\leq t\leq T}\mathbb {E}\big |\hat {x}_{0}(t)-\bar {x}(t)_{\hat {x}_{0}}\big |^{2}<+\infty \). Applying the Cauchy–Schwarz inequality, we have
In addition, by (10) and (33), we have \(\tilde {u}_{0}(\cdot)=\hat {u}_{0}(\cdot).\) Thus, (42) is obtained. □
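Estimates of this type rest on the fact that the mean-square gap between an empirical state-average and its limit decays like \(1/N\), which is the source of the rate \(\epsilon =O(1/\sqrt N)\). The sketch below illustrates this rate numerically on a simplified stand-in, not on the closed-loop system (37): it assumes uncontrolled scalar dynamics \(dx_{i}=ax_{i}dt+\sigma dW_{i}\) with i.i.d. Gaussian initial values, an Euler-Maruyama discretization, and hypothetical parameter values. The function name `state_average_mse` is illustrative.

```python
import numpy as np


def state_average_mse(n_agents, a=-0.5, sigma=1.0, x_mean=1.0, x_std=0.5,
                      T=1.0, n_steps=50, n_rep=2000, seed=0):
    """Monte Carlo estimate of E|x^(N)(T) - m(T)|^2 for a toy linear SDE.

    Toy stand-in (assumption): dx_i = a*x_i dt + sigma dW_i, i.i.d. initial
    values, with m the mean of the same Euler scheme started from x_mean.
    """
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    # Each row is one independent replication of the N-agent population.
    x = x_mean + x_std * rng.standard_normal((n_rep, n_agents))
    m = x_mean
    for _ in range(n_steps):
        dW = np.sqrt(dt) * rng.standard_normal((n_rep, n_agents))
        x = x + a * x * dt + sigma * dW  # Euler-Maruyama step for every agent
        m = m + a * m * dt               # same scheme for the mean
    state_avg = x.mean(axis=1)           # empirical state-average per replication
    return float(np.mean((state_avg - m) ** 2))
```

Comparing `state_average_mse(50)` with `state_average_mse(200, seed=1)` should give a ratio close to 4, since quadrupling the population divides the mean-square gap by roughly four. In the paper's setting the agents are additionally coupled through the common noise \(W_{0}\) and the state-average itself, which is why Gronwall's inequality is needed on top of this averaging effect.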
For minor agents, we have
Lemma 4.2
Proof
For ∀ 1≤i≤N, applying Gronwall’s inequality, we get (44) from (41), (37) and (39). (45) follows from (44) and (34), obviously. Using the same technique as (43) and noting \(\sup \limits _{0\leq t\leq T}\mathbb {E}\big |\hat {x}_{i}(t)-\bar {x}(t)_{\hat {x}_{0}}\big |^{2}<+\infty,\sup \limits _{0\leq t\leq T}\mathbb {E}\big |\bar {u}_{i}(t)\big |^{2}<+\infty,\sup \limits _{0\leq t\leq T}\mathbb {E}\big |\hat {x}_{i}(t)\big |^{2}<+\infty \), we obtain (46). □
So far, we have derived estimates of the states and costs corresponding to the controls \(\tilde {u}_{i}\) and \(\bar {u}_{i}, 0\le i\le N\). Next, we focus on the ε-Nash equilibrium for (I). Consider a perturbed control \(u_{0} \in \mathcal {U}_{0}\) for \(\mathcal {A}_{0}\) and introduce the dynamics
whereas minor players keep the control \(\tilde {u}_{i},1\leq i\leq N,\) i.e.,
where \(l^{(N)}(t)=\frac {1}{N}\sum \limits _{k=1}^{N}l_{k}(t)\); \(k(t)_{l_{0}}\) associated with \(l_{0}\) satisfies
And for any fixed i, 1≤i≤N, consider a perturbed control \(u_{i} \in \mathcal {U}_{i}\) for \(\mathcal {A}_{i}\), whereas the major and other minor players keep the control \(\tilde {u}_{j},0\leq j\leq N,j\neq i.\) Introduce the dynamics
and for 1≤j≤N, j≠i,
where \(m^{(N)}(t)=\frac {1}{N}\sum \limits _{k=1}^{N}m_{k}(t)\); \(k(t)_{\tilde {x}_{0}}\) satisfies (17) due to \(\tilde {x}_{0}(\cdot)=\hat {x}_{0}(\cdot)\).
If \(\tilde {u}_{j},\ 0\leq j\leq N\) is an ε-Nash equilibrium with respect to the cost \(J_{j}\), it holds that
Then, when making the perturbation, we just need to consider \(u_{j}\in \mathcal {U}_{j}\) such that \(J_{j}(u_{j},\tilde {u}_{-j})\leq J_{j}(\tilde {u}_{j},\tilde {u}_{-j}),\) which implies
In the limiting cost functional \(\bar {J}_{j}\), by the optimality of \((\bar {x}_{j},\bar {u}_{j})\), we get that \((\bar {x}_{j},\bar {u}_{j})\) is \(L^{2}\)-bounded. Then we obtain the boundedness of \(\bar {J}_{j}(\bar {u}_{j})\), i.e.,
where \(C_{3}\) is a positive constant independent of N. Then we have the following proposition.
Proposition 4.1
\(\sup \limits _{0\leq t\leq T}\!\mathbb {E}\big |l_{0}(t)\big |^{2}\), \(\sup \limits _{1\le k\le N}\!\!\left [\sup \limits _{0\leq t\leq T}\mathbb {E}\big |l_{k}(t)\big |^{2} \right ], \sup \limits _{1\le k\le N}\!\!\left [\sup \limits _{0\leq t\leq T}\mathbb {E}\big |m_{k}(t)\big |^{2} \right ]\) are bounded.
Proof
By (52), applying standard BSDE estimates, we get the boundedness of \(\sup \limits _{0\leq t\leq T}\mathbb {E}\big |l_{0}(t)\big |^{2}\). It follows from (48) that
From (50) and (51), it holds that
Here, \(C_{4}\) and \(C_{5}\) are both positive constants. Since \(\sup \limits _{0\leq t\leq T}\mathbb {E}\big |l_{0}(t)\big |^{2}\) is bounded, we get the boundedness of \(\sup \limits _{0\leq t\leq T}\mathbb {E}\big |k(t)_{l_{0}}\big |^{2}\) by (49). It follows from (52) that \(\mathbb {E}|u_{i}(\cdot)|^{2}\) is bounded. Besides, the optimal controls \(\tilde {u}_{k}(\cdot),k\neq i\) are \(L^{2}\)-bounded. Then by Gronwall’s inequality, it follows that
Thus, for any \(1\leq k\leq N, \sup \limits _{0\leq t\leq T}\mathbb {E}|l_{k}(t)|^{2}\) and \(\sup \limits _{0\leq t\leq T}\mathbb {E}|m_{k}(t)|^{2}\) are bounded. Hence the result. □
Correspondingly, the dynamics for agent \(\mathcal {A}_{0}\) under control \(u_{0}\) for (II) is as follows
and for agent \(\mathcal {A}_{i},1\leq i\leq N\),
where \((k(t)_{l'_{0}},\bar {x}(t)_{l'_{0}})\) associated with \(l^{\prime }_{0}\) satisfy
Then we have
Lemma 4.3
Proof
From (47) and (53), by the existence and uniqueness of solutions to BSDEs, for the same perturbed control \(u_{0}(\cdot)\), we have \((l^{'}_{0},q'_{0})=(l_{0},q_{0})\). Further, noting FBSDE (49) and (55), we get \((k(t)_{l^{'}_{0}},\bar {x}(t)_{l'_{0}})=(k(t)_{l_{0}},\bar {x}(t)_{l_{0}})\).
It follows from (48) that
Noting (55) and
and applying Gronwall’s inequality, we get (56). Using the same technique as Lemma 4.1 and noting \(\sup \limits _{0\leq t\leq T}\mathbb {E}\big |l^{'}_{0}(t)-\bar {x}(t)_{l^{'}_{0}}\big |^{2}<+\infty \), we obtain (57). □
Now, we focus on the difference between the states and cost functionals under the perturbed control and under the optimal control of the minor agents. Consider the system of \(\mathcal {A}_{i}\) under control \(u_{i}\) for (II)
and for agent \(\mathcal {A}_{j}, 1\leq j\leq N, j\neq i\),
where \((\bar {x}(t)_{\hat {x}_{0}},k(t)_{\hat {x}_{0}})\) satisfies (17).
In order to give necessary estimates in (I) and (II), we need to introduce some intermediate states as
and for 1≤j≤N, j≠i,
where \(\check {m}^{(N-1)}(t)=\frac {1}{N-1}\sum \limits _{j=1,j\neq i}^{N}\check {m}_{j}(t)\).
Define \(m^{(N-1)}(t):=\frac {1}{N-1}\sum \limits _{j=1,j\neq i}^{N}m_{j}(t)\), \(x^{(N-1)}_{0}:=\frac {1}{N-1}\sum \limits _{j=1,j\neq i}^{N}x_{j0}\). By (51) and (61), we get
and
Then we have the following proposition.
Proposition 4.2
Proof
From (62)–(63), applying Proposition 4.1 and Gronwall’s inequality, the assertion (64) holds. (65) follows from (H1) and the \(L^{2}\)-boundedness of the controls \(u_{i}(\cdot)\) and \(\tilde {u}_{j}(\cdot),j\neq i.\) From (63) and (17), noting \((\bar {x}(t)_{\tilde {x}_{0}},k(t)_{\tilde {x}_{0}},\tilde {x}_{0})=(\bar {x}(t)_{\hat {x}_{0}},k(t)_{\hat {x}_{0}},\hat {x}_{0})\), we get
Therefore (66) is obtained. □
Based on Proposition 4.2, we obtain more direct estimates to prove Theorem 4.1.
Lemma 4.4
For fixed i,1≤i≤N, we have
Proof
(67) follows from Proposition 4.2 directly. From (50) and (58), we get (68) by applying (67). Further, we have
In addition,
Then we have
which implies (69). □
Proof of Theorem 4.1
Now, we consider the ε-Nash equilibrium for \(\mathcal {A}_{0}\) and \(\mathcal {A}_{i},1\leq i\leq N\). Combining (42) and (57), we have
It follows from (46) and (69) that
Thus, Theorem 4.1 follows by taking \(\epsilon =O\left (\frac {1}{\sqrt {N}}\right)\). □
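The meaning of the ε-Nash property can be illustrated on a much simpler static game. The sketch below is a hypothetical toy and not the dynamic game of this paper: player i picks a scalar action and pays \(\frac{1}{2}u_{i}^{2}-u_{i}(\alpha -\beta \bar{u})\), where \(\bar{u}\) is the average action. The cost form, the parameters `alpha` and `beta`, and all function names are assumptions made for illustration. The mean-field action treats the average as a fixed quantity m and imposes the consistency condition \(m=\alpha -\beta m\). The code then measures how much one player can gain by deviating unilaterally in the finite N-player game.

```python
def cost(u_i, others_action, n_players, alpha=1.0, beta=1.0):
    """Cost of player i when every other player uses `others_action`."""
    avg = ((n_players - 1) * others_action + u_i) / n_players
    return 0.5 * u_i ** 2 - u_i * (alpha - beta * avg)


def mean_field_action(alpha=1.0, beta=1.0):
    """Fixed point of m = alpha - beta * m (average treated as exogenous)."""
    return alpha / (1.0 + beta)


def best_response(others_action, n_players, alpha=1.0, beta=1.0):
    """Exact minimizer of the (convex, quadratic) N-player cost in u_i."""
    numerator = alpha - beta * (n_players - 1) * others_action / n_players
    return numerator / (1.0 + 2.0 * beta / n_players)


def deviation_gain(n_players, alpha=1.0, beta=1.0):
    """Smallest epsilon for which the mean-field action is epsilon-Nash."""
    m = mean_field_action(alpha, beta)
    br = best_response(m, n_players, alpha, beta)
    return cost(m, m, n_players, alpha, beta) - cost(br, m, n_players, alpha, beta)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, deviation_gain(n))
```

In this toy game the deviation gain is positive for every finite N and shrinks as N grows, so the mean-field action is an ε-Nash equilibrium with vanishing ε. The rate of decay here is specific to the static toy and should not be compared with the \(O\left(\frac{1}{\sqrt{N}}\right)\) rate of Theorem 4.1.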
Conclusion and future work
In this paper, we have studied mean-field linear-quadratic (LQ) games with major and minor agents in a backward-forward setup. Our work differs from the existing mean-field game literature in the following respects. (1) The major and minor agents are endowed with different objective patterns: the major agent (say, the local government) aims to fulfill some prescribed future target, and thus faces a “backward” LQ problem of minimizing the initial endowment. The minor agents (say, individual producers or firms), on the other hand, still face a family of “forward” LQ problems, but their state-average is affected by the major agent’s state. (2) Accordingly, the state dynamics of the major agent satisfy a backward stochastic differential equation (BSDE), while the minor agents are modeled by (forward) stochastic differential equations (SDEs). (3) To derive the decentralized strategies, the mean-field game is formulated in the backward-forward and major-minor framework. An auxiliary mean-field SDE and a mixed backward-forward stochastic differential equation (BFSDE) are thus introduced and analyzed. An essential feature of the BFSDE, compared to the forward-backward SDE (FBSDE), is that there is no feasible decoupling structure via the traditional Riccati equations. This feature brings some technical difficulties to our analysis and a new structure to our strategies (specifically, the major agent’s strategy is open-loop, whereas the minor agents’ strategies are still closed-loop). (4) In contrast to other mean-field games, the consistency condition is not analyzed directly via fixed-point analysis and contraction mapping. Instead, it is connected to the well-posedness of the mixed BFSDE system and is obtained under some weak monotonicity conditions. The decentralized strategies are also verified to satisfy the ε-Nash equilibrium property, for which some estimates of the BFSDE are applied.
In the future, one possible direction is to let the state-average appear in the dynamics of the major player, which may considerably complicate the proof of the ε-Nash equilibrium property. The well-posedness of the corresponding 3×2 mixed FBSDE system is also worth investigating. Another direction is to model the dynamics of the minor players by BSDEs. In this case, the analysis of the consistency condition may be more complicated, and technical difficulties may arise. Numerical computation and other applications in finance will also be investigated in future work.
References
Andersson, D, Djehiche, B: A maximum principle for SDEs of mean-field type. Appl. Math. Optim. 63, 341–356 (2011).
Antonelli, F: Backward-forward stochastic differential equations. Ann. Appl. Probab. 3, 777–793 (1993).
Bardi, M: Explicit solutions of some linear-quadratic mean field games. Netw. Heterog. Media. 7, 243–261 (2012).
Bensoussan, A, Sung, K, Yam, S, Yung, S: Linear-quadratic mean-field games. J. Optim. Theory Appl. 169, 496–529 (2016).
Bismut, J: An introductory approach to duality in optimal stochastic control. SIAM Rev. 20, 62–78 (1978).
Buckdahn, R, Cardaliaguet, P, Quincampoix, M: Some recent aspects of differential game theory. Dyn. Games Appl. 1, 74–114 (2010).
Buckdahn, R, Djehiche, B, Li, J: A general stochastic maximum principle for SDEs of mean-field type. Appl. Math. Optim. 64, 197–216 (2011).
Buckdahn, R, Djehiche, B, Li, J, Peng, S: Mean-field backward stochastic differential equations: a limit approach. Ann. Probab. 37, 1524–1565 (2009a).
Buckdahn, R, Li, J, Peng, S: Mean-field backward stochastic differential equations and related partial differential equations. Stoch. Process. Appl. 119, 3133–3154 (2009b).
Buckdahn, R, Li, J, Peng, S: Nonlinear stochastic differential games involving a major player and a large number of collectively acting minor agents. SIAM J. Control Optim. 52, 451–492 (2014).
Carmona, R, Delarue, F: Probabilistic analysis of mean-field games. SIAM J. Control Optim. 51, 2705–2734 (2013).
Cvitanić, J, Ma, J: Hedging options for a large investor and forward-backward SDE’s. Ann. Appl. Probab. 6, 370–398 (1996).
Duffie, D, Epstein, L: Stochastic differential utility. Econometrica. 60, 353–394 (1992).
El Karoui, N, Peng, S, Quenez, M: Backward stochastic differential equations in finance. Math. Finance. 7, 1–71 (1997).
Espinosa, G, Touzi, N: Optimal investment under relative performance concerns. Math. Finance. 25, 221–257 (2015).
Guéant, O, Lasry, J-M, Lions, P-L: Mean field games and applications, Paris-Princeton lectures on mathematical finance. Springer, Berlin (2010).
Huang, M: Large-population LQG games involving a major player: the Nash certainty equivalence principle. SIAM J. Control Optim. 48, 3318–3353 (2010).
Huang, M, Caines, P, Malhamé, R: Large-population cost-coupled LQG problems with non-uniform agents: individual-mass behavior and decentralized ε-Nash equilibria. IEEE Trans. Autom. Control. 52, 1560–1571 (2007).
Huang, M, Caines, P, Malhamé, R: Social optima in mean field LQG control: centralized and decentralized strategies. IEEE Trans. Autom. Control. 57, 1736–1751 (2012).
Huang, M, Malhamé, R, Caines, P: Large population stochastic dynamic games: closed-loop McKean-Vlasov systems and the Nash certainty equivalence principle. Commun. Inf. Syst. 6, 221–251 (2006).
Hu, Y, Peng, S: Solution of forward-backward stochastic differential equations. Proba. Theory Rel. Fields. 103, 273–283 (1995).
Lasry, J-M, Lions, P-L: Mean field games. Japan J. Math. 2, 229–260 (2007).
Li, T, Zhang, J: Asymptotically optimal decentralized control for large population stochastic multiagent systems. IEEE Trans. Autom. Control. 53, 1643–1660 (2008).
Lim, E, Zhou, XY: Linear-quadratic control of backward stochastic differential equations. SIAM J. Control Optim. 40, 450–474 (2001).
Ma, J, Protter, P, Yong, J: Solving forward-backward stochastic differential equations explicitly - a four step scheme. Proba. Theory Rel. Fields. 98, 339–359 (1994).
Ma, J, Wu, Z, Zhang, D, Zhang, J: On well-posedness of forward-backward SDEs - a unified approach. Ann. Appl. Probab. 25, 2168–2214 (2015).
Ma, J, Yong, J: Forward-Backward Stochastic Differential Equations and Their Applications. Springer-Verlag, Berlin Heidelberg (1999).
Nguyen, S, Huang, M: Linear-quadratic-Gaussian mixed games with continuum-parametrized minor players. SIAM J. Control Optim. 50, 2907–2937 (2012).
Nourian, M, Caines, P: ε-Nash mean field game theory for nonlinear stochastic dynamical systems with major and minor agents. SIAM J. Control Optim. 51, 3302–3331 (2013).
Pardoux, E, Peng, S: Adapted solution of backward stochastic equation. Syst. Control Lett. 14, 55–61 (1990).
Peng, S, Wu, Z: Fully coupled forward-backward stochastic differential equations and applications to optimal control. SIAM J. Control Optim. 37, 825–843 (1999).
Wang, G, Wu, Z: The maximum principles for stochastic recursive optimal control problems under partial information. IEEE Trans. Autom. Control. 54, 1230–1242 (2009).
Wu, Z: A general maximum principle for optimal control of forward-backward stochastic systems. Automatica. 49, 1473–1480 (2013).
Yong, J: Finding adapted solutions of forward-backward stochastic differential equations: method of continuation. Proba. Theory Rel. Fields. 107, 537–572 (1997).
Yong, J: Optimality variational principle for controlled forward-backward stochastic differential equations with mixed initial-terminal conditions. SIAM J. Control Optim. 48, 4119–4156 (2010).
Yong, J, Zhou, XY: Stochastic Controls: Hamiltonian Systems and HJB Equations. Springer-Verlag, New York (1999).
Yu, Z: Linear-quadratic optimal control and nonzero-sum differential game of forward-backward stochastic system. Asian J. Control. 14, 173–185 (2012).
Acknowledgments
J. Huang acknowledges partial financial support from RGC Grants 502412, 15300514, G-YL04. Z. Wu acknowledges the Natural Science Foundation of China (61573217), 111 project (B12023), the National High-level personnel of special support program and the Chang Jiang Scholar Program of Chinese Education Ministry.
Authors’ contributions
JH carried out the problem formulation and mean-field backward-forward system analysis, and participated in the arguments of approximate Nash equilibrium and fixed-point analysis of certainty principle. SW carried out the deduction of the consistency condition and its wellposedness analysis, and participated in drafting the manuscript. ZW carried out the related maximum principle and decentralized optimality analysis, and participated in the analysis of the consistency condition and its connection to the forward-backward stochastic system. All authors read and approved the final manuscript.
Competing interests
The authors declare that they have no competing interests.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Keywords
- Backward-forward stochastic differential equation (BFSDE)
- Consistency condition
- ε-Nash equilibrium
- Large-population system
- Major-minor agent
- Mean-field game