Open Access

Backward-forward linear-quadratic mean-field games with major and minor agents

Probability, Uncertainty and Quantitative Risk 2016, 1:8

DOI: 10.1186/s41546-016-0009-9

Received: 4 April 2016

Accepted: 12 September 2016

Published: 1 December 2016


This paper studies backward-forward linear-quadratic-Gaussian (LQG) games with major and minor agents (players). The state of the major agent follows a linear backward stochastic differential equation (BSDE), and the states of the minor agents are governed by linear forward stochastic differential equations (SDEs). The major agent is dominating in that its state enters the dynamics of the minor agents. The minor agents, on the other hand, are individually negligible, but their state-average affects the cost functional of the major agent. A mean-field game in this backward-major and forward-minor setup is formulated to analyze decentralized strategies. We first derive the consistency condition via an auxiliary mean-field SDE and a 3×2 mixed backward-forward stochastic differential equation (BFSDE) system. Next, we discuss the wellposedness of this BFSDE system by the monotonicity method. Consequently, we obtain decentralized strategies for the major and minor agents, which are proved to satisfy the ε-Nash equilibrium property.


Backward-forward stochastic differential equation (BFSDE); Consistency condition; ε-Nash equilibrium; Large-population system; Major-minor agent; Mean-field game


Recently, the dynamic optimization of (linear) large-population systems has attracted extensive research attention. The most significant feature of such systems is the existence of numerous individually insignificant agents, denoted by \(\{\mathcal {A}_{i}\}_{i=1}^{N},\) whose dynamics and (or) cost functionals are coupled via their state-average. To design low-complexity strategies for large-population systems, one efficient method is the mean-field game (MFG), which enables us to derive decentralized strategies. Interested readers may refer to Lasry and Lions (2007), Guéant et al. (2010) for the motivation and methodology, and Andersson and Djehiche (2011), Bardi (2012), Bensoussan et al. (2016), Buckdahn et al. (2009a, b, 2010, 2011), Carmona and Delarue (2013), Huang et al. (2006, 2007, 2012), Li and Zhang (2008) for recent progress in MFG theory. Our work considers the following large-population system involving a major agent \(\mathcal {A}_{0}\) and minor agents \(\{\mathcal {A}_{i}\}_{i=1}^{N}\):
$${} \text{major agent}~~\mathcal{A}_{0}: \left\{ \begin{aligned} {dx}_{0}(t)=& \left[A_{0}x_{0}(t)+B_{0}u_{0}(t)+C_{0}z_{0}(t)\right]dt+z_{0}(t){dW}_{0}(t),\\ x_{0}(T)=&\xi, \end{aligned} \right. $$
$$\text{minor agent} ~~~ \mathcal{A}_{i}: \left\{ \begin{aligned} {dx}_{i}(t)\!=& \left[\!{Ax}_{i}(t)\!+{Bu}_{i}(t)+Dx^{(N)}(t)+\alpha x_{0}(t)\!\right]dt\!+\sigma {dW}_{i}(t),\\ x_{i}(0)=& x_{i0}, \end{aligned} \right. $$
where \(x^{(N)}(t)=\frac {1}{N}\sum \limits _{i=1}^{N}x_{i}(t)\) is the state-average of all minor agents. Moreover, \(\mathcal {A}_{0}\) and \(\{\mathcal {A}_{i}\}_{1\leq i\leq N}\) are further coupled via their cost functionals \(J_{0}, J_{i}\) as follows:
$$ \begin{aligned} J_{0}=&\frac{1}{2} \mathbb{E}\left\{ {\int_{0}^{T}}\left[Q_{0}\left(x_{0}(t)-x^{(N)}(t)\right)^{2}+\tilde{Q}{x^{2}_{0}}(t)+R_{0}{u_{0}^{2}}(t)\right]dt+H_{0}{x_{0}^{2}}(0) \right\}, \end{aligned} $$
$$ \begin{aligned} J_{i}=&\frac{1}{2} \mathbb{E}\left\{ {\int_{0}^{T}}\left[Q\left(x_{i}(t)-x^{(N)}(t)\right)^{2}+R{u_{i}^{2}}(t)\right]dt+H{x_{i}^{2}}(T) \right\}. \end{aligned} $$

Formal assumptions on coefficients of states and costs will be given later. As addressed in (Carmona and Delarue 2013) and (Nourian and Caines 2013), the standard procedure of MFG (without \(\mathcal {A}_{0}\)) mainly consists of the following steps:

(Step i) Freeze the state-average limit \({\lim }_{N\longrightarrow +\infty } x^{(N)}\) as a process \(\bar {x}\) and formulate an auxiliary stochastic control problem for \(\mathcal {A}_{i}\) parameterized by \(\bar {x}\).

(Step ii) Solve the above auxiliary stochastic control problem to obtain the decentralized optimal state \(\bar {x}_{i}\) (which should depend on the undetermined process \(\bar {x}\), hence denoted by \(\bar {x}_{i}(\bar {x})\)).

(Step iii) Determine \(\bar {x}\) by the fixed-point argument: \({\lim }_{N\longrightarrow +\infty } \frac {1}{N}\sum _{i=1}^{N}\bar {x}_{i} (\bar {x})=\bar {x}\).

As to the MFG with major-minor agent \((\mathcal {A}_{0}, \mathcal {A}_{i})\), Step (ii) can be further divided into:

(Step ii-a) First, solve the decentralized control problem for \(\mathcal {A}_{0}\) with \(x^{(N)}\) replaced by \(\bar {x}.\) The related decentralized optimal state is denoted by \(\bar {x}_{0}(\bar {x})\) and the optimal control by \(\bar {u}_{0}(\bar {x}).\)

(Step ii-b) Second, given \(\bar {x}_{0}(\bar {x})\) and \(\bar {u}_{0}(\bar {x})\) of \(\mathcal {A}_{0}\), solve the auxiliary stochastic control problem for \(\mathcal {A}_{i}\). The related decentralized state \(\bar {x}_{i}\) of \(\mathcal {A}_{i}\) depends on \((\bar {x}, \bar {x}_{0}(\bar {x}))\) and is hence denoted by \(\bar {x}_{i}(\bar {x}, \bar {x}_{0}(\bar {x}))\).

(Step iii) is thus revised to the fixed-point argument: \({\lim }_{N\longrightarrow +\infty } \frac {1}{N}\sum _{i=1}^{N}\bar {x}_{i} (\bar {x}, \bar {x}_{0}(\bar {x}))=\bar {x}.\)

The MFG with major-minor agents has been extensively studied. For example, Huang (2010) discussed MFG with a major agent and heterogeneous minor agents parameterized by finitely many (K) classes; Nguyen and Huang (2012) further considered MFG with heterogeneous minor agents parameterized by a continuum index set; Nourian and Caines (2013) studied MFG for nonlinear large-population systems involving major-minor agents; Buckdahn et al. (2014) discussed MFG with major-minor agents in a weak formulation, where “feedback control against feedback control” strategies are studied.

The modeling novelty of this paper is to consider a major-minor agent system with a backward major agent, namely, the state of \(\mathcal {A}_{0}\) satisfies a backward stochastic differential equation (BSDE):
$$ \left\{ \begin{aligned} {dx}_{0}(t)=& \left[A_{0}x_{0}(t)+B_{0}u_{0}(t)+C_{0}z_{0}(t)\right]dt+z_{0}(t){dW}_{0}(t),\\ x_{0}(T)=&\xi. \end{aligned} \right. $$

Unlike a forward SDE with a given initial condition \(x_{0}\), a BSDE has its terminal condition ξ specified a priori, and its solution is a pair of adapted processes \((x_{0},z_{0})\). Linear BSDEs were first introduced by Bismut (1978), and general nonlinear BSDEs were first studied in Pardoux and Peng (1990). BSDEs have been applied broadly in many fields such as mathematical economics and finance, decision making and management science. One example is the representation of stochastic differential recursive utility by a class of BSDEs (Duffie and Epstein (1992), El Karoui et al. (1997), Wang and Wu (2009), etc.). A BSDE coupled with an SDE through their terminal conditions forms a forward-backward stochastic differential equation (FBSDE). FBSDEs have also been well studied, and interested readers may refer to Antonelli (1993), Cvitanić and Ma (1996), Hu and Peng (1995), Ma et al. (1994, 2015), Ma and Yong (1999), Peng and Wu (1999), Wu (2013), Yong (1997, 2010), Yong and Zhou (1999), Yu (2012) and the references therein for more details.

The modeling of the major agent by a BSDE and of the minor agents by forward SDEs is well motivated and can be illustrated by the following example. In a natural resource exploitation industry, there exist a large number of small exploitation firms \(\{\mathcal {A}_{i}\}_{i=1}^{N}\) which are more aggressive in their business activities. Accordingly, their cost functionals are based on forward SDEs with given initial conditions. Here, these initial conditions can be interpreted as their initial investments or deposits for exploitation licenses. On the other hand, the major agent \(\mathcal {A}_{0}\) acts as a dominating administrative party such as a local government or regulatory bureau. As the administrator, \(\mathcal {A}_{0}\) is more conservative, hence its state can be modeled by a linear BSDE for which the terminal condition is specified. Such a terminal condition can be interpreted as a future target or objective, such as tax revenue from the exploitation industry or an environmental protection index related to the natural resource.

The modeling of a backward major agent and forward minor agents yields a large-population system described by a backward-forward stochastic differential equation (BFSDE), which is structurally different from an FBSDE in the following aspects. First, the forward and backward equations are coupled in their initial instead of terminal conditions. Second, unlike the FBSDE case, there is no feasible decoupling structure by the standard Riccati equations, as addressed in Lim and Zhou (2001). This is mainly because some implicit constraints on the initial conditions must be satisfied in any possible decoupling.

The introduction of the BFSDE also brings some technical differences to the MFG analysis. First, as addressed in (Step i), the state-average limit of the minor agents is frozen. Then, by (ii-a), the optimal state of the major agent follows a BFSDE system: the major state follows a BSDE, so its adjoint process is a forward SDE, and these two equations are coupled in their initial conditions. Therefore, we obtain a BFSDE instead of the classical FBSDE arising in the standard forward major-forward minor MFG. Next, as suggested by (ii-b), a given minor agent solves an optimal control problem with an augmented state consisting of its own state, the state-average limit, and the optimal state of the major agent from (ii-a), which is a BFSDE. The minor agent’s optimal control should involve some feedback of this augmented state. In this way, the minor agent’s optimal state is represented through a coupled system of its own state, the major agent’s state, the state-average limit, and one inhomogeneous equation (which is another BSDE, because the state-average limit depends on the major agent’s state and is therefore a random process in general). Last, as specified in (iii), averaging the states of all individual minor agents should reproduce the state-average limit frozen in (i). Consequently, a more complicated consistency condition system must be derived in our backward major-forward minor setup.

Based on the above scheme, the mean-field LQG games for the backward-major and forward-minor system proceed rather differently from the standard MFG analysis for forward major-minor systems. In particular, the decentralized strategies for the major and minor agents are based on a new consistency condition (see our analysis in Section “The limiting optimal control and NCE equation system”). Accordingly, a stochastic process related to the state of the major player is introduced to approximate the state-average. An auxiliary mean-field SDE and a 3×2 FBSDE system are introduced and analyzed. Here, the 3×2 FBSDE, which is also called a triple FBSDE, comprises three forward and three backward equations. Applying the monotonicity method in Peng and Wu (1999) and Yu (2012), we obtain the wellposedness of this FBSDE. In addition, the decoupling of the backward-forward SDE using a Riccati equation is also different from that of the standard forward-backward SDE. The ε-Nash equilibrium property of the decentralized control strategy with \(\epsilon =O(1/\sqrt N)\) is also derived.

The rest of this paper is organized as follows. Section “Preliminaries and problem formulation” formulates the large-population LQG games of backward-forward systems. In Section “The limiting optimal control and NCE equation system”, the limiting optimal controls of the track systems and the consistency conditions are derived. Section “ε-Nash equilibrium analysis” is devoted to the related ε-Nash equilibrium property. Section “Conclusion and future work” serves as a conclusion to our study.

Preliminaries and problem formulation

Throughout this paper, we denote by \(\mathbb {R}^{m}\) the m-dimensional Euclidean space. Consider a finite time horizon [0,T] for a fixed T>0. Suppose \((\Omega, \mathcal F, \{\mathcal F_{t}\}_{0\leq t\leq T}, P)\) is a complete filtered probability space on which a standard \((d+m\times N)\)-dimensional Brownian motion \(\{W_{0}(t),W_{i}(t),\ 1\leq i\leq N\}_{0\leq t\leq T}\) is defined. We define \(\mathcal F^{w_{0}}_{t}:=\sigma \{W_{0}(s), 0\leq s\leq t\}, \mathcal F^{w_{i}}_{t}:=\sigma \{W_{i}(s), 0\leq s\leq t\}, \mathcal {F}^{i}_{t}:=\sigma \{W_{0}(s),W_{i}(s);0\leq s\leq t\}\). Here, \(\{\mathcal F^{w_{0}}_{t}\}_{0\leq t\leq T}\) represents the information of the major player, while \(\{\mathcal F^{w_{i}}_{t}\}_{0\leq t\leq T}\) represents the individual information of the \(i\)th minor player. For a given filtration \(\{\mathcal G_{t}\}_{0\leq t\leq T},\) let \(L^{2}_{\mathcal {G}_{t}}(0, T; \mathbb {R}^{m})\) denote the space of all \(\mathcal {G}_{t}\)-progressively measurable processes with values in \(\mathbb {R}^{m}\) satisfying \(\mathbb {E}{\int _{0}^{T}}|x(t)|^{2}dt<+\infty\); let \(L^{2}(0, T; \mathbb {R}^{m})\) denote the space of all deterministic functions defined on [0,T] with values in \(\mathbb {R}^{m}\) satisfying \({\int _{0}^{T}}|x(t)|^{2}dt<+\infty\); and let \(C(0,T;\mathbb {R}^{m})\) denote the space of all continuous functions defined on [0,T] with values in \(\mathbb {R}^{m}\). For simplicity, in what follows we focus on 1-dimensional processes, that is, d=m=1.

Consider a large-population system with (1+N) individual agents, denoted by \(\mathcal {A}_{0}\) and \(\{\mathcal {A}_{i}\}_{1 \leq i \leq N},\) where \(\mathcal {A}_{0}\) stands for the major player and \(\mathcal {A}_{i}\) stands for the \(i\)th minor player. For the sake of illustration, we restate the state equations of the major and minor agents and give the necessary assumptions on the coefficients. The dynamics of \(\mathcal {A}_{0}\) are given by the following BSDE:
$$ \left\{ \begin{aligned} {dx}_{0}(t)=& \left[A_{0}x_{0}(t)+B_{0}u_{0}(t)+C_{0}z_{0}(t)\right]dt+z_{0}(t){dW}_{0}(t),\\ x_{0}(T)=& \xi, \end{aligned} \right. $$
where \(\xi \in \mathcal {F}^{w_{0}}_{T}\) satisfies \(\mathbb E|\xi |^{2}<+\infty.\) The state of the minor player \(\mathcal {A}_{i}\) satisfies the SDE
$$ \left\{ \begin{aligned} {dx}_{i}(t)=& \left[{Ax}_{i}(t)+{Bu}_{i}(t)+Dx^{(N)}(t)+\alpha x_{0}(t)\right]dt+\sigma {dW}_{i}(t),\\ x_{i}(0)=& x_{i0}, \end{aligned} \right. $$
where \(x^{(N)}(t)=\frac {1}{N}\sum \limits _{i=1}^{N}x_{i}(t)\) is the state-average of the minor players and \(x_{i0}\) is the initial value of \(\mathcal {A}_{i}\). Here, \(A_{0},B_{0},C_{0},A,B,D,\alpha,\sigma\) are scalar constants. Assume that \(\mathcal F_{t}\) is the augmentation of \(\sigma \{W_{0}(s),W_{i}(s),x_{i0};\ 0\leq s\leq t,\ 1\leq i\leq N\}\) by all the P-null sets of \(\mathcal {F}\), which is the full information accessible to the large-population system up to time t. Let \(U_{i}\), i=0,1,2,…,N, be subsets of \(\mathbb {R}\). The admissible control strategies are \(u_{0}\in \mathcal {U}_{0},u_{i}\in \mathcal {U}_{i}\), where
$$\mathcal{U}_{0}:=\left\{u_{0}\big|u_{0}(t)\in U_{0},0\leq t\leq T;\ u_{0}(\cdot)\in L^{2}_{\mathcal{F}^{w_{0}}_{t}}(0, T; \mathbb{R})\right\}, $$
$$\mathcal{U}_{i}:=\left\{u_{i}\big|u_{i}(t)\in U_{i},0\leq t\leq T;\ u_{i}(\cdot)\in L^{2}_{\mathcal{F}_{t}}(0, T; \mathbb{R})\right\},\ 1\leq i \leq N. $$
Let \(u=(u_{0},u_{1},\cdots,u_{N})\) denote the set of control strategies of all (1+N) agents; \(u_{-0}=(u_{1},u_{2},\cdots,u_{N})\) the control strategies except that of \(\mathcal {A}_{0}\); and \(u_{-i}=(u_{0},u_{1},\cdots,u_{i-1},u_{i+1},\cdots,u_{N})\) the control strategies except that of the \(i\)th agent \(\mathcal {A}_{i},1\leq i\leq N\). The cost functional for \(\mathcal {A}_{0}\) is given by
$$ \begin{aligned} J_{0}(u_{0}(\cdot), u_{-0}(\cdot))=&\frac{1}{2} \mathbb{E}\left\{ {\int_{0}^{T}}\left[Q_{0}\left(x_{0}(t)-x^{(N)}(t)\right)^{2}+\tilde{Q}{x^{2}_{0}}(t)+R_{0}{u_{0}^{2}}(t)\right]\right.\\ &\left.dt+H_{0}{x_{0}^{2}}(0){\vphantom{ {\int_{0}^{T}}\left[Q_{0}\left(x_{0}(t)-x^{(N)}(t)\right)^{2}+\tilde{Q}{x^{2}_{0}}(t)+R_{0}{u_{0}^{2}}(t)\right]}} \right\}, \end{aligned} $$
where \(Q_{0}\geq 0, \tilde {Q}\geq 0, R_{0}>0, H_{0}\geq 0\). The individual cost functional for \(\mathcal {A}_{i}, 1\leq i\leq N\), is
$$ \begin{aligned} J_{i}(u_{i}(\cdot), u_{-i}(\cdot))=&\frac{1}{2} \mathbb{E}\left\{ {\int_{0}^{T}}\left[Q\left(x_{i}(t)-x^{(N)}(t)\right)^{2}+R{u_{i}^{2}}(t)\right]dt+H{x_{i}^{2}}(T) \right\}, \end{aligned} $$

where \(Q\geq 0, R>0, H\geq 0\).

Remark 2.1

Unlike (Huang 2010; Nguyen and Huang 2012; Nourian and Caines 2013), the dynamics of the major agent in our work are given by a BSDE with an a priori terminal condition. The term \(H_{0}{x_{0}^{2}}(0)\) is thus introduced in (3) to represent some recursive evaluation. One of its practical implications is the initial hedging deposit in the pension fund industry. For the sake of simplicity, the behavior of the major agent (e.g., the government, as presented in the example above) affects the states of the minor agents (which can be understood as numerous individually negligible firms or producers). Moreover, the major and minor agents are further coupled via the state-average.

Remark 2.2

The cost functional (3) takes a linear combination weighted by \(Q_{0}\) and \(\tilde {Q}.\) In this way, (3) represents a trade-off between the absolute quadratic cost \({x^{2}_{0}}(t)\) and the relative quadratic deviation \((x_{0}(t)-x^{(N)}(t))^{2}\). This combination can be interpreted as a balance between the minimization of the major agent’s own cost and benchmark index tracking of the minor agents’ average. Moreover, such tracking can be framed in the relative performance setting. Similar work can be found in Espinosa and Touzi (2015), where the relative performance is formulated by a convex combination \(\lambda \left (x_{i}(t)-x^{(N)}(t)\right)^{2}+(1-\lambda) {x^{2}_{0}}(t), \lambda \in [0,1]\).

We introduce the following assumption: (H1) \(\{x_{i0}\}_{i=1}^{N}\) are independent and identically distributed (i.i.d.) with \(\mathbb {E}x_{i0}=x\), \(\mathbb {E}|x_{i0}|^{2}<+\infty,\) and are also independent of \(\{W_{0},W_{i},\ 1\leq i\leq N\}\).

It follows that (1) admits a unique solution for all \(u_{0} \in \mathcal {U}_{0}\), (see Pardoux and Peng (1990)). It is also well known that under (H1), (2) admits a unique solution for all \(u_{i} \in \mathcal {U}_{i}, 1\leq i\leq N\). Now, we formulate the large population dynamic optimization problem.

Problem (I). Find a set of control strategies \(\bar {u}=(\bar {u}_{0},\bar {u}_{1},\cdots,\bar {u}_{N})\) which satisfies
$$J_{i}(\bar{u}_{i}(\cdot),\bar{u}_{-i}(\cdot))=\inf_{u_{i}\in \mathcal{U}_{i}}J_{i}(u_{i}(\cdot),\bar{u}_{-i}(\cdot)),\ 0\leq i\leq N, $$
where \(\bar {u}_{-0}\) represents \((\bar {u}_{1},\bar {u}_{2},\cdots, \bar {u}_{N})\) and \(\bar {u}_{-i}\) represents \((\bar {u}_{0},\bar {u}_{1},\cdots,\bar {u}_{i-1}, \bar {u}_{i+1},\cdots, \bar {u}_{N})\), for \(1\leq i\leq N\).

The limiting optimal control and NCE equation system

Combining the major agent’s state with a forcing equation (a BSDE with null terminal condition), we arrive at the following limiting formulation. To obtain the feedback control and the desired results, we assume \(U_{i}=\mathbb {R}\) for i=0,1,2,…,N.

Suppose \(x^{(N)}(\cdot)\) is approximated by \(\bar {x}(\cdot)\) as \(N\rightarrow +\infty\). Introduce the following auxiliary dynamics of the major and minor players, still denoted by \(x_{0}(\cdot),x_{i}(\cdot)\), respectively:
$$ \left\{ \begin{aligned} &{dx}_{0}(t)= \left[A_{0}x_{0}(t)+B_{0}u_{0}(t)+C_{0}z_{0}(t)\right]dt+z_{0}(t){dW}_{0}(t),\\ &x_{0}(T)= \xi,\\ &d\bar{x}(t)=\left[\bar{A}(t)\bar{x}(t)+\bar{B}(t)x_{0}(t)+\bar{C}(t)k(t)\right]dt,\\ &\bar{x}(0)=x,\\ &dk(t)=\left[\tilde{A}(t)k(t)+\tilde{B}(t)\bar{x}(t)+\tilde{C}(t)x_{0}(t)\right]dt+\theta(t){dW}_{0}(t),\\ &k(T)=0 \end{aligned} \right. $$
$$ \left\{ \begin{aligned} {dx}_{i}(t)=& \left[{Ax}_{i}(t)+{Bu}_{i}(t)+D\bar{x}(t)+\alpha x_{0}(t)\right]dt+\sigma {dW}_{i}(t),\\ x_{i}(0)=& x_{i0}. \end{aligned} \right. $$
Note that the coefficients \((\bar {A}(\cdot),\bar {B}(\cdot),\bar {C}(\cdot),\tilde {A}(\cdot),\tilde {B}(\cdot),\tilde {C}(\cdot))\in L^{2}(0,T;\mathbb {R}^{6})\) are still to be determined. The associated limiting cost functionals become
$$ \begin{aligned} \bar{J}_{0}(u_{0}(\cdot))=&\frac{1}{2} \mathbb{E}\left\{ {\int_{0}^{T}}\left[Q_{0}\left(x_{0}(t)-\bar{x}(t)\right)^{2}+\tilde{Q}{x^{2}_{0}}(t)+R_{0}{u_{0}^{2}}(t)\right]dt+H_{0}{x_{0}^{2}}(0) \right\} \end{aligned} $$
$$ \begin{aligned} \bar{J}_{i}(u_{i}(\cdot))=&\frac{1}{2} \mathbb{E}\left\{ {\int_{0}^{T}}\left[Q\left(x_{i}(t)-\bar{x}(t)\right)^{2}+R{u_{i}^{2}}(t)\right]dt+H{x_{i}^{2}}(T) \right\}. \end{aligned} $$

Thus, we formulate the limiting LQG game (II) as follows.

Problem (II). For the \(i\)th agent \(\mathcal {A}_{i}\), \(i=0,1,2,\cdots,N\), find \(\bar {u}_{i}\in \mathcal {U}_{i}\) satisfying
$$ \bar{J}_{i}(\bar{u}_{i}(\cdot))=\inf_{u_{i}\in \mathcal{U}_{i}}\bar{J}_{i}(u_{i}(\cdot)). $$

\(\bar {u}_{i}\) satisfying (9) is called an optimal control for (II).

Remark 3.1

Since \(\bar {x}(t)\) is regarded as an approximation of the state-average \(x^{(N)}(t)\), we replace \(x^{(N)}(t)\) by \(\bar {x}(t)\) in Problem (II). In what follows, (II) is called the limiting problem of (I) as \(N\rightarrow +\infty\). As mentioned at the beginning of this section, we deal with this limiting problem first. Then, we focus on the ε-Nash equilibrium relating (I) and (II), which is the main difference from the usual Nash equilibrium problem.

Remark 3.2

Since each minor player’s state \(x_{i}(t)\) in (2) depends explicitly on the major player’s state \(x_{0}(t)\), the limiting process \(\bar {x}(t)\) also depends explicitly on \(x_{0}(t)\). The third process k(t) is also meaningful: it is a stochastic process that arises in decoupling the Hamilton system, as shown below.

Remark 3.3

Since the state-average of the minor players appears only in the cost functional of the major player, the first equation in (5) has the same form as (1). For completeness, however, we still write it out.

To get the optimal control of Problem (II), we should obtain the optimal control of \(\mathcal {A}_{0}\) first. We have the following lemma.

Lemma 3.1

Corresponding to the forward-backward system (5) and (7), the optimal control of \(\mathcal {A}_{0}\) for (II) is given by
$$ \bar{u}_{0}(t)=-B_{0}R_{0}^{-1}p_{0}(t), $$
where the adjoint process \(p_{0}(\cdot)\) and the corresponding optimal trajectory \((\hat {x}_{0}(\cdot),\hat {z}_{0}(\cdot))\) satisfy the following Hamilton system
$$ \left\{ \begin{aligned} &d\hat{x}_{0}(t)= \left[A_{0}\hat{x}_{0}(t)-{B_{0}^{2}}R_{0}^{-1}p_{0}(t)+C_{0}\hat{z}_{0}(t)\right]dt+\hat{z}_{0}(t){dW}_{0}(t),\\ &d\bar{x}(t)=\left[\bar{A}(t)\bar{x}(t)+\bar{B}(t)\hat{x}_{0}(t)+\bar{C}(t)k(t)\right]dt,\\ &dk(t)=\left[\tilde{A}(t)k(t)+\tilde{B}(t)\bar{x}(t)+\tilde{C}(t)\hat{x}_{0}(t)\right]dt+\theta(t){dW}_{0}(t),\\ &{dp}_{0}(t)=\left[-A_{0}p_{0}(t)-Q_{0}(\hat{x}_{0}(t)-\bar{x}(t))-\tilde{Q}\hat{x}_{0}(t)-\bar{B}(t)p(t)-\tilde{C}(t)q(t)\right]dt\\ &\qquad\qquad-C_{0}p_{0}(t){dW}_{0}(t),\\ &dp(t)=\left[-\bar{A}(t)p(t)+Q_{0}(\hat{x}_{0}(t)-\bar{x}(t))-\tilde{B}(t)q(t)\right]dt+\bar{\theta}(t){dW}_{0}(t),\\ &dq(t)=\left(-\tilde{A}(t)q(t)-\bar{C}(t)p(t)\right)dt,\\ &\hat{x}_{0}(T)= \xi,\ \bar{x}(0)=x,\ k(T)=0,\ p_{0}(0)=-H_{0}\hat{x}_{0}(0),\ p(T)=0, \ q(0)=0, \end{aligned} \right. $$

where \(\theta (\cdot),\bar {\theta }(\cdot)\in L^{2}_{\mathcal {F}^{w_{0}}}(0, T; \mathbb {R})\).


Proof

For a variation of control \(\delta u_{0}(\cdot)\in L^{2}_{\mathcal {F}^{w_{0}}}(0, T; \mathbb {R})\), that is, an arbitrary control process such that \(u_{0}(\cdot)=\bar u_{0}(\cdot)+\delta \cdot \delta u_{0}(\cdot)\in L^{2}_{\mathcal {F}^{w_{0}}}(0, T; \mathbb {R})\), introduce the following variational equations:
$$ \left\{ \begin{aligned} &d\delta x_{0}(t)= \left[A_{0}\delta x_{0}(t)+B_{0} \delta u_{0}(t)+C_{0}\delta z_{0}(t)\right]dt+\delta z_{0}(t){dW}_{0}(t),\\ &d\delta\bar{x}(t)=\left[\bar{A}(t)\delta\bar{x}(t)+\bar{B}(t)\delta x_{0}(t)+\bar{C}(t)\delta k(t)\right]dt,\\ &d\delta k(t)=\left[\tilde{A}(t)\delta k(t)+\tilde{B}(t)\delta\bar{x}(t)+\tilde{C}(t)\delta x_{0}(t)\right]dt+\delta\theta(t){dW}_{0}(t),\\ &\delta x_{0}(T)= 0,\ \delta\bar{x}(0)=0,\ \delta k(T)=0. \end{aligned} \right. $$
Applying Itô’s formula to \(p_{0}(t)\delta x_{0}(t)+p(t) \delta \bar {x}(t)+q(t)\delta k(t)\) and noting the associated first-order variation of cost functional:
$$\begin{aligned} 0=\,&\delta\bar{J}_{0}(\bar{u}_{0}):=\frac{d}{d\delta}\bar J_{0}(\bar u_{0}+\delta\cdot\delta u_{0})|_{\delta=0}\\ =\,&\mathbb{E}\left\{ {\int_{0}^{T}}\left[Q_{0}\left(\hat{x}_{0}(t)-\bar{x}(t)\right)\left(\delta x_{0}(t)-\delta\bar{x}(t)\right)+\tilde{Q}\hat{x}_{0}(t)\delta x_{0}(t)+R_{0}\bar{u}_{0}(t)\delta u_{0}(t)\right]\right.\\ &\left.dt+H_{0}\hat{x}_{0}(0)\delta x_{0}(0){\vphantom{{\int_{0}^{T}}\left[Q_{0}\left(\hat{x}_{0}(t)-\bar{x}(t)\right)\left(\delta x_{0}(t)-\delta\bar{x}(t)\right)+\tilde{Q}\hat{x}_{0}(t)\delta x_{0}(t)+R_{0}\bar{u}_{0}(t)\delta u_{0}(t)\right]}}\right\}, \end{aligned} $$
we obtain the optimal control (10). Combining all state equations and adjoint equations, and applying \(\bar {u}_{0}(\cdot)\) to \(\mathcal {A}_{0}\), we get the Hamilton system (11). □
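The duality step in the proof can be written out as follows. This is a sketch reconstructed from the variational equations and the adjoint equations of the Hamilton system stated above; time arguments are suppressed, the diffusion coefficients are abbreviated by \((\cdots)\), and the stochastic integrals are assumed to vanish in expectation. Itô’s formula gives
$$\begin{aligned} d\left[p_{0}\,\delta x_{0}\right]=&\left[B_{0}p_{0}\,\delta u_{0}-\left(Q_{0}(\hat{x}_{0}-\bar{x})+\tilde{Q}\hat{x}_{0}\right)\delta x_{0}-\bar{B}p\,\delta x_{0}-\tilde{C}q\,\delta x_{0}\right]dt+(\cdots){dW}_{0},\\ d\left[p\,\delta\bar{x}\right]=&\left[\bar{B}p\,\delta x_{0}+\bar{C}p\,\delta k+Q_{0}(\hat{x}_{0}-\bar{x})\delta\bar{x}-\tilde{B}q\,\delta\bar{x}\right]dt+(\cdots){dW}_{0},\\ d\left[q\,\delta k\right]=&\left[\tilde{B}q\,\delta\bar{x}+\tilde{C}q\,\delta x_{0}-\bar{C}p\,\delta k\right]dt+(\cdots){dW}_{0}. \end{aligned} $$
Summing the three lines, integrating over [0,T], taking expectations, and using \(\delta x_{0}(T)=0\), \(\delta \bar{x}(0)=0\), \(\delta k(T)=0\), \(p(T)=0\), \(q(0)=0\) and \(p_{0}(0)=-H_{0}\hat{x}_{0}(0)\), we get
$$\mathbb{E}\left[H_{0}\hat{x}_{0}(0)\delta x_{0}(0)\right]=\mathbb{E}{\int_{0}^{T}}\left[B_{0}p_{0}\,\delta u_{0}-Q_{0}(\hat{x}_{0}-\bar{x})(\delta x_{0}-\delta\bar{x})-\tilde{Q}\hat{x}_{0}\,\delta x_{0}\right]dt. $$
Substituting this identity into the first-order variation of the cost functional leaves
$$0=\delta\bar{J}_{0}(\bar{u}_{0})=\mathbb{E}{\int_{0}^{T}}\left[B_{0}p_{0}+R_{0}\bar{u}_{0}\right]\delta u_{0}\,dt $$
for every admissible \(\delta u_{0}(\cdot)\), which is consistent with \(\bar{u}_{0}(t)=-B_{0}R_{0}^{-1}p_{0}(t)\).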

After obtaining the optimal control of major player \(\mathcal {A}_{0}\), in what follows we aim to get the optimal control \(\bar {u}_{i}\) of minor player \(\mathcal {A}_{i}\), with corresponding optimal trajectory \(\hat {x}_{i}(\cdot)\).

Lemma 3.2

Under (H1), the optimal control of \(\mathcal {A}_{i}\) for (II) is
$$ \bar{u}_{i}(t)=-BR^{-1}p_{i}(t), $$
where the adjoint process \(p_{i}(\cdot)\) and the corresponding optimal trajectory \(\hat {x}_{i}(\cdot)\) satisfy the BSDE
$$ \left\{ \begin{aligned} {dp}_{i}(t)=& \left[-{Ap}_{i}(t)-Q\left(\hat{x}_{i}(t)-\bar{x}(t)\right)\right]dt+\theta_{0}(t){dW}_{0}(t)+\theta_{i}(t){dW}_{i}(t),\\ p_{i}(T)=& H\hat{x}_{i}(T) \end{aligned} \right. $$
and SDE
$$ \left\{ \begin{aligned} d\hat{x}_{i}(t)=& \left[A\hat{x}_{i}(t)-B^{2}R^{-1}p_{i}(t)+D\bar{x}(t)+\alpha \hat{x}_{0}(t)\right]dt+\sigma {dW}_{i}(t),\\ \hat{x}_{i}(0)=& x_{i0}. \end{aligned} \right. $$

Here \(\theta _{0}(\cdot),\theta _{i}(\cdot)\in L^{2}_{\mathcal {F}^{i}}(0, T; \mathbb {R})\), and \(\hat {x}_{0}(\cdot)\) and \(\bar {x}(\cdot)\) are given by (11). The proof is similar to that of Lemma 3.1 and is omitted. We now decouple the coupled BFSDE (14) and (15) and derive the Nash certainty equivalence (NCE) system satisfied by the decentralized control policy. We have the following lemma.

Lemma 3.3

Suppose P(·) is the unique solution of the following Riccati equation
$$ \left \{ \begin{aligned} &\dot{P}(t)+2AP(t)-B^{2}R^{-1}P^{2}(t)+Q=0,\\ &P(T)=H, \end{aligned} \right. $$
then we obtain the following Hamilton system:
$${} \left\{ \begin{aligned} &d\hat{x}_{0}(t)= \left[A_{0}\hat{x}_{0}(t)-{B_{0}^{2}}R_{0}^{-1}p_{0}(t)+C_{0}\hat{z}_{0}(t)\right]dt+\hat{z}_{0}(t){dW}_{0}(t),\\ &d\bar{x}(t)=\left[\big(A+D-B^{2}R^{-1}P(t)\big)\bar{x}(t)-B^{2}R^{-1}k(t)+\alpha\hat{x}_{0}(t)\right]dt,\\ &dk(t)=\left[\left(-A+B^{2}R^{-1}P(t)\right)k(t)+\left(Q-DP(t)\right)\bar{x}(t)-\alpha P(t)\hat{x}_{0}(t)\right]\\&\quad\quad\quad\quad dt+\theta_{0}(t){dW}_{0}(t),\\ &{dp}_{0}(t)=\left[-A_{0}p_{0}(t)-Q_{0}(\hat{x}_{0}(t)-\bar{x}(t))-\tilde{Q}\hat{x}_{0}(t)-\alpha p(t)+\alpha P(t)q(t)\right]dt\\ &\qquad\qquad-C_{0}p_{0}(t){dW}_{0}(t),\\ &dp(t)=\!\left[\,-\,\left(\!A+D-B^{2}R^{-1}P(t)\!\right)\!p(t)+\!Q_{0}(\hat{x}_{0}(t)\!-\bar{x}(t))-\!(Q\,-\,DP(t))q(t)\!\right]dt\\ &\qquad\qquad+\bar{\theta}(t){dW}_{0}(t),\\ &dq(t)=\left[\left(A-B^{2}R^{-1}P(t)\right)q(t)+B^{2}R^{-1}p(t)\right]dt,\\ &\hat{x}_{0}(T)= \xi,\ \bar{x}(0)=x,\ k(T)=0,\ p_{0}(0)=-H_{0}\hat{x}_{0}(0),\ p(T)=0, \ q(0)=0, \end{aligned} \right. $$

which is a 3×2 FBSDE.


Proof

We seek the adjoint process in the form
$$p_{i}(t)=P_{i}(t)\hat{x}_{i}(t)+f_{i}(t),~1\leq i\leq N, $$
where \(P_{i}(\cdot),f_{i}(\cdot)\) are to be determined. Here, \(P_{i}(\cdot)\) is differentiable and \(f_{i}(\cdot)\) is an Itô process. The terminal condition \(p_{i}(T)= H\hat {x}_{i}(T)\) implies that
$$P_{i}(T)=H,~ f_{i}(T)=0. $$
Applying Itô’s formula to \(P_{i}(t)\hat {x}_{i}(t)+f_{i}(t)\), we have
$$\begin{aligned} {dp}_{i}(t)=&\left[\dot{P}_{i}(t)+{AP}_{i}(t)-B^{2}R^{-1}{P^{2}_{i}}(t)\right]\hat{x}_{i}(t)dt\\ &+ \left[{DP}_{i}(t)\bar{x}(t)-B^{2}R^{-1}P_{i}(t)f_{i}(t)+\alpha P_{i}(t)\hat{x}_{0}(t)\right]dt\\&+{df}_{i}(t)+\sigma P_{i}(t){dW}_{i}(t). \end{aligned} $$
Comparing the coefficients with (14), we get \(\theta_{i}(t)=\sigma P_{i}(t)\),
$$ \left \{ \begin{aligned} &\dot{P}_{i}(t)+2{AP}_{i}(t)-B^{2}R^{-1}{P^{2}_{i}}(t)+Q=0,\\ &P_{i}(T)=H \end{aligned} \right. $$
$$ \left \{ \begin{aligned} {df}_{i}(t)=&\left[\left(-A+B^{2}R^{-1}P_{i}(t)\right)f_{i}(t)+\left(Q-{DP}_{i}(t)\right)\bar{x}(t)-\alpha P_{i}(t)\hat{x}_{0}(t)\right]dt\\ &+\theta_{0}(t){dW}_{0}(t),\\ f_{i}(T)=&0. \end{aligned} \right. $$

Since the Riccati Eq. (18) is symmetric, it is well known that (18) admits a unique nonnegative bounded solution \(P_{i}(\cdot)\) (see (Ma and Yong 1999)). Since the coefficients of (18) do not depend on i, we further get \(P_{1}(\cdot)=P_{2}(\cdot)=\cdots=P_{N}(\cdot):=P(\cdot)\). Thus, (18) coincides with (16). Besides, for given \(\bar {x}(\cdot),\hat {x}_{0}(\cdot)\in L^{2}_{\mathcal F^{w_{0}}}(0,T; \mathbb {R})\), the linear BSDE (19) admits a unique solution \(f_{i}(\cdot)\in L^{2}_{\mathcal {F}^{w_{0}}}(0, T; \mathbb {R})\). We denote \(f_{i}(\cdot):=f(\cdot)\), \(i=1,2,\cdots,N\).

Therefore, the decentralized feedback strategy for \(\mathcal {A}_{i},1\leq i\leq N\) is written as
$$ u_{i}(t)=-BR^{-1}\left(P(t)x_{i}(t)+f(t)\right), $$
where \(x_{i}(\cdot)\) is the state of minor player \(\mathcal {A}_{i}\). Plugging (20) into (2) yields the centralized closed-loop state:
$${} \left\{ \begin{aligned} \!{dx}_{i}(t)\!=& \left[\!\left(A\,-\,B^{2}R^{-1}\!P(t)\!\right)x_{i}(t)\,-\,B^{2}R^{-1}f\!(t)\,+\,Dx^{(N)}(t)+\alpha x_{0}(t)\!\right]\!dt\!+\sigma {dW}_{i}(t),\\ x_{i}(0)=& x_{i0}. \end{aligned} \right. $$
Taking the summation, dividing by N, and letting \(N\rightarrow +\infty\), we get
$$ \left\{ \begin{aligned} d\bar{x}(t)=& \left[\left(A+D-B^{2}R^{-1}P(t)\right)\bar{x}(t)-B^{2}R^{-1}f(t)+\alpha x_{0}(t)\right]dt,\\ \bar{x}(0)=& x. \end{aligned} \right. $$
Comparing the coefficients with the second equation of (5), we have
$$ \begin{aligned} \bar{A}(\cdot)=A+D-B^{2}R^{-1}P(\cdot),~\bar{B}(\cdot)=\alpha,~ \bar{C}(\cdot)=-B^{2}R^{-1},~ k(\cdot)=f(\cdot). \end{aligned} $$
Then we obtain
$$\left \{ \begin{aligned} dk(t)=&\left[\left(-A+B^{2}R^{-1}P(t)\right)k(t)+\left(Q-DP(t)\right)\bar{x}(t)-\alpha P(t)x_{0}(t)\right]\\&\quad dt+\theta_{0}(t){dW}_{0}(t),\\ k(T)=&0. \end{aligned} \right. $$
Noting the third equation of (5), it follows that
$$ \begin{aligned} \tilde{A}(\cdot)=-A+B^{2}R^{-1}P(\cdot),~\tilde{B}(\cdot)=Q-DP(\cdot),~\tilde{C}(\cdot)=-\alpha P(\cdot),\ \ \theta(\cdot)=\theta_{0}(\cdot). \end{aligned} $$

Then (17) is obtained, which completes the proof. □
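As a numerical illustration of the Riccati equation (16) used in the proof above, the following sketch integrates \(\dot{P}(t)=-2AP(t)+B^{2}R^{-1}P^{2}(t)-Q\) backward from \(P(T)=H\) with a classical fourth-order Runge-Kutta scheme. The function name `solve_riccati`, the grid size, and the parameter values are assumptions made only for illustration and are not taken from the paper. For A=0, B=R=Q=1, H=0 the equation reduces to \(\dot{P}=P^{2}-1\), \(P(T)=0\), whose solution is \(P(t)=\tanh (T-t)\); this gives a convenient check.

```python
import math


def solve_riccati(A, B, R, Q, H, T, n_steps=1000):
    """Integrate P' + 2*A*P - (B**2/R)*P**2 + Q = 0 with P(T) = H backward in time.

    Returns (ts, Ps): the uniform grid on [0, T] and the values P(ts[k]).
    Illustrative RK4 sketch; names and defaults are hypothetical.
    """
    dt = T / n_steps

    def rhs(P):
        # dP/dt obtained by rearranging the Riccati equation
        return -2.0 * A * P + (B * B / R) * P * P - Q

    Ps = [0.0] * (n_steps + 1)
    Ps[n_steps] = H  # terminal condition P(T) = H
    for k in range(n_steps, 0, -1):
        P = Ps[k]
        # One RK4 step with negative step size -dt (from t_k to t_{k-1})
        k1 = rhs(P)
        k2 = rhs(P - 0.5 * dt * k1)
        k3 = rhs(P - 0.5 * dt * k2)
        k4 = rhs(P - dt * k3)
        Ps[k - 1] = P - dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    ts = [k * dt for k in range(n_steps + 1)]
    return ts, Ps


if __name__ == "__main__":
    ts, Ps = solve_riccati(A=0.0, B=1.0, R=1.0, Q=1.0, H=0.0, T=1.0)
    err = max(abs(P - math.tanh(1.0 - t)) for t, P in zip(ts, Ps))
    print(f"max deviation from tanh(T - t): {err:.2e}")
```

In this test case the computed solution stays nonnegative and bounded on [0,T], in line with the solvability statement quoted from Ma and Yong (1999).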

Remark 3.4

The proof of Lemma 3.3 implies that k(·)=f(·). Thus, k(·), which is first introduced in (5), has some specific meaning that it is indeed a force function when decoupling (14) and (15).
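Steps (i) to (iii) can be tried numerically in a simplified case. The sketch below assumes α=0 (no influence of the major agent), so that \(\bar{x}\) and k=f are deterministic and the equations for \(\bar{x}\) and k in (17) reduce to \(\dot{\bar{x}}=(A+D-B^{2}R^{-1}P)\bar{x}-B^{2}R^{-1}k\), \(\bar{x}(0)=x\), and \(\dot{k}=(-A+B^{2}R^{-1}P)k+(Q-DP)\bar{x}\), \(k(T)=0\). It runs a plain Picard iteration with explicit Euler steps: freeze \(\bar{x}\), solve k backward, update \(\bar{x}\) forward, and repeat. The function names, the parameter values, and the iteration itself are illustrative assumptions; the paper establishes wellposedness by the monotonicity method, not by this iteration, and the iteration should only be expected to converge for short horizons.

```python
import math


def riccati(A, B, R, Q, H, T, n):
    """Explicit Euler solve of P' = -2*A*P + (B**2/R)*P**2 - Q backward from P(T) = H."""
    dt = T / n
    P = [0.0] * (n + 1)
    P[n] = H
    for j in range(n, 0, -1):
        P[j - 1] = P[j] - dt * (-2.0 * A * P[j] + (B * B / R) * P[j] ** 2 - Q)
    return P


def nce_picard(A, B, R, Q, H, D, x0_mean, T, n=2000, iters=100, tol=1e-10):
    """Picard iteration for the deterministic (alpha = 0) consistency system.

    Returns (xbar, k, gap), where gap is the last sup-norm update of xbar.
    Hypothetical helper written for illustration only.
    """
    dt = T / n
    g = B * B / R
    P = riccati(A, B, R, Q, H, T, n)
    xbar = [x0_mean] * (n + 1)  # step (i): initial guess for the frozen mean
    k = [0.0] * (n + 1)
    gap = float("inf")
    for _ in range(iters):
        # step (ii): backward sweep for the forcing term k given the frozen xbar
        k = [0.0] * (n + 1)
        for j in range(n, 0, -1):
            drift = (-A + g * P[j]) * k[j] + (Q - D * P[j]) * xbar[j]
            k[j - 1] = k[j] - dt * drift
        # step (iii): forward sweep for the mean implied by k
        new = [x0_mean] * (n + 1)
        for j in range(n):
            drift = (A + D - g * P[j]) * new[j] - g * k[j]
            new[j + 1] = new[j] + dt * drift
        gap = max(abs(a - b) for a, b in zip(new, xbar))
        xbar = new
        if gap < tol:
            break
    return xbar, k, gap


if __name__ == "__main__":
    xbar, k, gap = nce_picard(A=0.0, B=1.0, R=1.0, Q=1.0, H=0.0,
                              D=0.5, x0_mean=1.0, T=0.5)
    print(f"xbar(T) = {xbar[-1]:.4f}, exp(D*T) = {math.exp(0.25):.4f}, gap = {gap:.1e}")
```

For the parameters in the usage example (A=0, H=0), one can check by hand that \(\bar{x}(t)=xe^{Dt}\) and \(k(t)=-P(t)\bar{x}(t)\) solve the reduced system, so the computed \(\bar{x}(T)\) should agree with \(e^{DT}\) up to discretization error.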

To get the wellposedness of (17), we give the following assumption. (H2) \(B_{0}\neq 0,\ H_{0}>0,\ \tilde {Q}>0.\)

Theorem 3.1

Under (H2), FBSDE (17) is uniquely solvable.



Proof

It is easily checked that (16) admits a unique nonnegative bounded solution (see (Ma and Yong 1999)). For notational convenience, in (17) we denote by b(ϕ),σ(ϕ) the coefficients of the drift and diffusion terms, respectively, for \(\phi =p_{0},\bar {x},q\), and by f(ψ) the generator for \(\psi =\hat {x}_{0},p,k\).

Define \(\Delta :=(p_{0},\bar {x},q,\hat {x}_{0},p,k,\hat {z}_{0},\bar {\theta },\theta _{0})\). Following the notation in (Peng and Wu 1999), we set
$$\mathbb{A}(t,\Delta):=\left(-f(\hat{x}_{0}),-f(p),-f(k),b(p_{0}),b(\bar{x}),b(q),\sigma(p_{0}),\sigma(\bar{x}),\sigma(q)\right), $$

which implies
$$\begin{aligned} \mathbb{A}(t,\Delta)=\Big(&A_{0}\hat{x}_{0}-{B_{0}^{2}}R_{0}^{-1}p_{0}+C_{0}\hat{z}_{0},\\ &-\left(A+D-B^{2}R^{-1}P(t)\right)p+Q_{0}(\hat{x}_{0}-\bar{x})-\left(Q-DP(t)\right)q,\\ &\left(-A+B^{2}R^{-1}P(t)\right)k+\left(Q-DP(t)\right)\bar{x}-\alpha P(t)\hat{x}_{0},\\ &-A_{0}p_{0}-Q_{0}(\hat{x}_{0}-\bar{x})-\tilde{Q}\hat{x}_{0}-\alpha p+\alpha P(t)q,\\ &\left(A+D-B^{2}R^{-1}P(t)\right)\bar{x}-B^{2}R^{-1}k+\alpha\hat{x}_{0},\\ &\left(A-B^{2}R^{-1}P(t)\right)q+B^{2}R^{-1}p,\ -C_{0}p_{0},\ 0,\ 0\Big). \end{aligned} $$

Then for any \(\Delta ^{i}=({p_{0}^{i}},\bar {x}^{i},q^{i},\hat {x}_{0}^{i},p^{i},k^{i},\hat {z}_{0}^{i},\bar {\theta }^{i},{\theta ^{i}_{0}}),i=1,2,\) we have
$$\begin{aligned} &\langle\mathbb{A}(t,\Delta^{1})-\mathbb{A}(t,\Delta^{2}),\Delta^{1}-\Delta^{2}\rangle\\ =&-{B_{0}^{2}}R_{0}^{-1}({p_{0}^{1}}-{p_{0}^{2}})^{2}-Q_{0}\left[(\bar{x}^{1}-\bar{x}^{2})-(\hat{x}_{0}^{1}-\hat{x}_{0}^{2})\right]^{2}-\tilde{Q}(\hat{x}_{0}^{1}-\hat{x}_{0}^{2})^{2}\\ \leq &-{B_{0}^{2}}R_{0}^{-1}({p_{0}^{1}}-{p_{0}^{2}})^{2}-\tilde{Q}(\hat{x}_{0}^{1}-\hat{x}_{0}^{2})^{2}\\ :=&-\beta_{1}({p_{0}^{1}}-{p_{0}^{2}})^{2}-\beta_{2}(\hat{x}_{0}^{1}-\hat{x}_{0}^{2})^{2}. \end{aligned} $$
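The cancellation behind this identity can be checked numerically. The following is a minimal sketch, assuming scalar coefficients frozen at a fixed time t (so that P stands for the value P(t)); the function names, the dictionary keys, and the random test values are illustrative choices and are not part of the paper.

```python
import random

def A_op(delta, c):
    """Nine components of A(t, Delta) for scalar data at a frozen time t.

    delta = (p0, xbar, q, x0hat, p, k, z0hat, theta_bar, theta0)
    c     = dict of (hypothetical) coefficient values; c["Qt"] stands for Q-tilde.
    """
    p0, xbar, q, x0, p, k, z0, _theta_bar, _theta0 = delta
    g = c["B"] ** 2 / c["R"]        # B^2 R^{-1}
    g0 = c["B0"] ** 2 / c["R0"]     # B_0^2 R_0^{-1}
    P = c["P"]                      # value of P(t)
    return (
        c["A0"] * x0 - g0 * p0 + c["C0"] * z0,                                   # -f(x0hat)
        -(c["A"] + c["D"] - g * P) * p + c["Q0"] * (x0 - xbar)
        - (c["Q"] - c["D"] * P) * q,                                             # -f(p)
        (-c["A"] + g * P) * k + (c["Q"] - c["D"] * P) * xbar
        - c["alpha"] * P * x0,                                                   # -f(k)
        -c["A0"] * p0 - c["Q0"] * (x0 - xbar) - c["Qt"] * x0
        - c["alpha"] * p + c["alpha"] * P * q,                                   # b(p0)
        (c["A"] + c["D"] - g * P) * xbar - g * k + c["alpha"] * x0,              # b(xbar)
        (c["A"] - g * P) * q + g * p,                                            # b(q)
        -c["C0"] * p0,                                                           # sigma(p0)
        0.0,                                                                     # sigma(xbar)
        0.0,                                                                     # sigma(q)
    )

def monotonicity_sides(d1, d2, c):
    """Return <A(d1) - A(d2), d1 - d2> and the closed-form right-hand side."""
    diff = [a - b for a, b in zip(d1, d2)]
    a1, a2 = A_op(d1, c), A_op(d2, c)
    lhs = sum((u - v) * e for u, v, e in zip(a1, a2, diff))
    dp0, dxbar, dx0 = diff[0], diff[1], diff[3]
    rhs = (-(c["B0"] ** 2 / c["R0"]) * dp0 ** 2
           - c["Q0"] * (dxbar - dx0) ** 2
           - c["Qt"] * dx0 ** 2)
    return lhs, rhs

# Illustrative positive coefficients and two arbitrary points Delta^1, Delta^2.
rng = random.Random(0)
names = ("A0", "B0", "C0", "R0", "Q0", "Qt", "A", "B", "D", "R", "Q", "alpha", "P")
coeffs = {name: rng.uniform(0.5, 2.0) for name in names}
delta1 = [rng.uniform(-1.0, 1.0) for _ in range(9)]
delta2 = [rng.uniform(-1.0, 1.0) for _ in range(9)]
lhs, rhs = monotonicity_sides(delta1, delta2, coeffs)
print(abs(lhs - rhs) < 1e-9, rhs <= 0.0)
```

All cross terms cancel in pairs, so only the three negative squares survive; this is the structure the uniqueness argument relies on.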
In the following, we are first going to show that (17) admits at most one adapted solution. Suppose Δ and \(\Delta ^{'}=(p'_{0},\bar {x}',q',\hat {x}'_{0},p',k',\hat {z}^{'}_{0},\bar {\theta }^{'},\theta ^{'}_{0})\) are two solutions of (17). Setting \(\hat {\Delta }=(\hat {p}_{0},\hat {\bar {x}},\hat {q},\hat {\hat {x}}_{0},\hat {p},\hat {k},\hat {\hat {z}}_{0},\hat {\bar {\theta }},\hat {\theta }_{0}) =(p_{0}-p'_{0},\bar {x}-\bar {x}',q-q',\hat {x}_{0}-\hat {x}'_{0},p-p',k-k',\hat {z}_{0}-\hat {z}'_{0},\bar {\theta }-\bar {\theta }^{'},\theta _{0}-\theta ^{'}_{0})\) and applying Itô’s formula to \(\langle \hat {p}_{0},\hat {\hat {x}}_{0}\rangle +\langle \hat {\bar {x}},\hat {p}\rangle +\langle \hat {q},\hat {k}\rangle \), we have
$$\begin{aligned} -\mathbb{E}\langle\hat{p}_{0}(0),\hat{\hat{x}}_{0}(0)\rangle&= \mathbb{E}{\int_{0}^{T}}\langle\mathbb{A}(s,\Delta)-\mathbb{A}(s,\Delta^{'}),\hat{\Delta}\rangle ds\\ &\leq -\beta_{1} \mathbb{E}{\int_{0}^{T}}(p_{0}(s)-p^{'}_{0}(s))^{2}ds-\beta_{2}\mathbb{E}{\int_{0}^{T}}(\hat{x}_{0}(s)-\hat{x}^{'}_{0}(s))^{2} ds. \end{aligned} $$
Since \(\hat{p}_{0}(0)=-H_{0}\hat{\hat{x}}_{0}(0)\) by the initial condition in (17), it follows that
$$\begin{aligned} \beta_{1} \mathbb{E}{\int_{0}^{T}}|\hat{p}_{0}(s)|^{2}ds+\beta_{2}\mathbb{E}{\int_{0}^{T}}\big|\hat{\hat{x}}_{0}(s)\big|^{2}ds+H_{0}\mathbb{E}\Big|\hat{\hat{x}}_{0}(0)\Big|^{2}\leq0. \end{aligned} $$

By (H2), we get \(\beta_{1}>0\) and \(\beta_{2}>0\). Then \(\hat {p}_{0}(s)\equiv 0\) and \(\hat {\hat {x}}_{0}(s)\equiv 0\), and hence \(\hat {\hat {z}}_{0}(s)\equiv 0\). Applying the standard estimates to \(\hat {\bar {x}}(s)\) and \(\hat {k}(s)\), and using Gronwall’s inequality, we obtain \(\hat {\bar {x}}(s)\equiv 0\), \(\hat {k}(s)\equiv 0\) and \(\hat {\theta }_{0}(s)\equiv 0\). Similarly, we have \(\hat {q}(s)\equiv 0\), \(\hat {p}(s)\equiv 0\), and \(\hat {\bar {\theta }}(s)\equiv 0\). Therefore, (17) admits at most one adapted solution.

Existence. In order to prove the existence of the solution, we first consider the following family of FBSDEs parameterized by \(\gamma\in[0,1]\):
$${} \left\{ \begin{aligned} &{dp}_{0}^{\gamma}(t)=\left[-(1-\gamma)\hat{x}_{0}^{\gamma}(t)\beta_{2}+\gamma b(p_{0}^{\gamma})+{\varphi^{1}_{t}}\right]dt+\left[\gamma\sigma(p_{0}^{\gamma})+\lambda_{t}\right]{dW}_{0}(t),\\ &d\hat{x}_{0}^{\gamma}(t)= \left[-(1-\gamma)p_{0}^{\gamma}(t)\beta_{1}-\gamma f(\hat{x}_{0}^{\gamma})+{\kappa^{1}_{t}}\right]dt+\hat{z}_{0}^{\gamma}(t){dW}_{0}(t),\\ &d\bar{x}^{\gamma}(t)=\left[\gamma b(\bar{x}^{\gamma})+{\varphi^{2}_{t}}\right]dt,\\ &dp^{\gamma}(t)=\left[-\gamma f(p^{\gamma})+{\kappa^{2}_{t}}\right]dt+\bar{\theta}^{\gamma}(t){dW}_{0}(t),\\ &dq^{\gamma}(t)=\left[\gamma b(q^{\gamma})+{\varphi^{3}_{t}}\right]dt,\\ &dk^{\gamma}(t)=\left[-\gamma f(k^{\gamma})+{\kappa^{3}_{t}}\right]dt+\theta_{0}^{\gamma}(t){dW}_{0}(t),\\ &\!p_{0}^{\gamma}(0)=-(1-\gamma)\hat{x}_{0}^{\gamma}(0)-\gamma H_{0}\hat{x}_{0}^{\gamma}(0)+a, \hat{x}_{0}^{\gamma}(T)=\gamma \xi,\ \bar{x}^{\gamma}(0)\!=\gamma x,\! p^{\gamma}(T)\!=0, \\ &q^{\gamma}(0)=0, \ k^{\gamma}(T)=0, \end{aligned} \right. $$

where \((\varphi ^{1},\varphi ^{2},\varphi ^{3},\lambda,\kappa ^{1},\kappa ^{2},\kappa ^{3})\in L^{2}_{\mathcal F^{w_{0}}}(0,T;\mathbb {R}^{7})\), \(a\in L^{2}(\Omega,\mathcal {F}^{w_{0}}_{0},P;\mathbb {R})\). Clearly, when γ=1, the existence of a solution of (23) implies that of (17). When γ=0, it is easy to see that (23) admits a unique solution (in fact, the 2-dim FBSDE is very similar to the Hamiltonian system of Lim and Zhou (2001)).

If, a priori, for each \(\left (\varphi ^{1},\varphi ^{2},\varphi ^{3},\lambda,\kappa ^{1},\kappa ^{2},\kappa ^{3}\right)\in L^{2}_{\mathcal {F}^{w_{0}}}(0,T;\mathbb {R}^{7})\) and a certain number \(\gamma_{0}\in[0,1)\) there exists a unique solution tuple \((p_{0}^{\gamma _{0}},\bar {x}^{\gamma _{0}},q^{\gamma _{0}},\hat {x}_{0}^{\gamma _{0}},p^{\gamma _{0}},k^{\gamma _{0}}, \hat {z}_{0}^{\gamma _{0}},\bar {\theta }^{\gamma _{0}},\theta _{0}^{\gamma _{0}})\) of (23), then for each
$$u_{s}=\left(p_{0}(s),\bar{x}(s),q(s),\hat{x}_{0}(s),p(s),k(s),\hat{z}_{0}(s),\bar{\theta}(s),\theta_{0}(s)\right)\in L^{2}_{\mathcal{F}^{w_{0}}_{s}}\left(0,T;\mathbb{R}^{9}\right), $$
there exists a unique tuple \(U_{s}\!=(\!P_{0}(s),\bar {X}(s),Q(s),\hat {X}_{0}(s),P(s),K(s),\hat {Z}_{0}(s),\bar {\Theta }(s), \Theta _{0}(s))\in L^{2}_{\mathcal {F}^{w_{0}}_{s}}(0,T; \mathbb {R}^{9})\) satisfying the following FBSDEs
$$ \left\{ \begin{aligned} &{dP}_{0}(t)=\left[-(1-\gamma_{0})\hat{X}_{0}(t)\beta_{2}+\gamma_{0} b(P_{0})+\delta(\hat{x}_{0}(t)\beta_{2}+b(p_{0}))+{\varphi^{1}_{t}}\right]dt\\ &\qquad\qquad+\left[\gamma_{0}\sigma(P_{0})+\lambda_{t}\right]{dW}_{0}(t),\\ &d\hat{X}_{0}(t)= \left[-(1-\gamma_{0})P_{0}(t)\beta_{1}-\gamma_{0} f(\hat{X}_{0})+\delta(p_{0}(t)\beta_{1}-f(\hat{x}_{0}))+{\kappa^{1}_{t}}\right]\\&\qquad\qquad dt+\hat{Z}_{0}(t){dW}_{0}(t),\\ &d\bar{X}(t)=\left[\gamma_{0} b(\bar{X})+\delta b(\bar{x})+{\varphi^{2}_{t}}\right]dt,\\ &dP(t)=\left[-\gamma_{0} f(P)-\delta f(p)+{\kappa^{2}_{t}}\right]dt+\bar{\Theta}(t){dW}_{0}(t),\\ &dQ(t)=\left[\gamma_{0} b(Q)+\delta b(q)+{\varphi^{3}_{t}}\right]dt,\\ &dK(t)=\left[-\gamma_{0} f(K)-\delta f(k)+{\kappa^{3}_{t}}\right]dt+\Theta_{0}(t){dW}_{0}(t),\\ &P_{0}(0)=-(1-\gamma_{0})\hat{X}_{0}(0)-\gamma_{0} H_{0}\hat{X}_{0}(0)+\delta (1- H_{0})\hat{x}_{0}(0)+a,\ \hat{X}_{0}(T)\\&\,\,\quad\quad=\gamma_{0} \xi+\delta\xi,\\ &\bar{X}(0)=\gamma_{0} x+\delta x,\ P(T)=0, \ Q(0)=0, \ K(T)=0. \end{aligned} \right. $$
In the following, we aim to prove that the mapping defined by
$$\begin{aligned} I_{\gamma_{0}+\delta}(u\times \hat{x}_{0}(0))&=U\times \hat{X}_{0}(0):L^{2}_{\mathcal F^{w_{0}}}(0,T;\mathbb{R}^{9})\times L^{2}(\Omega,\mathcal{F}^{w_{0}}_{0},P)\rightarrow\\ &\quad L^{2}_{\mathcal F^{w_{0}}}(0,T;\mathbb{R}^{9})\times L^{2}(\Omega,\mathcal{F}^{w_{0}}_{0},P) \end{aligned} $$
is a contraction.
Introduce \(u'=(p'_{0},\bar {x}',q',\hat {x}^{'}_{0},p',k',\hat {z}^{'}_{0},\bar {\theta }',\theta ^{'}_{0})\in L^{2}_{\mathcal F^{w_{0}}}(0,T;\mathbb {R}^{9})\), \(U'\times \hat {X}'_{0}(0)=I_{\gamma _{0}+\delta }(u'\times \hat {x}^{'}_{0}(0))\) and set
$${} \begin{aligned} \hat{u}&=(\hat{p}_{0},\hat{\bar{x}},\hat{q},\hat{\hat{x}}_{0},\hat{p},\hat{k},\hat{\hat{z}}_{0},\hat{\bar{\theta}},\hat{\theta}_{0})\\ &=(p_{0}-p^{'}_{0},\bar{x}-\bar{x}^{'},q-q^{'},\hat{x}_{0}-\hat{x}^{'}_{0},p-p',k-k',\hat{z}_{0}-\hat{z}'_{0},\bar{\theta}-\bar{\theta}^{'},\theta_{0}-\theta^{'}_{0})\\ \hat{U}&=(\hat{P}_{0},\hat{\bar{X}},\hat{Q},\hat{\hat{X}}_{0},\hat{P},\hat{K},\hat{\hat{Z}}_{0},\hat{\bar{\Theta}},\hat{\Theta}_{0})\\ &=(P_{0}-P^{'}_{0},\bar{X}-\bar{X}',Q-Q',\hat{X}_{0}-\hat{X}^{'}_{0},P-P',K-K',\hat{Z}_{0}-\hat{Z}^{'}_{0},\bar{\Theta}\\&\quad-\bar{\Theta}^{'},\Theta_{0}-\Theta^{'}_{0}). \end{aligned} $$
Applying Itô’s formula to \(\langle \hat {P}_{0},\hat {\hat {X}}_{0}\rangle +\langle \hat {\bar {X}},\hat {P}\rangle +\langle \hat {Q},\hat {K}\rangle \), we have
$$ \begin{aligned} &\left(\gamma_{0} H_{0}+(1-\gamma_{0})\right)\mathbb{E}\Big|\hat{\hat{X}}_{0}(0)\Big|^{2}+\mathbb{E}{\int_{0}^{T}}\left(\beta_{1}\big|\hat{P}_{0}(s)\big|^{2}+\beta_{2}\big|\hat{\hat{X}}_{0}(s)\big|^{2}\right)ds\\ \leq&\delta C_{1} \mathbb{E}{\int_{0}^{T}}\left(|\hat{u}_{s}|^{2}+|\hat{U}_{s}|^{2}\right)ds+\delta C_{1} \mathbb{E}\Big|\hat{\hat{x}}_{0}(0)\Big|^{2}. \end{aligned} $$
On the other hand, since \(P_{0}\) and \(P^{\prime }_{0}\) are solutions of Itô-type SDEs, the usual technique yields the following estimate for the difference \(\hat {P}_{0}=P_{0}-P'_{0}\):
$$ \begin{aligned} &\mathbb{E}{\int_{0}^{T}}|\hat{P}_{0}(s)|^{2}ds\leq C_{1} T\delta \mathbb{E}{\int_{0}^{T}} |\hat{u}_{s}|^{2}ds+C_{1}T \mathbb{E}\Big|\hat{\hat{X}}_{0}(0)\Big|^{2}+C_{1}T\delta \mathbb{E}\Big|\hat{\hat{x}}_{0}(0)\Big|^{2}\\ &\qquad\qquad\qquad\qquad+C_{1} T\mathbb{E}{\int_{0}^{T}} \left(|\hat{X}_{0}(s)|^{2}+|\hat{\bar{X}}(s)|^{2}+|\hat{P}(s)|^{2}+|\hat{Q}(s)|^{2}\right)ds. \end{aligned} $$
Similarly, estimates for the difference \(\hat {\bar {X}}=\bar {X}-\bar {X}'\) and \(\hat {Q}=Q-Q'\) are given by
$$ \begin{aligned} &\sup_{0\leq s\leq r}\mathbb{E}\big|\hat{\bar{X}}(s)\big|^{2}\leq C_{1} \delta \mathbb{E}{\int_{0}^{r}} |\hat{u}_{s}|^{2}ds+C_{1} \mathbb{E}{\int_{0}^{r}} \left(|\hat{K}(s)|^{2}+|\hat{\hat{X}}_{0}(s)|^{2}\right)ds \end{aligned} $$
$$ \begin{aligned} &\sup_{0\leq s\leq r}\mathbb{E}\big|\hat{Q}(s)\big|^{2}\leq C_{1} \delta \mathbb{E}{\int_{0}^{r}} |\hat{u}_{s}|^{2}ds+C_{1} \mathbb{E}{\int_{0}^{r}} \left(|\hat{K}(s)|^{2}+|\hat{P}(s)|^{2}\right)ds, \end{aligned} $$
respectively, for \(0\leq r\leq T\). In the same way, for the differences of the solutions \((\hat {\hat {X}}_{0},\hat {\hat {Z}}_{0})=(\hat {X}_{0}-\hat {X}^{'}_{0},\hat {Z}_{0}-\hat {Z}^{'}_{0}), (\hat {P},\hat {\bar {\Theta }})=(P-P',\bar {\Theta }-\bar {\Theta }')\) and \((\hat {K},\hat {\Theta }_{0})=(K-K',\Theta _{0}-\Theta ^{'}_{0})\), applying the usual technique to the BSDEs, we have
$$ \begin{aligned} \mathbb{E}{\int_{0}^{T}}\left(|\hat{\hat{X}}_{0}(s)|^{2}+|\hat{\hat{Z}}_{0}(s)|^{2}\right)ds\leq C_{1} \delta \mathbb{E}{\int_{0}^{T}} |\hat{u}_{s}|^{2}ds+C_{1} \mathbb{E}{\int_{0}^{T}} |\hat{P}_{0}(s)|^{2}ds, \end{aligned} $$
$$ \begin{aligned} \mathbb{E}{\int_{0}^{r}}\left(|\hat{P}(s)|^{2}+|\hat{\bar{\Theta}}(s)|^{2}\right)ds\leq &C_{1} \delta \mathbb{E}{\int_{0}^{r}} |\hat{u}_{s}|^{2}ds\\ &\qquad+C_{1} \mathbb{E}{\int_{0}^{r}} \left(|\hat{X}_{0}(s)|^{2}+|\hat{\bar{X}}(s)|^{2}+|\hat{Q}(s)|^{2}\right)ds \end{aligned} $$
$${} \begin{aligned} \mathbb{E}{\int_{0}^{r}}\!\left(|\hat{K}(s)|^{2}+|\hat{\Theta}_{0}(s)|^{2}\right)ds\leq \!C_{1} \delta \mathbb{E}{\int_{0}^{r}} |\hat{u}_{s}|^{2}ds+C_{1} \mathbb{E}{\int_{0}^{r}} \!\left(|\hat{X}_{0}(s)|^{2}+|\hat{\bar{X}}(s)|^{2}\right)\!ds \end{aligned} $$

for \(0\leq r\leq T\). Here the constant \(C_{1}\) depends on the coefficients of (1)–(2), P(·), \(\beta_{1}\), \(\beta_{2}\), and \(\mathcal {T}\). Moreover, \(\gamma_{0}H_{0}+(1-\gamma_{0})\geq \mu\), where \(\mu=\min(1,H_{0})>0\).

Under (H2), combining (25), (27)–(28), (30)–(31), and applying Gronwall’s inequality, we obtain
$$\mathbb{E}{\int_{0}^{T}}|\hat{U}_{s}|^{2}ds+\mathbb{E}\Big|\hat{\hat{X}}_{0}(0)\Big|^{2}\leq C_{2} \delta \left(\mathbb{E}{\int_{0}^{T}}|\hat{u}_{s}|^{2}ds+\mathbb{E}\big|\hat{\hat{x}}_{0}(0)\big|^{2}\right), $$
where \(C_{2}\) depends on \(C_{1}\), μ, and T. Choosing \(\delta _{0}=\frac {1}{2C_{2}}\), we get that for each fixed \(\delta\in[0,\delta_{0}]\), the mapping \(I_{\gamma _{0}+\delta }\) is a contraction in the sense that
$$\mathbb{E}{\int_{0}^{T}}|\hat{U}_{s}|^{2}ds+\mathbb{E}\Big|\hat{\hat{X}}_{0}(0)\Big|^{2}\leq \frac{1}{2} \left(\mathbb{E}{\int_{0}^{T}}|\hat{u}_{s}|^{2}ds+\mathbb{E}\big|\hat{\hat{x}}_{0}(0)\big|^{2}\right). $$
Then it follows that there exists a unique fixed point
$$U^{\gamma_{0}+\delta}=\left(P_{0}^{\gamma_{0}+\delta},\bar{X}^{\gamma_{0}+\delta},Q^{\gamma_{0}+\delta},\hat{X}_{0}^{\gamma_{0}+\delta}, P^{\gamma_{0}+\delta},K^{\gamma_{0}+\delta},\hat{Z}_{0}^{\gamma_{0}+\delta},\bar{\Theta}^{\gamma_{0}+\delta},\Theta_{0}^{\gamma_{0}+\delta}\right),$$
which is the solution of (23) for \(\gamma=\gamma_{0}+\delta\). Since \(\delta_{0}\) depends only on \((C_{1},\mu,T)\), we can repeat this process N times with \(1\leq N\delta_{0}<1+\delta_{0}\).

In particular, for γ=1 with \({\varphi ^{i}_{t}}\equiv 0,\lambda _{t}\equiv 0,{\kappa ^{i}_{t}}\equiv 0,a=0\ (i=1,2,3)\), (23) admits a unique solution, which implies the wellposedness of (17) (and also of (11)). The proof is complete. □
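The bookkeeping of the continuation argument (start at γ=0, extend solvability by at most δ₀ per step, stop once γ=1 is reached) can be sketched as follows. This is only an illustration of the step count N with 1≤Nδ₀<1+δ₀; the function name and the sample value of δ₀ are hypothetical, and exact rational arithmetic is used to avoid rounding issues.

```python
from fractions import Fraction
from math import ceil

def continuation_schedule(delta0):
    """Return (N, [gamma_0, ..., gamma_N]) for a step size delta0 in (0, 1].

    N is the smallest integer with N * delta0 >= 1, so 1 <= N * delta0 < 1 + delta0.
    The parameters gamma_k increase from 0 to 1 in increments of at most delta0.
    """
    delta0 = Fraction(delta0)
    if not 0 < delta0 <= 1:
        raise ValueError("delta0 must lie in (0, 1]")
    n_steps = ceil(1 / delta0)
    gammas = [min(Fraction(1), k * delta0) for k in range(n_steps + 1)]
    return n_steps, gammas

# Example with an illustrative delta0 = 3/10: four contraction steps reach gamma = 1.
n_steps, gammas = continuation_schedule(Fraction(3, 10))
print(n_steps, [str(g) for g in gammas])
```

The point of the argument is that δ₀ does not depend on the current γ₀, so the same step size works at every stage and finitely many steps suffice.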

Remark 3.5

In what follows, (17) is called the Nash certainty equivalence (NCE) equation system (see Huang 2010; Huang et al. 2006, 2007, 2012). By Theorem 3.1, there exists a unique 9-tuple solution \((p_{0},\bar {x},q,\hat {x}_{0},p,k,\hat {z}_{0},\bar {\theta },\theta _{0})\), which can be obtained off-line. Thus, it is equivalent to the fixed-point principle. To the best of our knowledge, this is the first paper to focus on the well-posedness of coupled FBSDEs in large-population problems.

ε-Nash equilibrium analysis

In the above sections, we obtained the optimal control \(\bar {u}_{i}(\cdot), 0\le i\le N\), of Problem (II) through the consistency condition system. Now, we turn to verifying the ε-Nash equilibrium of Problem (I). To start, we first present the definition of ε-Nash equilibrium.

Definition 4.1

A set of controls \(u_{k}\in \mathcal {U}_{k},\ 0\leq k\leq N,\) for the (N+1) agents is said to satisfy an ε-Nash equilibrium with respect to the costs \(J_{k}\), \(0\leq k\leq N\), if there exists ε≥0 such that for any fixed \(0\leq i\leq N\), we have
$$ J_{i}(u_{i},u_{-i})\leq J_{i}(u'_{i},u_{-i})+\epsilon $$

when any alternative control \(u^{\prime }_{i}\in \mathcal {U}_{i}\) is applied by \(\mathcal {A}_{i}\).

If ε=0, then Definition 4.1 is reduced to the usual Nash equilibrium. Now, we state the main result of this paper and its proof will be given later.
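To make the inequality of Definition 4.1 concrete, the following minimal sketch evaluates it in a toy finite game, which is an assumption made purely for illustration (the paper works with control processes, not finite action sets). The function names and the cost tables are hypothetical.

```python
def epsilon_nash_gap(costs, profile):
    """Smallest epsilon >= 0 such that J_i(u_i, u_-i) <= J_i(u'_i, u_-i) + epsilon
    holds for every agent i and every unilateral deviation u'_i.

    costs[i] maps each full action profile (a tuple) to agent i's cost J_i.
    """
    gap = 0.0
    for i in range(len(profile)):
        actions_i = {key[i] for key in costs[i]}   # agent i's action set
        current = costs[i][profile]
        for alt in actions_i:
            deviated = profile[:i] + (alt,) + profile[i + 1:]
            gap = max(gap, current - costs[i][deviated])
    return gap

def is_epsilon_nash(costs, profile, eps):
    """Check the defining inequality for a given tolerance eps."""
    return epsilon_nash_gap(costs, profile) <= eps

# Two agents with actions {0, 1}; the cost values are made up.
costs = [
    {(0, 0): 1.0, (0, 1): 3.0, (1, 0): 0.9, (1, 1): 2.0},   # J_0
    {(0, 0): 1.0, (0, 1): 1.5, (1, 0): 2.0, (1, 1): 1.0},   # J_1
]
print(is_epsilon_nash(costs, (0, 0), 0.2), is_epsilon_nash(costs, (0, 0), 0.0))
```

In this toy example agent 0 can lower its cost by about 0.1 by deviating from the profile (0, 0), so the profile is an ε-Nash equilibrium for ε=0.2 but not an exact Nash equilibrium.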

Theorem 4.1

Under (H1) and (H2), \((\tilde {u}_{0},\tilde {u}_{1},\tilde {u}_{2},\cdots,\tilde {u}_{N})\) satisfies the ε-Nash equilibrium of (I). Here, \(\tilde {u}_{0}\) is given by
$$ \tilde{u}_{0}(t)=-B_{0}R_{0}^{-1}p_{0}(t), $$
where \(p_{0}(\cdot)\) is obtained off-line by (17); while for \(1\le i\le N, \tilde {u}_{i}\) is
$$ \tilde{u}_{i}(t)=-BR^{-1}P(t)\tilde{x}_{i}(t)-BR^{-1}k(t), $$

where \(\tilde {x}_{i}(\cdot)\), the state trajectory for \(\mathcal {A}_{i}\), satisfies (21).
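As a sanity check on the minor agents' strategy, the sketch below substitutes the feedback \(\tilde{u}_{i}=-BR^{-1}(P\tilde{x}_{i}+k)\) into the controlled minor drift \(Ax_{i}+Bu_{i}+Dx^{(N)}+\alpha x_{0}\) used in this section and compares it with the closed-loop drift \((A-B^{2}R^{-1}P)x_{i}-B^{2}R^{-1}k+Dx^{(N)}+\alpha x_{0}\). Scalar coefficients are assumed, and all function names and numerical values are hypothetical.

```python
def minor_feedback(x_i, P_t, k_t, B, R):
    """Decentralized control u_i = -B R^{-1} (P(t) x_i + k(t)) for scalar data."""
    return -(B / R) * (P_t * x_i + k_t)

def open_loop_drift(x_i, u_i, x_avg, x0, A, B, D, alpha):
    """Drift of a minor agent under an arbitrary control u_i."""
    return A * x_i + B * u_i + D * x_avg + alpha * x0

def closed_loop_drift(x_i, x_avg, x0, P_t, k_t, A, B, D, alpha, R):
    """Drift obtained after substituting the feedback control."""
    g = B * B / R  # B^2 R^{-1}
    return (A - g * P_t) * x_i - g * k_t + D * x_avg + alpha * x0

# Illustrative numbers only; P_t and k_t stand for values of P(t) and k(t).
A, B, D, alpha, R = 0.5, 1.0, 0.2, 0.3, 2.0
P_t, k_t = 0.8, -0.1
x_i, x_avg, x0 = 1.5, 1.0, 0.4

u_i = minor_feedback(x_i, P_t, k_t, B, R)
drift_a = open_loop_drift(x_i, u_i, x_avg, x0, A, B, D, alpha)
drift_b = closed_loop_drift(x_i, x_avg, x0, P_t, k_t, A, B, D, alpha, R)
print(abs(drift_a - drift_b) < 1e-12)
```

The two drifts agree, which is why each minor agent only needs its own state together with the off-line quantities P(·) and k(·) to implement its strategy.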

The proof of the above theorem needs several lemmas, which are presented below. Denote by \((\tilde {x}_{0}(\cdot),\tilde {z}_{0}(\cdot))\) the centralized state trajectory and by \((\hat {x}_{0}(\cdot),\hat {z}_{0}(\cdot))\) the decentralized one. Applying \(\tilde {u}_{0}(\cdot)\) to \(\mathcal {A}_{0}\) and using the notations above, it is easy to see that \((\tilde {x}_{0}(\cdot),\tilde {z}_{0}(\cdot))\equiv (\hat {x}_{0}(\cdot),\hat {z}_{0}(\cdot))\). Further, \((\bar {x}(\cdot),k(\cdot))_{\tilde {x}_{0}}=(\bar {x}(\cdot),k(\cdot))_{\hat {x}_{0}}\). Hereafter, for any \(h_{j}(\cdot)\in L^{2}_{\mathcal {F}}(0,T;\mathbb {R}),\ j=1,2,3\), we denote by \((h_{1}(\cdot),h_{2}(\cdot))_{h_{3}}\) the pair of stochastic processes \((h_{1}(\cdot),h_{2}(\cdot))\) determined by \(h_{3}(\cdot)\). The cost functionals for (I) and (II) are given by
$$ \begin{aligned} J_{0}(\tilde{u}_{0}(\cdot), \tilde{u}_{-0}(\cdot))=&\frac{1}{2} \mathbb{E}\left\{ {\int_{0}^{T}}\left[Q_{0}\left(\tilde{x}_{0}(t)-\tilde{x}^{(N)}(t)\right)^{2}+\tilde{Q}\tilde{x}^{2}_{0}(t)+R_{0}\tilde{u}_{0}^{2}(t)\right]dt+H_{0}\tilde{x}_{0}^{2}(0)\right\} \end{aligned} $$
$$ \begin{aligned} \bar{J}_{0}(\bar{u}_{0}(\cdot))=&\frac{1}{2} \mathbb{E}\left\{ {\int_{0}^{T}}\left[Q_{0}\left(\hat{x}_{0}(t)-\bar{x}(t)_{\hat{x}_{0}}\right)^{2}+\tilde{Q}\hat{x}^{2}_{0}(t)+R_{0}\bar{u}_{0}^{2}(t)\right]dt+H_{0}\hat{x}_{0}^{2}(0) \right\}, \end{aligned} $$
respectively. For \(\mathcal {A}_{i},1\leq i\leq N\), we have the following closed-loop system
$$ \left\{ \begin{aligned} d\tilde{x}_{i}(t)=& \left[(A-B^{2}R^{-1}P(t))\tilde{x}_{i}(t)-B^{2}R^{-1}k(t)_{\tilde{x}_{0}}+D\tilde{x}^{(N)}(t)+\alpha \tilde{x}_{0}(t)\right]\\&dt+\sigma {dW}_{i}(t),\\ \tilde{x}_{i}(0)=& x_{i0} \end{aligned} \right. $$
with the cost functional
$$ \begin{aligned} J_{i}(\tilde{u}_{i}(\cdot), \tilde{u}_{-i}(\cdot))=&\frac{1}{2} \mathbb{E}\left\{ {\int_{0}^{T}}\left[Q\left(\tilde{x}_{i}(t)-\tilde{x}^{(N)}(t)\right)^{2}+R\tilde{u}_{i}^{2}(t)\right]dt+H\tilde{x}_{i}^{2}(T) \right\}, \end{aligned} $$
where \(\tilde {x}^{(N)}(t)=\frac {1}{N}\sum \limits ^{N}_{i=1}\tilde {x}_{i}(t)\). The auxiliary system (limiting problem) is given by
$$ \left\{ \begin{aligned} d\hat{x}_{i}(t)=& \left[(A-B^{2}R^{-1}P(t))\hat{x}_{i}(t)-B^{2}R^{-1}k(t)_{\hat{x}_{0}}+D\bar{x}(t)_{\hat{x}_{0}}+\alpha \hat{x}_{0}(t)\right]\\ &dt+\sigma {dW}_{i}(t),\\ \hat{x}_{i}(0)=& x_{i0} \end{aligned} \right. $$
with the cost functional
$$ \begin{aligned} \bar{J}_{i}(\bar{u}_{i}(\cdot))=&\frac{1}{2} \mathbb{E}\left\{ {\int_{0}^{T}}\left[Q\left(\hat{x}_{i}(t)-\bar{x}(t)_{\hat{x}_{0}}\right)^{2}+R\bar{u}_{i}^{2}(t)\right]dt+H\hat{x}_{i}^{2}(T) \right\}, \end{aligned} $$

where \((\bar {x}(t)_{\hat {x}_{0}},k(t)_{\hat {x}_{0}})\) satisfies (17). We have

Lemma 4.1

$$\begin{array}{@{}rcl@{}} \sup_{0\leq t\leq T} \mathbb{E}\Big|\tilde{x}^{(N)}(t)-\bar{x}(t)_{\hat{x}_{0}}\Big|^{2}=O\left(\frac{1}{N}\right), \end{array} $$
$$\begin{array}{@{}rcl@{}} \Big|J_{0}(\tilde{u}_{0},\tilde{u}_{-0})-\bar{J}_{0}(\bar{u}_{0})\Big|=O\left(\frac{1}{\sqrt{N}}\right). \end{array} $$


Proof. By (37), we have
$$\left\{ \begin{aligned} d\tilde{x}^{(N)}(t)=&\left[\left(A+D-B^{2}R^{-1}P(t)\right)\tilde{x}^{(N)}(t)-B^{2}R^{-1}k(t)_{\tilde{x}_{0}}+\alpha \tilde{x}_{0}(t)\right]\\&dt+\frac{1}{N}\sum\limits_{i=1}^{N}\sigma {dW}_{i}(t),\\ \tilde{x}^{(N)}(0)=&x^{(N)}_{0}, \end{aligned} \right. $$
where \( x^{(N)}_{0}:=\frac {1}{N}\sum \limits _{i=1}^{N}x_{i0}\). Noting that
$$\begin{aligned} \mathbb{E}\Big|x^{(N)}_{0}-x\Big|^{2}\sim\mathbb{E}\left|{\int_{0}^{t}}\frac{1}{N}\sum\limits_{i=1}^{N}\sigma {dW}_{i}(s)\right|^{2}=O\left(\frac{1}{N}\right), \end{aligned} $$
by (17) and Gronwall’s inequality, we obtain (41).
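The \(O(1/N)\) rate of the averaged noise term can be made explicit. The short computation below assumes, as is standard in this large-population setup, that \(W_{1},\ldots,W_{N}\) are mutually independent standard Brownian motions and that σ is a constant:

```latex
\begin{aligned}
\mathbb{E}\left|\int_{0}^{t}\frac{1}{N}\sum_{i=1}^{N}\sigma\,dW_{i}(s)\right|^{2}
&=\frac{\sigma^{2}}{N^{2}}\sum_{i=1}^{N}\sum_{j=1}^{N}\mathbb{E}\big[W_{i}(t)W_{j}(t)\big]\\
&=\frac{\sigma^{2}}{N^{2}}\sum_{i=1}^{N}t
=\frac{\sigma^{2}t}{N}\leq\frac{\sigma^{2}T}{N},\qquad 0\leq t\leq T.
\end{aligned}
```

The cross terms vanish by independence, so averaging N independent noises reduces the variance by a factor of N, uniformly in t on [0,T].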
It is easy to get that \(\sup \limits _{0\leq t\leq T}\mathbb {E}\big |\hat {x}_{0}(t)-\bar {x}(t)_{\hat {x}_{0}}\big |^{2}<+\infty \). Applying the Cauchy–Schwarz inequality, we have
$${} \begin{aligned} &\sup_{0\leq t\leq T}\mathbb{E}\Big|\big|\tilde{x}_{0}(t)-\tilde{x}^{(N)}(t)\big|^{2}-\big|\hat{x}_{0}(t)-\bar{x}(t)_{\hat{x}_{0}}\big|^{2}\Big|\\ \leq& \sup_{0\leq t\leq T}\mathbb{E}\big|\tilde{x}_{0}(t)-\tilde{x}^{(N)}(t)-\hat{x}_{0}(t)+\bar{x}(t)_{\hat{x}_{0}}\big|^{2}\\ &+2\sup_{0\leq t\leq T}\mathbb{E}\left[\big|\hat{x}_{0}(t)-\bar{x}(t)_{\hat{x}_{0}}\big|\big|\tilde{x}_{0}(t)-\tilde{x}^{(N)}(t)-\hat{x}_{0}(t)+\bar{x}(t)_{\hat{x}_{0}}\big|\right]\\ \leq&\sup_{0\leq t\leq T}\mathbb{E}\big|\tilde{x}_{0}(t)-\hat{x}_{0}(t)-\left(\tilde{x}^{(N)}(t)-\bar{x}(t)_{\hat{x}_{0}}\right)\big|^{2}\\ &\!\!+2\!\left(\sup_{0\leq t\leq T}\mathbb{E}\big|\hat{x}_{0}(t)-\bar{x}(t)_{\hat{x}_{0}}\big|^{2}\right)^{\!\frac{1}{2}}\!\!\left(\sup_{0\leq t\leq T}\mathbb{E}\big|\tilde{x}_{0}(t)\,-\,\hat{x}_{0}(t)\,-\,\big(\tilde{x}^{(N)}(t)-\bar{x}(t)_{\hat{x}_{0}}\big)\big|^{2}\right)^{\frac{1}{2}}\\ =&O\left(\frac{1}{\sqrt{N}}\right). \end{aligned} $$

In addition, by (10) and (33), we have \(\tilde {u}_{0}(\cdot)=\hat {u}_{0}(\cdot).\) Thus, (42) is obtained. □

For minor agents, we have

Lemma 4.2

$$\begin{array}{@{}rcl@{}} &\sup\limits_{1\leq i\leq N}\left[\sup\limits_{0\leq t\leq T}\mathbb{E}\Big|\tilde{x}_{i}(t)-\hat{x}_{i}(t)\Big|^{2}\right]=O\left(\frac{1}{N}\right), \end{array} $$
$$\begin{array}{@{}rcl@{}} &\sup\limits_{1\leq i\leq N}\left[\sup\limits_{0\leq t\leq T}\mathbb{E}\Big|\tilde{u}_{i}(t)-\bar{u}_{i}(t)\Big|^{2}\right]=O\left(\frac{1}{N}\right), \end{array} $$
$$\begin{array}{@{}rcl@{}} &\Big|J_{i}(\tilde{u}_{i},\tilde{u}_{-i})-\bar{J}_{i}(\bar{u}_{i})\Big|=O\left(\frac{1}{\sqrt{N}}\right),\ \ 1\leq i\leq N. \end{array} $$


Proof. For \(1\leq i\leq N\), applying Gronwall’s inequality, we get (44) from (41), (37) and (39). (45) follows directly from (44) and (34). Using the same technique as in (43) and noting \(\sup \limits _{0\leq t\leq T}\mathbb {E}\big |\hat {x}_{i}(t)-\bar {x}(t)_{\hat {x}_{0}}\big |^{2}<+\infty,\sup \limits _{0\leq t\leq T}\mathbb {E}\big |\bar {u}_{i}(t)\big |^{2}<+\infty,\sup \limits _{0\leq t\leq T}\mathbb {E}\big |\hat {x}_{i}(t)\big |^{2}<+\infty \), we obtain (46). □

So far, we have established some estimates of the states and costs corresponding to the controls \(\tilde {u}_{i}\) and \(\bar {u}_{i}, 0\le i\le N\). Next, we focus on the ε-Nash equilibrium for (I). Consider a perturbed control \(u_{0} \in \mathcal {U}_{0}\) for \(\mathcal {A}_{0}\) and introduce the dynamics
$$ \left\{ \begin{aligned} {dl}_{0}(t)=& \left[A_{0}l_{0}(t)+B_{0}u_{0}(t)+C_{0}q_{0}(t)\right]dt+q_{0}(t){dW}_{0}(t),\\ l_{0}(T)=& \xi, \end{aligned} \right. $$
whereas minor players keep the control \(\tilde {u}_{i},1\leq i\leq N,\) i.e.,
$$ \left\{ \begin{aligned} {dl}_{i}(t)=& \left[(A-B^{2}R^{-1}P(t))l_{i}(t)-B^{2}R^{-1}k(t)_{l_{0}}+Dl^{(N)}(t)+\alpha l_{0}(t)\right]\\&dt+\sigma {dW}_{i}(t),\\ l_{i}(0)=& x_{i0}, \end{aligned} \right. $$
where \(l^{(N)}(t)=\frac {1}{N}\sum \limits _{k=1}^{N}l_{k}(t)\); \(k(t)_{l_{0}}\) associated with \(l_{0}\) satisfies
$$ \left\{ \begin{aligned} &dk(t)_{l_{0}}=\left[\left(-A+B^{2}R^{-1}P(t)\right)k(t)_{l_{0}}+\left(Q-DP(t)\right)\bar{x}(t)_{l_{0}}-\alpha P(t)l_{0}(t)\right]dt\\ &\qquad\qquad\quad+\theta_{0}(t)_{l_{0}}{dW}_{0}(t),\\ &d\bar{x}(t)_{l_{0}}=\left[\big(A+D-B^{2}R^{-1}P(t)\big)\bar{x}(t)_{l_{0}}-B^{2}R^{-1}k(t)_{l_{0}}+\alpha l_{0}(t)\right]dt,\\ &k(T)_{l_{0}}=0,\ \bar{x}(0)_{l_{0}}=x. \end{aligned} \right. $$
For any fixed i, \(1\leq i\leq N\), consider a perturbed control \(u_{i} \in \mathcal {U}_{i}\) for \(\mathcal {A}_{i}\), whereas the major and the other minor players keep the controls \(\tilde {u}_{j},0\leq j\leq N,j\neq i.\) Introduce the dynamics
$$ \left\{ \begin{aligned} {dm}_{i}(t)=& \left[{Am}_{i}(t)+{Bu}_{i}(t)+Dm^{(N)}(t)+\alpha \tilde{x}_{0}(t)\right]dt+\sigma {dW}_{i}(t),\\ m_{i}(0)=& x_{i0} \end{aligned} \right. $$
and for \(1\leq j\leq N\), \(j\neq i\),
$$ \left\{ \begin{aligned} {dm}_{j}(t)=& \left[(A-B^{2}R^{-1}P(t))m_{j}(t)-B^{2}R^{-1}k(t)_{\tilde{x}_{0}}+Dm^{(N)}(t)+\alpha \tilde{x}_{0}(t)\right]\\&dt+\sigma {dW}_{j}(t),\\ m_{j}(0)=& x_{j0}, \end{aligned} \right. $$

where \(m^{(N)}(t)=\frac {1}{N}\sum \limits _{k=1}^{N}m_{k}(t)\); \(k(t)_{\tilde {x}_{0}}\) satisfies (17) due to \(\tilde {x}_{0}(\cdot)=\hat {x}_{0}(\cdot)\).

If \(\tilde {u}_{j},\ 0\leq j\leq N\) is an ε-Nash equilibrium with respect to the cost \(J_{j}\), it holds that
$$J_{j}(\tilde{u}_{j},\tilde{u}_{-j})\geq \inf_{u_{j}\in \mathcal{U}_{j}}J_{j}(u_{j},\tilde{u}_{-j})\geq J_{j}(\tilde{u}_{j},\tilde{u}_{-j})-\epsilon. $$
Then, when making the perturbation, we just need to consider \(u_{j}\in \mathcal {U}_{j}\) such that \(J_{j}(u_{j},\tilde {u}_{-j})\leq J_{j}(\tilde {u}_{j},\tilde {u}_{-j}),\) which implies
$$\begin{aligned} \frac{1}{2}\mathbb{E}{\int_{0}^{T}}R{u_{j}^{2}}(t)dt\leq J_{j}(u_{j},\tilde{u}_{-j})\leq J_{j}(\tilde{u}_{j},\tilde{u}_{-j})=\bar{J}_{j}(\bar{u}_{j})+O\left(\frac{1}{\sqrt{N}}\right). \end{aligned} $$
By the optimality of \((\bar {x}_{j},\bar {u}_{j})\) for the limiting cost functional \(\bar {J}_{j}\), the pair \((\bar {x}_{j},\bar {u}_{j})\) is \(L^{2}\)-bounded, so \(\bar {J}_{j}(\bar {u}_{j})\) is bounded. Hence,
$$ \mathbb{E}{\int_{0}^{T}}{u_{j}^{2}}(t)dt\leq C_{3},~ 0\leq j\leq N, $$

where \(C_{3}\) is a positive constant independent of N. Then we have the following proposition.

Proposition 4.1

\(\sup \limits _{0\leq t\leq T}\!\mathbb {E}\big |l_{0}(t)\big |^{2}\), \(\sup \limits _{1\le k\le N}\!\!\left [\sup \limits _{0\leq t\leq T}\mathbb {E}\big |l_{k}(t)\big |^{2} \right ], \sup \limits _{1\le k\le N}\!\!\left [\sup \limits _{0\leq t\leq T}\mathbb {E}\big |m_{k}(t)\big |^{2} \right ]\) are bounded.


Proof. By (52), applying the usual BSDE estimates, we get the boundedness of \(\sup \limits _{0\leq t\leq T}\mathbb {E}\big |l_{0}(t)\big |^{2}\). It follows from (48) that
$$\begin{array}{@{}rcl@{}} \mathbb{E}\left[\sum\limits_{k=1}^{N}|l_{k}(t)|^{2}\right]\leq & C_{4}\left\{\mathbb{E}\left[\sum\limits_{k=1}^{N}|x_{k0}|^{2}\right]+\mathbb{E}{\int_{0}^{t}}\left[\sum\limits_{k=1}^{N}|l_{k}(s)|^{2}+N|k(s)_{l_{0}}|^{2}+N|l_{0}(s)|^{2}\right]ds\right.\\ &\left.+\sum\limits_{k=1}^{N}\mathbb{E}\Big|{\int_{0}^{t}}\sigma {dW}_{k}(s)\Big|^{2}\right\}. \end{array} $$
From (50) and (51), it holds that
$$\begin{aligned} \mathbb{E}\left[\sum\limits_{k=1}^{N}|m_{k}(t)|^{2}\right]\leq &C_{5}\left\{\mathbb{E}\left[\sum\limits_{k=1}^{N}|x_{k0}|^{2}\right]+\mathbb{E}{\int_{0}^{t}}\left[\sum\limits_{k=1}^{N}|m_{k}(s)|^{2}+|u_{i}(s)|^{2}+\sum\limits_{k=1,k\neq i}^{N}|\tilde{u}_{k}(s)|^{2}\right.\right.\\ &\left.\left.+N|\tilde{x}_{0}(s)|^{2}\right]ds+\sum\limits_{k=1}^{N}\mathbb{E}\Big|{\int_{0}^{t}}\sigma {dW}_{k}(s)\Big|^{2}\right\}. \end{aligned} $$
Here, \(C_{4}\) and \(C_{5}\) are both positive constants. Since \(\sup \limits _{0\leq t\leq T}\mathbb {E}\big |l_{0}(t)\big |^{2}\) is bounded, we get the boundedness of \(\sup \limits _{0\leq t\leq T}\mathbb {E}\big |k(t)_{l_{0}}\big |^{2}\) by (49). It follows from (52) that \(\mathbb {E}|u_{i}(\cdot)|^{2}\) is bounded. Besides, the optimal controls \(\tilde {u}_{k}(\cdot),k\neq i\), are \(L^{2}\)-bounded. Then by Gronwall’s inequality, it follows that
$$\sup_{0\leq t\leq T}\mathbb{E}\left[\sum\limits_{k=1}^{N}|l_{k}(t)|^{2}\right]\sim\sup_{0\leq t\leq T}\mathbb{E}\left[\sum\limits_{k=1}^{N}|m_{k}(t)|^{2}\right]=O(N). $$

Thus, for any \(1\leq k\leq N, \sup \limits _{0\leq t\leq T}\mathbb {E}|l_{k}(t)|^{2}\) and \(\sup \limits _{0\leq t\leq T}\mathbb {E}|m_{k}(t)|^{2}\) are bounded. Hence the result. □

Correspondingly, the dynamics for agent \(\mathcal {A}_{0}\) under the control \(u_{0}\) for (II) are as follows:
$$ \left\{ \begin{aligned} dl'_{0}(t)=& \left[A_{0}l'_{0}(t)+B_{0}u_{0}(t)+C_{0}q'_{0}(t)\right]dt+q'_{0}(t){dW}_{0}(t),\\ l'_{0}(T)=& \xi \end{aligned} \right. $$
and for agent \(\mathcal {A}_{i},1\leq i\leq N\),
$$ \left\{ \begin{aligned} d\hat{l}_{i}(t)=& \left[(A-B^{2}R^{-1}P(t))\hat{l}_{i}(t)-B^{2}R^{-1}k(t)_{l'_{0}}+D\bar{x}(t)_{l'_{0}}+\alpha l'_{0}(t)\right]\\&dt+\sigma {dW}_{i}(t),\\ \hat{l}_{i}(0)=& x_{i0}, \end{aligned} \right. $$
where \((k(t)_{l'_{0}},\bar {x}(t)_{l'_{0}})\) associated with \(l^{\prime }_{0}\) satisfy
$$ \left\{ \begin{aligned} &dk(t)_{l'_{0}}=\left[\left(-A+B^{2}R^{-1}P(t)\right)k(t)_{l'_{0}}+\left(Q-DP(t)\right)\bar{x}(t)_{l'_{0}}-\alpha P(t)l'_{0}(t)\right]dt\\ &\qquad\qquad\quad+\theta_{0}(t)_{l'_{0}}{dW}_{0}(t),\\ &d\bar{x}(t)_{l'_{0}}=\left[\left(A+D-B^{2}R^{-1}P(t)\right)\bar{x}(t)_{l'_{0}}-B^{2}R^{-1}k(t)_{l'_{0}}+\alpha l'_{0}(t)\right]dt,\\ &k(T)_{l'_{0}}=0,\ \bar{x}(0)_{l'_{0}}=x. \end{aligned} \right. $$

Then we have

Lemma 4.3

$$\begin{array}{@{}rcl@{}} \sup_{0\leq t\leq T}\mathbb{E}\Big|l^{(N)}(t)-\bar{x}(t)_{l'_{0}}\Big|^{2}=O\left(\frac{1}{N}\right), \end{array} $$
$$\begin{array}{@{}rcl@{}} \Big|J_{0}(u_{0},\tilde{u}_{-0})-\bar{J}_{0}(u_{0})\Big|=O\left(\frac{1}{\sqrt{N}}\right). \end{array} $$


Proof. From (47) and (53), by the existence and uniqueness of solutions to BSDEs, for the same perturbed control \(u_{0}(\cdot)\), we have \((l^{'}_{0},q'_{0})=(l_{0},q_{0})\). Further, noting FBSDEs (49) and (55), we get \((k(t)_{l^{'}_{0}},\bar {x}(t)_{l'_{0}})=(k(t)_{l_{0}},\bar {x}(t)_{l_{0}})\).

It follows from (48) that
$$\left\{ \begin{aligned} dl^{(N)}(t)=&\left[\left(A+D-B^{2}R^{-1}P(t)\right)l^{(N)}(t)-B^{2}R^{-1}k(t)_{l_{0}}+\alpha l_{0}(t)\right]\\&dt+\frac{1}{N}\sum\limits_{i=1}^{N}\sigma {dW}_{i}(t),\\ l^{(N)}(0)=& x^{(N)}_{0}. \end{aligned}\right. $$
Noting (55) and
$$ \begin{aligned} \mathbb{E}\Big|x^{(N)}_{0}-x\Big|^{2}\sim\mathbb{E}\left|{\int_{0}^{t}}\frac{1}{N}\sum\limits_{i=1}^{N}\sigma {dW}_{i}(s)\right|^{2}=O\left(\frac{1}{N}\right), \end{aligned} $$

and applying Gronwall’s inequality, we get (56). Using the same technique as Lemma 4.1 and noting \(\sup \limits _{0\leq t\leq T}\mathbb {E}\big |l^{'}_{0}(t)-\bar {x}(t)_{l^{'}_{0}}\big |^{2}<+\infty \), we obtain (57). □

Now, we focus on the difference between the states and cost functionals under the perturbed control and under the optimal control of the minor agents. Consider the system of \(\mathcal {A}_{i}\) under the control \(u_{i}\) for (II),
$$ \left\{ \begin{aligned} dm'_{i}(t)=& \left[Am'_{i}(t)+{Bu}_{i}(t)+D\bar{x}(t)_{\hat{x}_{0}}+\alpha \hat{x}_{0}(t)\right]dt+\sigma {dW}_{i}(t),\\ m'_{i}(0)=& x_{i0} \end{aligned} \right. $$
and for agent \(\mathcal {A}_{j}, 1\leq j\leq N, j\neq i\),
$$ \left\{ \begin{aligned} d\hat{m}_{j}(t)=& \left[(A-B^{2}R^{-1}P(t))\hat{m}_{j}(t)-B^{2}R^{-1}k(t)_{\hat{x}_{0}}+D\bar{x}(t)_{\hat{x}_{0}}+\alpha \hat{x}_{0}(t)\right]\\&dt+\sigma {dW}_{j}(t),\\ \hat{m}_{j}(0)=& x_{j0}, \end{aligned} \right. $$

where \((\bar {x}(t)_{\hat {x}_{0}},k(t)_{\hat {x}_{0}})\) satisfies (17).

In order to give necessary estimates in (I) and (II), we need to introduce some intermediate states as
$$\left\{ \begin{aligned} d\check{m}_{i}(t)=&\left[A\check{m}_{i}(t)+{Bu}_{i}(t)+\frac{N-1}{N}D \check{m}^{(N-1)}(t)+\alpha \tilde{x}_{0}(t)\right]dt +\sigma {dW}_{i}(t),\\ \check{m}_{i}(0)=&x_{i0} \end{aligned}\right. $$
and for \(1\leq j\leq N\), \(j\neq i\),
$${}\left\{ \begin{aligned} d\check{m}_{j}(t)=&\left[\left(A-B^{2}R^{-1}P(t)\right)\check{m}_{j}(t)-B^{2}R^{-1}k(t)_{\tilde{x}_{0}}+\frac{N-1}{N}D \check{m}^{(N-1)}(t)+\alpha \tilde{x}_{0}(t)\right]dt\\ &+\sigma {dW}_{j}(t),\\ \check{m}_{j}(0)=&x_{j0}, \end{aligned}\right. $$

where \(\check {m}^{(N-1)}(t)=\frac {1}{N-1}\sum \limits _{j=1,j\neq i}^{N}\check {m}_{j}(t)\).

Define \(m^{(N-1)}(t):=\frac {1}{N-1}\sum \limits _{j=1,j\neq i}^{N}m_{j}(t)\), \(x^{(N-1)}_{0}:=\frac {1}{N-1}\sum \limits _{j=1,j\neq i}^{N}x_{j0}\). By (51) and (61), we get
$${}\left\{ \begin{aligned} \!dm^{(N-1)}(t)=&\left[\!\left(\!A-B^{2}R^{-1}P(t)\,+\,\frac{N-1}{N}D\!\right)m^{(N-1)}(t)-\!B^{2}R^{-1}k(t)_{\tilde{x}_{0}}+\alpha \tilde{x}_{0}(t)\right.\\ &\qquad\left.+\frac{D}{N}m_{i}(t)\right]dt+\frac{1}{N-1}\sum\limits_{j=1,j\neq i}^{N}\sigma {dW}_{j}(t),\\ m^{(N-1)}(0)=&x^{(N-1)}_{0} \end{aligned}\right. $$
$$ \left\{ \begin{aligned} \!d\check{m}^{(N-1)}(t)\!=&\left[\!\left(\!A-\!B^{2}R^{-1}P(t)+\frac{N-1}{N}D\!\right)\check{m}^{(N-1)}(t)\,-\,B^{2}R^{-1}k(t)_{\tilde{x}_{0}}\!+\alpha \tilde{x}_{0}(t)\!\right]dt\\ &+\frac{1}{N-1}\sum\limits_{j=1,j\neq i}^{N}\sigma {dW}_{j}(t),\\ \check{m}^{(N-1)}(0)=&x^{(N-1)}_{0}. \end{aligned}\right. $$

Then we have the following proposition.

Proposition 4.2

$$\begin{array}{@{}rcl@{}} \sup_{0\leq t\leq T} & \mathbb{E}\Big|m^{(N-1)}(t)-\check{m}^{(N-1)}(t)\Big|^{2}=O\left(\frac{1}{N^{2}}\right), \end{array} $$
$$\begin{array}{@{}rcl@{}} \sup_{0\leq t\leq T} & \!\!\!\!\!\!\!\! \mathbb{E}\Big|m^{(N)}(t)-m^{(N-1)}(t)\Big|^{2}=O\left(\frac{1}{N}\right), \end{array} $$
$$\begin{array}{@{}rcl@{}} \sup_{0\leq t\leq T} & \!\!\!\!\!\!\!\!\!\!\!\!\! \mathbb{E}\Big|\check{m}^{(N-1)}(t)-\bar{x}(t)_{\hat{x}_{0}}\Big|^{2}=O\left(\frac{1}{N}\right). \end{array} $$


Proof. From (62)–(63), applying Proposition 4.1 and Gronwall’s inequality, the assertion (64) holds. (65) follows from (H1) and the \(L^{2}\)-boundedness of the controls \(u_{i}(\cdot)\) and \(\tilde {u}_{j}(\cdot),j\neq i.\) From (63) and (17), noting \((\bar {x}(t)_{\tilde {x}_{0}},k(t)_{\tilde {x}_{0}},\tilde {x}_{0})=(\bar {x}(t)_{\hat {x}_{0}},k(t)_{\hat {x}_{0}},\hat {x}_{0})\), we get
$$\left\{ \begin{aligned} d\left(\check{m}^{(N-1)}(t)-\bar{x}(t)_{\hat{x}_{0}}\right)=&\left[\left(A-B^{2}R^{-1}P(t)+\frac{N-1}{N}D\right)\left(\check{m}^{(N-1)}(t)-\bar{x}(t)_{\hat{x}_{0}}\right)-\frac{D}{N}\bar{x}(t)_{\hat{x}_{0}}\right]dt\\ &+\frac{1}{N-1}\sum\limits_{j=1,j\neq i}^{N}\sigma {dW}_{j}(t),\\ \check{m}^{(N-1)}(0)-\bar{x}(0)_{\hat{x}_{0}}=&x^{(N-1)}_{0}-x. \end{aligned}\right. $$

Therefore (66) is obtained. □

Based on Proposition 4.2, we obtain more direct estimates to prove Theorem 4.1.

Lemma 4.4

For fixed i, \(1\leq i\leq N\), we have
$$\begin{array}{@{}rcl@{}} \sup_{0\leq t\leq T} & \mathbb{E}\Big|m^{(N)}(t)-\bar{x}(t)_{\hat{x}_{0}}\Big|^{2}=O\left(\frac{1}{N}\right), \end{array} $$
$$\begin{array}{@{}rcl@{}} \sup_{0\leq t\leq T} & \!\!\!\!\!\!\! \mathbb{E}\Big|m_{i}(t)-m'_{i}(t)\Big|^{2}=O\left(\frac{1}{N}\right), \end{array} $$
$$\begin{array}{@{}rcl@{}} & {} \Big|J_{i}(u_{i},\tilde{u}_{-i})-\bar{J}_{i}(u_{i})\Big| =O\left(\frac{1}{\sqrt{N}}\right). \end{array} $$


Proof. (67) follows directly from Proposition 4.2. From (50) and (58), we get (68) by applying (67). Further, we have
$$\begin{aligned} \sup_{0\leq t\leq T}\mathbb{E}\Big||m_{i}(t)|^{2}-|m'_{i}(t)|^{2}\Big|=O\left(\frac{1}{\sqrt{N}}\right). \end{aligned} $$
In addition,
$${} \begin{aligned} &\sup_{0\leq t\leq T}\mathbb{E}\left|\left(m_{i}(t)-m^{(N)}(t)\right)^{2}-\left(m^{'}_{i}(t)-\bar{x}(t)_{\hat{x}_{0}}\right)^{2}\right|\\ \leq&\sup_{0\leq t\leq T}\mathbb{E}\Big|m_{i}(t)-m^{'}_{i}(t)-\left(m^{(N)}(t)-\bar{x}(t)_{\hat{x}_{0}}\right)\Big|^{2}\\ &\!\!+2\!\left(\sup_{0\leq t\leq T}\!\mathbb{E}\big|m^{'}_{i}(t)\,-\, \bar{x}(t)_{\hat{x}_{0}}\big|^{2}\!\right)^{\!\frac{1}{2}}\!\!\left(\sup_{0\leq t\leq T}\!\mathbb{E}\big|m_{i}(t) \,-\, m^{'}_{i}(t) \,-\, \left(\! m^{(N)}(t) \,-\, \bar{x}(t)_{\hat{x}_{0}}\right)\big|^{2}\!\right)^{\frac{1}{2}}\\ =\,&O\left(\frac{1}{\sqrt{N}}\right). \end{aligned} $$
Then we have
$$\begin{aligned} &\Big|J_{i}(u_{i},\tilde{u}_{-i})-\bar{J}_{i}(u_{i})\Big|\\ \leq&\quad\frac{1}{2}\mathbb{E}{\int_{0}^{T}} Q\Big|\left(m_{i}(t)-m^{(N)}(t)\right)^{2}-\left(m'_{i}(t)-\bar{x}(t)_{\hat{x}_{0}}\right)^{2}\Big|dt\\ &+ \frac{1}{2}H\mathbb{E}\Big|{m_{i}^{2}}(T)-\left(m'_{i}(T)\right)^{2}\Big|\\ =\,& O\left(\frac{1}{\sqrt{N}}\right), \end{aligned} $$
which implies (69). □
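
The two \(O\left(\frac{1}{\sqrt{N}}\right)\) bounds in the proof above rest on one elementary inequality, spelled out here as a sketch. The symbols \(a\) and \(b\) denote generic square-integrable scalar random variables and are introduced only for this illustration. Since \(a^{2}-b^{2}=(a-b)^{2}+2b(a-b)\), the Cauchy-Schwarz inequality gives
$$\begin{aligned} \mathbb{E}\big|a^{2}-b^{2}\big|\leq\mathbb{E}|a-b|^{2}+2\left(\mathbb{E}|b|^{2}\right)^{\frac{1}{2}}\left(\mathbb{E}|a-b|^{2}\right)^{\frac{1}{2}}. \end{aligned} $$
Taking \(a=m_{i}(t)\) and \(b=m'_{i}(t)\), and assuming that \(\sup_{0\leq t\leq T}\mathbb{E}|m'_{i}(t)|^{2}\) is bounded uniformly in \(N\), the estimate (68) makes the first term \(O\left(\frac{1}{N}\right)\) and the second term \(O\left(\frac{1}{\sqrt{N}}\right)\), so the sum is \(O\left(\frac{1}{\sqrt{N}}\right)\). The same inequality with \(a=m_{i}(t)-m^{(N)}(t)\) and \(b=m'_{i}(t)-\bar{x}(t)_{\hat{x}_{0}}\) gives the second displayed estimate of the proof.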

Proof of Theorem 4.1

Now, we consider the ε-Nash equilibrium for \(\mathcal {A}_{0}\) and \(\mathcal {A}_{i},1\leq i\leq N\). Combining (42) and (57), we have
$$\begin{aligned} J_{0}(\tilde{u}_{0},\tilde{u}_{-0})&=\bar{J}_{0}(\bar{u}_{0})+O\left(\frac{1}{\sqrt{N}}\right)\\ &\leq \bar{J}_{0}(u_{0})+O\left(\frac{1}{\sqrt{N}}\right)\\ &=J_{0}(u_{0},\tilde{u}_{-0})+O\left(\frac{1}{\sqrt{N}}\right). \end{aligned} $$
It follows from (46) and (69) that
$$\begin{aligned} J_{i}(\tilde{u}_{i},\tilde{u}_{-i})&=\bar{J}_{i}(\bar{u}_{i})+O\left(\frac{1}{\sqrt{N}}\right)\\ &\leq \bar{J}_{i}(u_{i})+O\left(\frac{1}{\sqrt{N}}\right)\\ &=J_{i}(u_{i},\tilde{u}_{-i})+O\left(\frac{1}{\sqrt{N}}\right). \end{aligned} $$

Thus, Theorem 4.1 follows by taking \(\epsilon =O\left (\frac {1}{\sqrt {N}}\right)\). □
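
To make the final step explicit, the chain of inequalities for a minor agent can be read with named constants. This is a sketch only: \(C_{1}\) and \(C_{2}\) are hypothetical constants, independent of \(N\), assumed to bound the approximation errors uniformly over the admissible alternative controls \(u_{i}\). If \(|J_{i}(\tilde{u}_{i},\tilde{u}_{-i})-\bar{J}_{i}(\bar{u}_{i})|\leq C_{1}/\sqrt{N}\), \(|J_{i}(u_{i},\tilde{u}_{-i})-\bar{J}_{i}(u_{i})|\leq C_{2}/\sqrt{N}\), and \(\bar{J}_{i}(\bar{u}_{i})\leq \bar{J}_{i}(u_{i})\) by the optimality of \(\bar{u}_{i}\) for the limiting problem, then
$$\begin{aligned} J_{i}(\tilde{u}_{i},\tilde{u}_{-i})&\leq \bar{J}_{i}(\bar{u}_{i})+\frac{C_{1}}{\sqrt{N}}\\ &\leq \bar{J}_{i}(u_{i})+\frac{C_{1}}{\sqrt{N}}\\ &\leq J_{i}(u_{i},\tilde{u}_{-i})+\frac{C_{1}+C_{2}}{\sqrt{N}}, \end{aligned} $$
so one may take \(\epsilon=(C_{1}+C_{2})/\sqrt{N}\), which tends to zero as \(N\rightarrow\infty\). The same reading applies to the major agent \(\mathcal{A}_{0}\).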

Conclusion and future work

In this paper, we have studied mean-field linear-quadratic (LQ) games with major and minor agents in a backward-forward setup. The main features of our work, in contrast to the existing mean-field game literature, are as follows. (1) The major and minor agents are endowed with different objective patterns: the major agent (say, the local government) aims to fulfill some prescribed future target, and thus faces a “backward” LQ problem of minimizing the initial endowment. The minor agents (say, individual producers or firms) still face a family of “forward” LQ problems, but their state-average is affected by the major agent’s state. (2) Accordingly, the state of the major agent satisfies a backward stochastic differential equation (BSDE), while the states of the minor agents are modeled by (forward) stochastic differential equations (SDEs). (3) To derive the decentralized strategies, the mean-field game is formulated in the backward-forward, major-minor framework. An auxiliary mean-field SDE and a mixed backward-forward stochastic differential equation (BFSDE) are thus introduced and analyzed. An essential feature of the BFSDE, compared to the forward-backward SDE (FBSDE), is that it admits no feasible decoupling structure via the traditional Riccati equations. This feature brings some technical difficulties to our analysis and a new structure to our strategies (specifically, the major agent’s strategy is open-loop, whereas the minor agents’ strategies are still closed-loop). (4) In contrast to other mean-field games, the consistency condition is not analyzed directly via fixed-point analysis and contraction mapping. Instead, it is connected to the well-posedness of the mixed BFSDE system and is obtained under some weak monotonicity conditions. The decentralized strategies are also verified to satisfy the ε-Nash equilibrium property; for this purpose, some estimates of the BFSDE are applied.

In the future, one possible direction is to let the state-average appear in the dynamics of the major player, which may make the ε-Nash equilibrium property considerably harder to prove. The well-posedness of the corresponding 3×2 mixed FBSDE system is also worth investigating. Another direction is to formulate the dynamics of the minor players by BSDEs. In this case, the analysis of the consistency condition may be more complicated and technical difficulties may arise. Numerical computation and other applications in finance will also be investigated in future work.



J. Huang acknowledges partial financial support from RGC Grants 502412, 15300514 and G-YL04. Z. Wu acknowledges the Natural Science Foundation of China (61573217), the 111 Project (B12023), the National High-level Personnel of Special Support Program and the Chang Jiang Scholar Program of the Chinese Education Ministry.

Authors’ contributions

JH carried out the problem formulation and the mean-field backward-forward system analysis, and participated in the arguments for the approximate Nash equilibrium and the fixed-point analysis of the certainty principle. SW carried out the derivation of the consistency condition and its well-posedness analysis, and participated in drafting the manuscript. ZW carried out the related maximum principle and decentralized optimality analysis, and participated in the analysis of the consistency condition and its connection to the forward-backward stochastic system. All authors read and approved the final manuscript.

Competing interests

The authors declare that they have no competing interests.

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Authors’ Affiliations

Department of Applied Mathematics, The Hong Kong Polytechnic University
School of Mathematics, Shandong University


  1. Andersson, D, Djehiche, B: A maximum principle for SDEs of mean-field type. Appl. Math. Optim. 63, 341–356 (2011).
  2. Antonelli, F: Backward-forward stochastic differential equations. Ann. Appl. Probab. 3, 777–793 (1993).
  3. Bardi, M: Explicit solutions of some linear-quadratic mean field games. Netw. Heterog. Media. 7, 243–261 (2012).
  4. Bensoussan, A, Sung, K, Yam, S, Yung, S: Linear-quadratic mean-field games. J. Optim. Theory Appl. 169, 496–529 (2016).
  5. Bismut, J: An introductory approach to duality in optimal stochastic control. SIAM Rev. 20, 62–78 (1978).
  6. Buckdahn, R, Cardaliaguet, P, Quincampoix, M: Some recent aspects of differential game theory. Dynam Games Appl. 1, 74–114 (2010).
  7. Buckdahn, R, Djehiche, B, Li, J: A general stochastic maximum principle for SDEs of mean-field type. Appl. Math. Optim. 64, 197–216 (2011).
  8. Buckdahn, R, Djehiche, B, Li, J, Peng, S: Mean-field backward stochastic differential equations: a limit approach. Ann. Probab. 37, 1524–1565 (2009a).
  9. Buckdahn, R, Li, J, Peng, S: Mean-field backward stochastic differential equations and related partial differential equations. Stoch. Process. Appl. 119, 3133–3154 (2009b).
  10. Buckdahn, R, Li, J, Peng, S: Nonlinear stochastic differential games involving a major player and a large number of collectively acting minor agents. SIAM J. Control Optim. 52, 451–492 (2014).
  11. Carmona, R, Delarue, F: Probabilistic analysis of mean-field games. SIAM J. Control Optim. 51, 2705–2734 (2013).
  12. Cvitanić, J, Ma, J: Hedging options for a large investor and forward-backward SDE’s. Ann. Appl. Probab. 6, 370–398 (1996).
  13. Duffie, D, Epstein, L: Stochastic differential utility. Econometrica. 60, 353–394 (1992).
  14. El Karoui, N, Peng, S, Quenez, M: Backward stochastic differential equations in finance. Math. Finance. 7, 1–71 (1997).
  15. Espinosa, G, Touzi, N: Optimal investment under relative performance concerns. Math. Finance. 25, 221–257 (2015).
  16. Guéant, O, Lasry, J-M, Lions, P-L: Mean field games and applications. Paris-Princeton Lectures on Mathematical Finance. Springer, Berlin (2010).
  17. Huang, M: Large-population LQG games involving a major player: the Nash certainty equivalence principle. SIAM J. Control Optim. 48, 3318–3353 (2010).
  18. Huang, M, Caines, P, Malhamé, R: Large-population cost-coupled LQG problems with non-uniform agents: individual-mass behavior and decentralized ε-Nash equilibria. IEEE Trans. Autom. Control. 52, 1560–1571 (2007).
  19. Huang, M, Caines, P, Malhamé, R: Social optima in mean field LQG control: centralized and decentralized strategies. IEEE Trans. Autom. Control. 57, 1736–1751 (2012).
  20. Huang, M, Malhamé, R, Caines, P: Large population stochastic dynamic games: closed-loop McKean-Vlasov systems and the Nash certainty equivalence principle. Commun. Inf. Syst. 6, 221–251 (2006).
  21. Hu, Y, Peng, S: Solution of forward-backward stochastic differential equations. Proba. Theory Rel. Fields. 103, 273–283 (1995).
  22. Lasry, J-M, Lions, P-L: Mean field games. Japan J. Math. 2, 229–260 (2007).
  23. Li, T, Zhang, J: Asymptotically optimal decentralized control for large population stochastic multiagent systems. IEEE Trans. Autom. Control. 53, 1643–1660 (2008).
  24. Lim, E, Zhou, XY: Linear-quadratic control of backward stochastic differential equations. SIAM J. Control Optim. 40, 450–474 (2001).
  25. Ma, J, Protter, P, Yong, J: Solving forward-backward stochastic differential equations explicitly-a four step scheme. Proba. Theory Rel. Fields. 98, 339–359 (1994).
  26. Ma, J, Wu, Z, Zhang, D, Zhang, J: On well-posedness of forward-backward SDEs-a unified approach. Ann. Appl. Probab. 25, 2168–2214 (2015).
  27. Ma, J, Yong, J: Forward-Backward Stochastic Differential Equations and Their Applications. Springer-Verlag, Berlin Heidelberg (1999).
  28. Nguyen, S, Huang, M: Linear-quadratic-Gaussian mixed games with continuum-parametrized minor players. SIAM J. Control Optim. 50, 2907–2937 (2012).
  29. Nourian, M, Caines, P: ε-Nash mean field game theory for nonlinear stochastic dynamical systems with major and minor agents. SIAM J. Control Optim. 51, 3302–3331 (2013).
  30. Pardoux, E, Peng, S: Adapted solution of backward stochastic equation. Syst. Control Lett. 14, 55–61 (1990).
  31. Peng, S, Wu, Z: Fully coupled forward-backward stochastic differential equations and applications to optimal control. SIAM J. Control Optim. 37, 825–843 (1999).
  32. Wang, G, Wu, Z: The maximum principles for stochastic recursive optimal control problems under partial information. IEEE Trans. Autom. Control. 54, 1230–1242 (2009).
  33. Wu, Z: A general maximum principle for optimal control of forward-backward stochastic systems. Automatica. 49, 1473–1480 (2013).
  34. Yong, J: Finding adapted solutions of forward-backward stochastic differential equations: method of continuation. Proba. Theory Rel. Fields. 107, 537–572 (1997).
  35. Yong, J: Optimality variational principle for controlled forward-backward stochastic differential equations with mixed initial-terminal conditions. SIAM J. Control Optim. 48, 4119–4156 (2010).
  36. Yong, J, Zhou, XY: Stochastic Controls: Hamiltonian Systems and HJB Equations. Springer-Verlag, New York (1999).
  37. Yu, Z: Linear-quadratic optimal control and nonzero-sum differential game of forward-backward stochastic system. Asian J. Control. 14, 173–185 (2012).


© The Author(s) 2016