# Backward-forward linear-quadratic mean-field games with major and minor agents

Jianhui Huang^{1}, Shujun Wang^{2} and Zhen Wu^{2}

**1**:8

https://doi.org/10.1186/s41546-016-0009-9

© The Author(s) 2016

**Received: **4 April 2016

**Accepted: **12 September 2016

**Published: **1 December 2016

## Abstract

This paper studies the backward-forward linear-quadratic-Gaussian (LQG) games with major and minor agents (players). The state of the major agent follows a linear *backward* stochastic differential equation (BSDE) and the states of the minor agents are governed by linear *forward* stochastic differential equations (SDEs). The major agent is dominating as its state enters those of the minor agents. On the other hand, all minor agents are individually negligible but their state-average affects the cost functional of the major agent. The mean-field game in such a *backward-major* and *forward-minor* setup is formulated to analyze the decentralized strategies. We first derive the consistency condition via an auxiliary mean-field SDE and a 3×2 mixed backward-forward stochastic differential equation (BFSDE) system. Next, we discuss the wellposedness of this BFSDE system by virtue of the monotonicity method. Consequently, we obtain the decentralized strategies for the major and minor agents, which are proved to satisfy the *ε*-Nash equilibrium property.

### Keywords

Backward-forward stochastic differential equation (BFSDE); Consistency condition; *ε*-Nash equilibrium; Large-population system; Major-minor agent; Mean-field game

## Introduction

*major* agent \(\mathcal {A}_{0}\) and *minor* agents \(\{\mathcal {A}_{i}\}_{i=1}^{N}\):

\(J_{0}\), \(J_{i}\) as follows:

Formal assumptions on coefficients of states and costs will be given later. As addressed in (Carmona and Delarue 2013) and (Nourian and Caines 2013), the standard procedure of MFG (without \(\mathcal {A}_{0}\)) mainly consists of the following steps:

(**Step i**) Replace the state-average limit \({\lim }_{N\longrightarrow +\infty } x^{(N)}\) by a frozen process \(\bar {x}\) and formulate an auxiliary stochastic control problem for \(\mathcal {A}_{i}\) which is parameterized by \(\bar {x}\).

(**Step ii**) Solve the above auxiliary stochastic control problem to obtain the decentralized optimal state \(\bar {x}_{i}\) (which should depend on the undetermined process \(\bar {x}\), hence denoted by \(\bar {x}_{i}(\bar {x})\)).

(**Step iii**) Determine \(\bar {x}\) by the fixed-point argument: \({\lim }_{N\longrightarrow +\infty } \frac {1}{N}\sum _{i=1}^{N}\bar {x}_{i} (\bar {x})=\bar {x}\).

As to the MFG with major-minor agent \((\mathcal {A}_{0}, \mathcal {A}_{i})\), **Step (ii)** can be further divided into:

(**Step ii-a**) First, solve the decentralized control problem for \(\mathcal {A}_{0}\) by replacing \(x^{(N)}\) with \(\bar {x}\). The related decentralized optimal state is denoted by \(\bar {x}_{0}(\bar {x})\) and the optimal control by \(\bar {u}_{0}(\bar {x})\).

(**Step ii-b**) Second, given \(\bar {x}_{0}(\bar {x})\) and \(\bar {u}_{0}(\bar {x})\) of \(\mathcal {A}_{0}\), solve the auxiliary stochastic control problem for \(\mathcal {A}_{i}\). The related decentralized state \(\bar {x}_{i}\) of \(\mathcal {A}_{i}\) should depend on \((\bar {x}, \bar {x}_{0}(\bar {x}))\), hence it is denoted by \(\bar {x}_{i}(\bar {x}, \bar {x}_{0}(\bar {x}))\).

(**Step iii**) is thus revised to fixed-point argument: \({\lim }_{N\longrightarrow +\infty } \frac {1}{N}\sum _{i=1}^{N}\bar {x}_{i} (\bar {x}, \bar {x}_{0}(\bar {x}))=\bar {x}.\)
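The freeze, best-respond and re-aggregate loop of Steps (i)–(iii) can be illustrated by a minimal numerical sketch. The toy model below is an assumption made purely for illustration and is not the system studied in this paper: the averaged best response to a frozen path \(\bar {x}\) is taken to solve the deterministic ODE \(\dot m=(a-b)m+d\,\bar {x}\), \(m(0)=x_{0}\), and the function names and the coefficients `a`, `b`, `d` are hypothetical.

```python
def best_response_mean(x_bar, x0, a, b, d, dt):
    """Steps (i)-(ii): with x_bar frozen, Euler-integrate the averaged
    best-response state m' = (a - b) * m + d * x_bar, m(0) = x0.
    This toy dynamics is an illustrative assumption, not the paper's model."""
    m = [x0]
    for n in range(len(x_bar) - 1):
        m.append(m[n] + dt * ((a - b) * m[n] + d * x_bar[n]))
    return m


def solve_consistency(x0=1.0, a=0.2, b=0.5, d=0.4, horizon=1.0,
                      n_steps=200, tol=1e-12, max_iter=500):
    """Step (iii): iterate x_bar -> averaged best response until the
    frozen path reproduces itself (a discrete fixed point)."""
    dt = horizon / n_steps
    x_bar = [x0] * (n_steps + 1)          # initial guess for the frozen path
    for iteration in range(1, max_iter + 1):
        new_path = best_response_mean(x_bar, x0, a, b, d, dt)
        gap = max(abs(u - v) for u, v in zip(new_path, x_bar))
        x_bar = new_path
        if gap < tol:
            return x_bar, iteration
    raise RuntimeError("fixed-point iteration did not converge")


if __name__ == "__main__":
    path, n_iter = solve_consistency()
    print("iterations:", n_iter, "x_bar(T):", path[-1])
```

In this toy setting the iteration settles on the Euler discretization of \(\dot {\bar {x}}=(a-b+d)\bar {x}\), which plays the role of the consistency condition. In the major-minor case the best-response map would additionally take the major agent's state \(\bar {x}_{0}(\bar {x})\) as an input.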

The MFG with major-minor agents has been extensively studied: for example, Huang (2010) discussed MFG with a major agent and heterogeneous minor agents parameterized by finite *K* classes; Nguyen and Huang (2012) further considered MFG with heterogeneous minor agents parameterized by a continuum index set; Nourian and Caines (2013) studied MFG for nonlinear large population systems involving major-minor agents; Buckdahn et al. (2014) discussed the MFG with major-minor agents in weak formulation where the “feedback control against feedback control” strategies are studied.

Unlike a forward SDE with given initial condition \(x_{0}\), a BSDE has its terminal condition *ξ* specified a priori, and its solution is a pair of adapted processes \((x_{0},z_{0})\). Linear BSDEs were first introduced by Bismut (1978) and general nonlinear BSDEs were first studied in Pardoux and Peng (1990). BSDEs have been applied broadly in many fields such as mathematical economics and finance, decision making and management science. One example is the representation of stochastic differential recursive utility by a class of BSDEs (Duffie and Epstein (1992), El Karoui et al. (1997), Wang and Wu (2009), etc.). A BSDE coupled with an SDE through their terminal conditions forms a forward-backward stochastic differential equation (FBSDE). FBSDEs have also been well studied and interested readers may refer to Antonelli (1993), Cvitanić and Ma (1996), Hu and Peng (1995), Ma et al. (1994, 2015), Ma and Yong (1999), Peng and Wu (1999), Wu (2013), Yong (1997, 2010), Yong and Zhou (1999), Yu (2012) and the references therein for more details.

The modeling of the major agent by a BSDE and the minor agents by forward SDEs is well motivated and can be illustrated by the following example. In a natural resource exploitation industry, there exist a large number of small exploitation firms \(\{\mathcal {A}_{i}\}_{i=1}^{N}\) which are more aggressive in their business activities. Accordingly, their cost functionals are based on forward SDEs with given initial conditions. Here, these initial conditions can be interpreted as their initial investments or deposits for exploitation licenses. On the other hand, the major agent \(\mathcal {A}_{0}\) acts as some dominating administration party such as a local government or regulation bureau. As the administrator, \(\mathcal {A}_{0}\) is more conservative, hence its state can be modeled by a linear BSDE for which the terminal condition is specified. Such a terminal condition can be interpreted as a future target or objective such as tax revenue from the exploitation industry, or an environmental protection index related to the natural resource.

The modeling of backward-major and forward-minors will yield a large-population system with a backward-forward stochastic differential equation (BFSDE), which is structurally different from an FBSDE in the following aspects. First, the forward and backward equations will be coupled in their initial instead of terminal conditions. Second, unlike an FBSDE, there is no feasible decoupling structure by the standard Riccati equations, as addressed in Lim and Zhou (2001). This is mainly because some implicit constraints in initial conditions should be satisfied in the possible decoupling.

The introduction of BFSDE also brings some technical differences to its MFG studies. First, as addressed in (**Step i**), the state-average limit of minor agents will be frozen. Then, by (**ii-a**), the optimal state of the major agent should follow a BFSDE system. This is because the major state follows some BSDE, thus its adjoint process should be a forward SDE. These two equations will be further coupled in their initial conditions. Therefore, we will get some BFSDE instead of the classical FBSDE from the standard *forward major-forward minor* MFG. Next, as suggested by (**ii-b**), the given minor agent will solve some optimal control problem with augmented state: its own state, the state-average limit, and the optimal state of the major agent from (**ii-a**), which is a BFSDE. The minor agent’s optimal control should involve some feedback of this augmented state. In this way, the minor’s optimal state will be represented through some coupled system of its own state, the major agent’s state, the state-average limit as well as one inhomogeneous equation (which is another BSDE because the state-average limit depends on the major agent’s state, thus it should be a random process in general). Last, as specified in (**iii**), averaging all individual minor agents’ states should recover the state-average limit frozen in (**i**). Consequently, a more complicated consistency condition system should be derived in our current backward major-forward minor setup.

Based on the above scheme, the mean-field LQG games for the backward-major and forward-minor system proceed rather differently from the standard MFG analysis for forward major-minor systems. In particular, the decentralized strategies for major and minor agents will be based on a new consistency condition (see our analysis in Section “The limiting optimal control and NCE equation system”). Accordingly, a stochastic process related to the state of the major player is introduced here to approximate the state-average. An auxiliary mean-field SDE and a 3×2 FBSDE system are introduced and analyzed. Here, the 3×2 FBSDE, which is also called a triple FBSDE, comprises three forward and three backward equations. Applying the monotonicity method in Peng and Wu (1999) and Yu (2012), we obtain the wellposedness of this FBSDE. In addition, the decoupling of the backward-forward SDE using the Riccati equation is also different from that of the standard forward-backward SDE. The *ε*-Nash equilibrium property of the decentralized control strategy with \(\epsilon =O(1/\sqrt N)\) is also derived.

The rest of this paper is organized as follows. Section “Preliminaries and problem formulation” formulates the large population LQG games of backward-forward systems. In Section “The limiting optimal control and NCE equation system”, the limiting optimal controls of the track systems and consistency conditions are derived. Section “*ε*-Nash equilibrium analysis” is devoted to the related *ε*-Nash equilibrium property. Section “Conclusion and future work” serves as a conclusion to our study.

## Preliminaries and problem formulation

Throughout this paper, we denote by \(\mathbb {R}^{m}\) the *m*-dimensional Euclidean space. Consider a finite time horizon [0,*T*] for a fixed *T*>0. Suppose \((\Omega, \mathcal F, \{\mathcal F_{t}\}_{0\leq t\leq T}, P)\) is a complete filtered probability space on which a standard (*d*+*m*×*N*)-dimensional Brownian motion \(\{W_{0}(t),W_{i}(t),\ 1\leq i\leq N\}_{0\leq t\leq T}\) is defined. We define \(\mathcal F^{w_{0}}_{t}:=\sigma \{W_{0}(s), 0\leq s\leq t\}, \mathcal F^{w_{i}}_{t}:=\sigma \{W_{i}(s), 0\leq s\leq t\}, \mathcal {F}^{i}_{t}:=\sigma \{W_{0}(s),W_{i}(s);0\leq s\leq t\}\). Here, \(\{\mathcal F^{w_{0}}_{t}\}_{0\leq t\leq T}\) represents the information of the major player, while \(\{\mathcal F^{w_{i}}_{t}\}_{0\leq t\leq T}\) represents the individual information of the \(i^{th}\) minor player. For a given filtration \(\{\mathcal G_{t}\}_{0\leq t\leq T},\) let \(L^{2}_{\mathcal {G}_{t}}(0, T; \mathbb {R}^{m})\) denote the space of all \(\mathcal {G}_{t}\)-progressively measurable processes with values in \(\mathbb {R}^{m}\) satisfying \(\mathbb {E}{\int _{0}^{T}}|x(t)|^{2}dt<+\infty\); let \(L^{2}(0, T; \mathbb {R}^{m})\) denote the space of all deterministic functions defined on [0,*T*] in \(\mathbb {R}^{m}\) satisfying \({\int _{0}^{T}}|x(t)|^{2}dt<+\infty\); and let \(C(0,T;\mathbb {R}^{m})\) denote the space of all continuous functions defined on [0,*T*] in \(\mathbb {R}^{m}\). For simplicity, in what follows we focus on the 1-dimensional processes, which means *d*=*m*=1.

Consider a large-population system with (1+*N*) individual agents, denoted by \(\mathcal {A}_{0}\) and \(\{\mathcal {A}_{i}\}_{1 \leq i \leq N},\) where \(\mathcal {A}_{0}\) stands for the major player, while \(\mathcal {A}_{i}\) stands for the \(i^{th}\) minor player. For the sake of illustration, we restate the states of the major and minor agents as follows, and give the necessary assumptions on the coefficients. The dynamics of \(\mathcal {A}_{0}\) is given by a BSDE as follows:

\(x_{i0}\) is the initial value of \(\mathcal {A}_{i}\). Here, \(A_{0}, B_{0}, C_{0}, A, B, D, \alpha, \sigma\) are scalar constants. Assume that \(\mathcal F_{t}\) is the augmentation of \(\sigma \{W_{0}(s),W_{i}(s),x_{i0};\ 0\leq s\leq t,\ 1\leq i\leq N\}\) by all the *P*-null sets of \(\mathcal {F}\), which is the full information accessible to the large population system up to time *t*. Let \(U_{i}\), \(i=0,1,2,\ldots,N\), be subsets of \(\mathbb {R}\). The admissible control strategy \(u_{0}\in \mathcal {U}_{0},u_{i}\in \mathcal {U}_{i}\), where

\(u=(u_{0},u_{1},\cdots,u_{N})\) denotes the set of control strategies of all (1+*N*) agents; \(u_{-0}=(u_{1},u_{2},\cdots,u_{N})\) the control strategies except \(\mathcal {A}_{0}\); \(u_{-i}=(u_{0},u_{1},\cdots,u_{i-1},u_{i+1},\cdots,u_{N})\) the control strategies except the \(i^{th}\) agent \(\mathcal {A}_{i},1\leq i\leq N\). The cost functional for \(\mathcal {A}_{0}\) is given by

where *Q*≥0,*R*>0,*H*≥0.

###
**Remark 2.1**

Unlike (Huang 2010; Nguyen and Huang 2012; Nourian and Caines 2013), the dynamics of the major agent in our work is a BSDE with a terminal condition specified a priori. The term \(H_{0}{x_{0}^{2}}(0)\) is thus introduced in (3) to represent some recursive evaluation. One of its practical implications is the initial hedging deposit in the pension fund industry. For the sake of simplicity, behaviors of the major agent (e.g., the government, as presented in the example above) affect the state of minor agents (which can be understood as numerous individual and negligible firms or producers). Moreover, the major and minor agents are further coupled via the state-average.

###
**Remark 2.2**

The cost functional (3) takes some linear combination weighted by \(Q_{0}\) and \(\tilde {Q}.\) Regarding this point, (3) enables us to represent some trade-off between the absolute quadratic cost \({x^{2}_{0}}(t)\) and the relative quadratic deviation \((x_{0}(t)-x^{(N)}(t))^{2}\). This functional combination can be interpreted as some balance between the minimization of its own cost and the benchmark index tracking to the minor agents’ average. Moreover, such tracking can be framed into the relative performance setting. Similar work can be found in Espinosa and Touzi (2015), where the relative performance is formulated by some convex combination \(\lambda \left (x_{i}(t)-x^{(N)}(t)\right)^{2}+(1-\lambda) {x^{2}_{0}}(t), \lambda \in [0,1]\).

We introduce the following assumption: (H1) \(\{x_{i0}\}_{i=1}^{N}\) are independent and identically distributed (i.i.d.) with \(\mathbb {E}x_{i0}=x\), \(\mathbb {E}|x_{i0}|^{2}<+\infty,\) and also independent of \(\{W_{0},W_{i},\ 1\leq i\leq N\}\).

It follows that (1) admits a unique solution for all \(u_{0} \in \mathcal {U}_{0}\), (see Pardoux and Peng (1990)). It is also well known that under **(H1)**, (2) admits a unique solution for all \(u_{i} \in \mathcal {U}_{i}, 1\leq i\leq N\). Now, we formulate the large population dynamic optimization problem.

**Problem (I).** Find a set of control strategies \(\bar {u}=(\bar {u}_{0},\bar {u}_{1},\cdots,\bar {u}_{N})\) which satisfies

*i*≤*N*.

## The limiting optimal control and NCE equation system

Combining the major’s state with forcing equation (BSDE with null terminal condition), we naturally have the following formulation of limit representation. To obtain the feedback control and the desired results, we assume \(U_{i}=\mathbb {R}\) for *i*=0,1,2,…,*N*.

\(x^{(N)}(\cdot)\) is approximated by \(\bar {x}(\cdot)\) as \(N\rightarrow +\infty\). Introduce the following auxiliary dynamics of major and minor players, still denoted by \(x_{0}(\cdot)\), \(x_{i}(\cdot)\), respectively:

Thus, we formulate the limiting LQG game **(II)** as follows.

**Problem (II).** For the \(i^{th}\) agent \(\mathcal {A}_{i}\), \(i=0,1,2,\cdots,N\), find \(\bar {u}_{i}\in \mathcal {U}_{i}\) satisfying

\(\bar {u}_{i}\) satisfying (9) is called an optimal control for (**II**).

###
**Remark 3.1**

Since \(\bar {x}(t)\) is regarded as the approximated process of the state average \(x^{(N)}(t)\), we replace \(x^{(N)}(t)\) by \(\bar {x}(t)\) in Problem **(II)**. In what follows, **(II)** is called the limiting problem of **(I)** as \(N\rightarrow +\infty\). As referred to at the beginning of this section, we are going to deal with this limiting problem first. Then, we will focus on the *ε*-Nash equilibrium between **(I)** and **(II)**, which is the biggest difference from the usual Nash equilibrium problem.

###
**Remark 3.2**

By noting that each minor player’s state \(x_{i}(t)\) in (2) depends on the major player’s state \(x_{0}(t)\) explicitly, we claim that the limiting process \(\bar {x}(t)\) also depends on \(x_{0}(t)\) explicitly. In fact, the third process *k*(*t*) is also meaningful: it is a stochastic process introduced in decoupling the Hamilton system, as will be shown below.

###
**Remark 3.3**

Since the state-average of minor players appears only in the cost functional of the major player, the first equation in (5) has the same form as (1), actually. However, for regularity, we still write it out.

To get the optimal control of Problem (**II**), we should obtain the optimal control of \(\mathcal {A}_{0}\) first. We have the following lemma.

###
**Lemma 3.1**

The optimal control of \(\mathcal {A}_{0}\) for Problem (**II**) is given by

\(p_{0}(\cdot)\) and the corresponding optimal trajectory \((\hat {x}_{0}(\cdot),\hat {z}_{0}(\cdot))\) satisfy the following Hamilton system

where \(\theta (\cdot),\bar {\theta }(\cdot)\in L^{2}_{\mathcal {F}^{w_{0}}}(0, T; \mathbb {R})\).

###
*Proof*

After obtaining the optimal control of major player \(\mathcal {A}_{0}\), in what follows we aim to get the optimal control \(\bar {u}_{i}\) of minor player \(\mathcal {A}_{i}\), with corresponding optimal trajectory \(\hat {x}_{i}(\cdot)\).

###
**Lemma 3.2**

\(p_{i}(\cdot)\) and the corresponding optimal trajectory \(\hat {x}_{i}(\cdot)\) satisfy BSDE

Here \(\theta _{0}(\cdot),\theta _{i}(\cdot)\in L^{2}_{\mathcal {F}^{i}}(0, T; \mathbb {R})\); \(\hat {x}_{0}(\cdot)\), and \(\bar {x}(\cdot)\) are given by (11). The proof is similar to that of Lemma 3.1 and omitted. For the coupled BFSDE (14) and (15), we are going to decouple it and try to derive the Nash certainty equivalence (NCE) system satisfied by the decentralized control policy. Then we have the following lemma.

###
**Lemma 3.3**

*P*(·) is the unique solution of the following Riccati equation

which is a 3×2 FBSDE.

###
*Proof*

\(P_{i}(\cdot)\), \(f_{i}(\cdot)\) are to be determined. Here, \(P_{i}(\cdot)\) is differentiable and \(f_{i}(\cdot)\) is an Itô process. The terminal condition \(p_{i}(T)= H\hat {x}_{i}(T)\) implies that

\(\theta _{i}(t)=\sigma P_{i}(t)\),

Noting that Riccati Eq. (18) is symmetric, it is well known that (18) admits a unique nonnegative bounded solution \(P_{i}(\cdot)\) (see (Ma and Yong 1999)). Further we get that \(P_{1}(\cdot)=P_{2}(\cdot)=\cdots =P_{N}(\cdot):=P(\cdot)\). Thus, (18) coincides with (16). Besides, for given \(\bar {x}(\cdot),\hat {x}_{0}(\cdot)\in L^{2}_{\mathcal F^{w_{0}}}(0,T; \mathbb {R})\), the linear BSDE (19) admits a unique solution \(f_{i}(\cdot)\in L^{2}_{\mathcal {F}^{w_{0}}}(0, T; \mathbb {R})\). We denote \(f_{i}(\cdot):=f(\cdot)\), \(i=1,2,\cdots,N\).

\(x_{i}(\cdot)\) is the state of minor player \(\mathcal {A}_{i}\). Plugging (20) into (2) implies the centralized closed-loop state:

*N*, and letting \(N\rightarrow +\infty\), we get

Then (17) is obtained, which completes the proof. □
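The nonnegative bounded solution \(P(\cdot)\) invoked in the proof can be approximated numerically. The sketch below is hedged: it assumes that the Riccati equation has the standard scalar LQ form \(\dot P+2AP-B^{2}R^{-1}P^{2}+Q=0\), \(P(T)=H\), which matches the coefficients \(A-B^{2}R^{-1}P(t)\) appearing in (17) but is an assumption about the exact form of (16). The function name and the numerical values are hypothetical.

```python
import math


def solve_riccati(A, B, R, Q, H, T, n_steps=1000):
    """Integrate the scalar Riccati ODE (assumed form)
        dP/dt + 2*A*P - (B**2 / R) * P**2 + Q = 0,   P(T) = H,
    backward from t = T with classical RK4.
    Returns a list with values[j] approximating P(j * T / n_steps)."""
    s = B * B / R

    def rhs(P):
        # derivative with respect to the time-to-go tau = T - t
        return 2.0 * A * P - s * P * P + Q

    h = T / n_steps
    P = H
    values = [P]                      # starts at tau = 0, i.e. t = T
    for _ in range(n_steps):
        k1 = rhs(P)
        k2 = rhs(P + 0.5 * h * k1)
        k3 = rhs(P + 0.5 * h * k2)
        k4 = rhs(P + h * k3)
        P += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        values.append(P)
    values.reverse()                  # reorder so that index 0 is t = 0
    return values


if __name__ == "__main__":
    P = solve_riccati(A=0.5, B=1.0, R=1.0, Q=1.0, H=0.0, T=1.0)
    print("P(0) =", P[0], " P(T) =", P[-1])
```

Since the equation does not depend on the index *i*, one such computation serves all minor agents, in line with \(P_{1}(\cdot)=\cdots =P_{N}(\cdot)\).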

###
**Remark 3.4**

The proof of Lemma 3.3 implies that *k*(·)=*f*(·). Thus, *k*(·), which is first introduced in (5), has some specific meaning that it is indeed a force function when decoupling (14) and (15).

To get the wellposedness of (17), we give the following assumption. (H2) \(B_{0}\neq 0,\ H_{0}>0,\ \tilde {Q}>0.\)

###
**Theorem 3.1**

Under (H2), FBSDE (17) is uniquely solvable.

###
*Proof*

Uniqueness.

It is easily checked that (16) admits a unique nonnegative bounded solution (see (Ma and Yong 1999)). For the sake of notational convenience, in (17) we denote by *b*(*ϕ*),*σ*(*ϕ*) the coefficients of drift and diffusion terms, respectively, for \(\phi =p_{0},\bar {x},q\); denote by *f*(*ψ*) the generator for \(\psi =\hat {x}_{0},p,k\).

which implies \(\mathbb {A}(t,\Delta)=\Big (A_{0}\hat {x}_{0}-{B_{0}^{2}}R_{0}^{-1}p_{0}+C_{0}\hat {z}_{0},\ -\left (A+D-B^{2}R^{-1}P(t)\right)p+Q_{0}(\hat {x}_{0}-\bar {x})-\left (Q-DP(t)\right)q,\ \left (-A+B^{2}R^{-1}P(t)\right)k+\left (Q-DP(t)\right)\bar {x}- \alpha P(t)\hat {x}_{0},\ -A_{0}p_{0}-Q_{0}(\hat {x}_{0}-\bar {x})-\tilde {Q}\hat {x}_{0}-\alpha p+\alpha P(t)q,\ \left (A+D-B^{2}R^{-1}P(t)\right)\bar {x}- B^{2}R^{-1}k+\alpha \hat {x}_{0},\ \left (A-B^{2}R^{-1}P(t)\right)q+B^{2}R^{-1}p,\ -C_{0}p_{0},\ 0,\ 0\Big).\)

*Δ* and \(\Delta ^{'}=(p'_{0},\bar {x}',q',\hat {x}'_{0},p',k',\hat {z}^{'}_{0},\bar {\theta }^{'},\theta ^{'}_{0})\) are two solutions of (17). Setting \(\hat {\Delta }=(\hat {p}_{0},\hat {\bar {x}},\hat {q},\hat {\hat {x}}_{0},\hat {p},\hat {k},\hat {\hat {z}}_{0},\hat {\bar {\theta }},\hat {\theta }_{0}) =(p_{0}-p'_{0},\bar {x}-\bar {x}',q-q',\hat {x}_{0}-\hat {x}'_{0},p-p',k-k',\hat {z}_{0}-\hat {z}'_{0},\bar {\theta }-\bar {\theta }^{'},\theta _{0}-\theta ^{'}_{0})\) and applying Itô’s formula to \(\langle \hat {p}_{0},\hat {\hat {x}}_{0}\rangle +\langle \hat {\bar {x}},\hat {p}\rangle +\langle \hat {q},\hat {k}\rangle \), we have

By **(H2)**, we get \(\beta _{1}>0\) and \(\beta _{2}>0\). Then \(\hat {p}_{0}(s)\equiv 0\), \(\hat {\hat {x}}_{0}(s)\equiv 0\). Further \(\hat {\hat {z}}_{0}(s)\equiv 0\). Applying the basic technique to \(\hat {\bar {x}}(s)\) and \(\hat {k}(s)\), and using Gronwall’s inequality, we obtain \(\hat {\bar {x}}(s)\equiv 0\), \(\hat {k}(s)\equiv 0\) and \(\hat {\theta }_{0}(s)\equiv 0\). Similarly, we have \(\hat {q}(s)\equiv 0\), \(\hat {p}(s)\equiv 0\), and \(\hat {\bar {\theta }}(s)\equiv 0\). Therefore, (17) admits at most one adapted solution.

*Existence*. In order to prove the existence of the solution, we first consider the following family of FBSDEs parameterized by \(\gamma \in [0,1]\):

where \((\varphi ^{1},\varphi ^{2},\varphi ^{3},\lambda,\kappa ^{1},\kappa ^{2},\kappa ^{3})\in L^{2}_{\mathcal F^{w_{0}}}(0,T;\mathbb {R}^{7})\), \(a\in L^{2}(\Omega,\mathcal {F}^{w_{0}}_{0},P;\mathbb {R})\). Clearly, when *γ*=1, the existence of (23) implies that of (17). When *γ*=0, it is easy to obtain that (23) admits a unique solution (actually, the 2-dim FBSDE is very similar to the Hamiltonian system of (Lim and Zhou 2001)).

\(\gamma _{0}\in [0,1)\) there exists a unique tuple \((p_{0}^{\gamma _{0}},\bar {x}^{\gamma _{0}},q^{\gamma _{0}},\hat {x}_{0}^{\gamma _{0}},p^{\gamma _{0}},k^{\gamma _{0}}, \hat {z}_{0}^{\gamma _{0}},\bar {\theta }^{\gamma _{0}},\theta _{0}^{\gamma _{0}})\) of (23), then for each

\(P_{0}\) and \(P^{\prime }_{0}\) are solutions of SDEs of Itô type, applying the usual technique, the estimate for the difference \(\hat {P}_{0}=P_{0}-P'_{0}\) is obtained by

*r*≤*T*. In the same way, for the difference of the solutions \((\hat {\hat {X}}_{0},\hat {\hat {Z}}_{0})=(\hat {X}_{0}-\hat {X}^{'}_{0},\hat {Z}_{0}-\hat {Z}^{'}_{0}), (\hat {P},\hat {\bar {\Theta }})=(P-P',\bar {\Theta }-\bar {\Theta }')\) and \((\hat {K},\hat {\Theta }_{0})=(K-K',\Theta _{0}-\Theta ^{'}_{0})\), applying the usual technique to the BSDEs, we have

for ∀ 0≤*r*≤*T*. Here the constant \(C_{1}\) depends on the coefficients of (1)–(2), \(P(\cdot)\), \(\beta _{1}\), \(\beta _{2}\), and \(\mathcal {T}\). Note that \(\gamma _{0}H_{0}+(1-\gamma _{0})\geq \mu\), where \(\mu =\min (1,H_{0})>0\).

**(H2)**, combining (25), (27)–(28), (30)–(31), and applying Gronwall’s inequality, we obtain

\(C_{2}\) depends on \(C_{1}\), *μ*, and *T*. Choosing \(\delta _{0}=\frac {1}{2C_{2}}\), we get that for each fixed \(\delta \in [0,\delta _{0}]\), the mapping \(I_{\gamma _{0}+\delta }\) is a contraction in the sense that

\(\gamma =\gamma _{0}+\delta\). Since \(\delta _{0}\) depends only on \((C_{1},\mu,T)\), we can repeat this process *N* times with \(1\leq N\delta _{0}<1+\delta _{0}\).

Then it follows that, in particular, as *γ*=1 corresponding to \({\varphi ^{i}_{t}}\equiv 0,\lambda _{t}\equiv 0,{\kappa ^{i}_{t}}\equiv 0,a=0\ (i=1,2,3)\), (23) admits a unique solution, which implies the wellposedness of (17) (also (11)). The proof is complete. □
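The bookkeeping of the continuation argument (moving from *γ*=0 to *γ*=1 in steps of length at most \(\delta _{0}\)) can be made concrete with a short sketch. The helper name and the sample values of \(\delta _{0}\) are hypothetical; the proof only fixes \(\delta _{0}=\frac {1}{2C_{2}}\) and does not compute it.

```python
import math


def continuation_grid(delta0):
    """Return (n, grid): the smallest number n of continuation steps with
    1 <= n * delta0 < 1 + delta0, and the parameter values
    gamma_j = min(1, j * delta0), j = 0, ..., n, visited along the way."""
    if delta0 <= 0.0:
        raise ValueError("delta0 must be positive")
    n = math.ceil(1.0 / delta0)
    grid = [min(1.0, j * delta0) for j in range(n + 1)]
    return n, grid


if __name__ == "__main__":
    n, grid = continuation_grid(0.3)
    print(n, grid)
```

Because \(\delta _{0}\) does not depend on \(\gamma _{0}\), the same step length is admissible at every grid point, which is why finitely many contraction steps suffice.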

###
**Remark 3.5**

In what follows, (17) is called the Nash certainty equivalence (NCE) equation system (see (Huang 2010; Huang et al. 2007; 2012; Huang et al. 2006)). By Theorem 3.1, we know that there exists a unique 9-tuple solution \((p_{0},\bar {x},q,\hat {x}_{0},p,k,\hat {z}_{0},\bar {\theta },\theta _{0})\) which can be obtained off-line. Thus, it is equivalent to the fixed-point principle. To the best of our knowledge, this is the first paper to focus on the well-posedness of coupled FBSDE in large population problems.

## *ε*-Nash equilibrium analysis

In the above sections, we obtained the optimal control \(\bar {u}_{i}(\cdot), 0\le i\le N\) of Problem (**II**) through the consistency condition system. Now, we turn to verify the *ε*-Nash equilibrium of Problem (**I**). To start, we first present the definition of *ε*-Nash equilibrium.

###
**Definition 4.1**

A set of control strategies for (*N*+1) agents is said to satisfy an *ε*-Nash equilibrium with respect to the costs \(J_{k}\), \(0\leq k\leq N\), if there exists *ε*≥0 such that for any fixed \(0\leq i\leq N\), we have

when any alternative control \(u^{\prime }_{i}\in \mathcal {U}_{i}\) is applied by \(\mathcal {A}_{i}\).

If *ε*=0, then Definition 4.1 is reduced to the usual Nash equilibrium. Now, we state the main result of this paper and its proof will be given later.

###
**Theorem 4.1**

*ε*-Nash equilibrium of (**I**). Here, \(\tilde {u}_{0}\) is given by

\(p_{0}(\cdot)\) is obtained off-line by (17); while for \(1\le i\le N, \tilde {u}_{i}\) is

where \(\tilde {x}_{i}(\cdot)\), the state trajectory for \(\mathcal {A}_{i}\), satisfies (21).

*h*_{1}(·), *h*_{2}(·)) which is determined by

\(h_{3}(\cdot)\). The cost functionals for **(I)** and **(II)** are given by

where \((\bar {x}(t)_{\hat {x}_{0}},k(t)_{\hat {x}_{0}})\) satisfies (17). We have

###
**Lemma 4.1**

###
*Proof*

In addition, by (10) and (33), we have \(\tilde {u}_{0}(\cdot)=\hat {u}_{0}(\cdot).\) Thus, (42) is obtained. □

For minor agents, we have

###
**Lemma 4.2**

###
*Proof*

For ∀ 1≤*i*≤*N*, applying Gronwall’s inequality, we get (44) from (41), (37) and (39). (45) follows from (44) and (34), obviously. Using the same technique as (43) and noting \(\sup \limits _{0\leq t\leq T}\mathbb {E}\big |\hat {x}_{i}(t)-\bar {x}(t)_{\hat {x}_{0}}\big |^{2}<+\infty,\sup \limits _{0\leq t\leq T}\mathbb {E}\big |\bar {u}_{i}(t)\big |^{2}<+\infty,\sup \limits _{0\leq t\leq T}\mathbb {E}\big |\hat {x}_{i}(t)\big |^{2}<+\infty \), we obtain (46). □

*ε*-Nash equilibrium for (**I**). Consider a perturbed control \(u_{0} \in \mathcal {U}_{0}\) for \(\mathcal {A}_{0}\) and introduce the dynamics

\(l_{0}\) satisfies

*i*, 1≤*i*≤*N*, consider a perturbed control \(u_{i} \in \mathcal {U}_{i}\) for \(\mathcal {A}_{i}\), whereas the major and other minor players keep the control \(\tilde {u}_{j},0\leq j\leq N,j\neq i.\) Introduce the dynamics

*j*≤*N*, *j*≠*i*,

where \(m^{(N)}(t)=\frac {1}{N}\sum \limits _{k=1}^{N}m_{k}(t)\); \(k(t)_{\tilde {x}_{0}}\) satisfies (17) due to \(\tilde {x}_{0}(\cdot)=\hat {x}_{0}(\cdot)\).

*ε*-Nash equilibrium with respect to cost \(J_{j}\), it holds that

\(L^{2}\)-bounded. Then we obtain the boundedness of \(\bar {J}_{j}(\bar {u}_{j})\), i.e.,

where \(C_{3}\) is a positive constant independent of *N*. Then we have the following proposition.

###
**Proposition 4.1**

\(\sup \limits _{0\leq t\leq T}\!\mathbb {E}\big |l_{0}(t)\big |^{2}\), \(\sup \limits _{1\le k\le N}\!\!\left [\sup \limits _{0\leq t\leq T}\mathbb {E}\big |l_{k}(t)\big |^{2} \right ], \sup \limits _{1\le k\le N}\!\!\left [\sup \limits _{0\leq t\leq T}\mathbb {E}\big |m_{k}(t)\big |^{2} \right ]\) are bounded.

###
*Proof*

\(C_{4}\) and \(C_{5}\) are both positive constants. Since \(\sup \limits _{0\leq t\leq T}\mathbb {E}\big |l_{0}(t)\big |^{2}\) is bounded, we get the boundedness of \(\sup \limits _{0\leq t\leq T}\mathbb {E}\big |k(t)_{l_{0}}\big |^{2}\) by (49). It follows from (52) that \(\mathbb {E}|u_{i}(\cdot)|^{2}\) is bounded. Besides, the optimal controls \(\tilde {u}_{k}(\cdot),k\neq i\) are \(L^{2}\)-bounded. Then by Gronwall’s inequality, it follows that

Thus, for any \(1\leq k\leq N, \sup \limits _{0\leq t\leq T}\mathbb {E}|l_{k}(t)|^{2}\) and \(\sup \limits _{0\leq t\leq T}\mathbb {E}|m_{k}(t)|^{2}\) are bounded. Hence the result. □

\(u_{0}\) for **(II)** is as follows

Then we have

###
**Lemma 4.3**

###
*Proof*

From (47) and (53), by the existence and uniqueness of BSDE, for the same perturbed control \(u_{0}(\cdot)\), we have \((l^{'}_{0},q'_{0})=(l_{0},q_{0})\). Further, noting FBSDE (49) and (55), we get \((k(t)_{l^{'}_{0}},\bar {x}(t)_{l'_{0}})=(k(t)_{l_{0}},\bar {x}(t)_{l_{0}})\).

and applying Gronwall’s inequality, we get (56). Using the same technique as Lemma 4.1 and noting \(\sup \limits _{0\leq t\leq T}\mathbb {E}\big |l^{'}_{0}(t)-\bar {x}(t)_{l^{'}_{0}}\big |^{2}<+\infty \), we obtain (57). □

\(u_{i}\) for **(II)**

where \((\bar {x}(t)_{\hat {x}_{0}},k(t)_{\hat {x}_{0}})\) satisfies (17).

(**I**) and (**II**), we need to introduce some intermediate states as

*j*≤*N*, *j*≠*i*,

where \(\check {m}^{(N-1)}(t)=\frac {1}{N-1}\sum \limits _{j=1,j\neq i}^{N}\check {m}_{j}(t)\).

Then we have the following proposition.

###
**Proposition 4.2**

###
*Proof*

**(H1)** and the \(L^{2}\)-boundedness of controls \(u_{i}(\cdot)\) and \(\tilde {u}_{j}(\cdot),j\neq i.\) From (63) and (17), noting \((\bar {x}(t)_{\tilde {x}_{0}},k(t)_{\tilde {x}_{0}},\tilde {x}_{0})=(\bar {x}(t)_{\hat {x}_{0}},k(t)_{\hat {x}_{0}},\hat {x}_{0})\), we get

Therefore (66) is obtained. □

Based on Proposition 4.2, we obtain more direct estimates to prove Theorem 4.1.

###
**Lemma 4.4**

*i*, 1≤*i*≤*N*, we have

###
*Proof*

###
*Proof of Theorem*

*ε*-Nash equilibrium for \(\mathcal {A}_{0}\) and \(\mathcal {A}_{i},1\leq i\leq N\). Combining (42) and (57), we have

Thus, Theorem 4.1 follows by taking \(\epsilon =O\left (\frac {1}{\sqrt {N}}\right)\). □
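The rate \(O(1/\sqrt {N})\) reflects how fast the state-average of *N* weakly coupled agents approaches its mean-field limit. The following Monte Carlo sketch illustrates this on a toy system that is an assumption made for illustration only: the agents follow \(dx_{i}=(a\,x_{i}+d\,x^{(N)})dt+\sigma dW_{i}\) with zero control, no major agent and a common deterministic initial state, and all names and parameter values are hypothetical.

```python
import math
import random


def rms_average_gap(n_agents, n_paths=500, a=-0.5, d=0.3, sigma=1.0,
                    x_init=1.0, horizon=1.0, n_steps=20, seed=0):
    """Estimate sqrt(E|x^(N)(T) - x_bar(T)|^2) by Euler-Maruyama simulation of
        dx_i = (a * x_i + d * x^(N)) dt + sigma dW_i,  x_i(0) = x_init,
    where x_bar solves the limit ODE x_bar' = (a + d) * x_bar.
    The dynamics are a toy assumption, not the system of the paper."""
    rng = random.Random(seed)
    dt = horizon / n_steps
    sqrt_dt = math.sqrt(dt)

    # mean-field limit, discretized with the same Euler scheme
    x_bar = x_init
    for _ in range(n_steps):
        x_bar += (a + d) * x_bar * dt

    total = 0.0
    for _ in range(n_paths):
        xs = [x_init] * n_agents
        for _ in range(n_steps):
            avg = sum(xs) / n_agents          # state-average x^(N)
            xs = [x + (a * x + d * avg) * dt
                  + sigma * sqrt_dt * rng.gauss(0.0, 1.0) for x in xs]
        total += (sum(xs) / n_agents - x_bar) ** 2
    return math.sqrt(total / n_paths)


if __name__ == "__main__":
    for n in (25, 100):
        print(n, rms_average_gap(n))
```

In this toy model, quadrupling the number of agents should roughly halve the root-mean-square gap, up to Monte Carlo error, which is the scaling behind \(\epsilon =O(1/\sqrt {N})\).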

## Conclusion and future work

In this paper, we have studied the mean-field linear-quadratic (LQ) games with major and minor agents in a backward-forward setup. The main features of our work, unlike other mean-field game literature, are as follows. (1) The major and minor agents are endowed with different objective patterns: the major agent (say, the local government) aims to fulfill some prescribed future target, thus it is facing a “backward” LQ problem by minimizing the initial endowment. On the other hand, the minor agents (say, the individual producers or firms) are still facing a family of “forward” LQ problems, but their state-average is affected by the major agent’s state. (2) Accordingly, the state dynamics of the major agent satisfies some backward stochastic differential equation (BSDE) while the minor agents are modeled by some (forward) stochastic differential equations (SDEs). (3) To derive the decentralized strategies, the mean-field game is formulated in the backward-forward and major-minor framework. An auxiliary mean-field SDE and a mixed backward-forward stochastic differential equation (BFSDE) are thus introduced and analyzed. An essential feature of the BFSDE, compared to the forward-backward SDE (FBSDE), is that there is no feasible decoupling structure via the traditional Riccati equations. This feature brings some technical difficulties to our analysis and a new structure to our strategies (specifically, the major’s strategy is open-loop, whereas the minors’ are still closed-loop). (4) In contrast to other mean-field games, the consistency condition is not directly analyzed via fixed-point analysis and contraction mapping. Instead, it is connected to the well-posedness of the mixed BFSDE system and is obtained under some weak monotonicity conditions. The decentralized strategies are also verified to satisfy the *ε*-Nash equilibrium property. For this purpose, some estimates of BFSDE are applied.

In future work, one possible direction is to let the state-average enter the dynamics of the major player, which may make the *ε*-Nash equilibrium property considerably harder to prove. The well-posedness of the corresponding 3×2 mixed FBSDE system is also worth studying. Another direction is to model the dynamics of the minor players by BSDEs. In this case, the analysis of the consistency condition may be more complicated and further technical difficulties may arise. Numerical computation and applications in finance will also be investigated.

## Declarations

### Acknowledgments

J. Huang acknowledges partial financial support from RGC Grants 502412, 15300514 and G-YL04. Z. Wu acknowledges the Natural Science Foundation of China (61573217), the 111 Project (B12023), the National High-level Personnel of Special Support Program and the Chang Jiang Scholar Program of the Chinese Education Ministry.

### Authors’ contributions

JH carried out the problem formulation and the analysis of the mean-field backward-forward system, and participated in the arguments for the approximate Nash equilibrium and the fixed-point analysis of the certainty principle. SW carried out the derivation of the consistency condition and its well-posedness analysis, and participated in drafting the manuscript. ZW carried out the related maximum principle and decentralized optimality analysis, and participated in the analysis of the consistency condition and its connection to the forward-backward stochastic system. All authors read and approved the final manuscript.

### Competing interests

The authors declare that they have no competing interests.

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

## References

- Andersson, D, Djehiche, B: A maximum principle for SDEs of mean-field type. Appl. Math. Optim. 63, 341–356 (2011).
- Antonelli, F: Backward-forward stochastic differential equations. Ann. Appl. Probab. 3, 777–793 (1993).
- Bardi, M: Explicit solutions of some linear-quadratic mean field games. Netw. Heterog. Media. 7, 243–261 (2012).
- Bensoussan, A, Sung, K, Yam, S, Yung, S: Linear-quadratic mean-field games. J. Optim. Theory Appl. 169, 496–529 (2016).
- Bismut, J: An introductory approach to duality in optimal stochastic control. SIAM Rev. 20, 62–78 (1978).
- Buckdahn, R, Cardaliaguet, P, Quincampoix, M: Some recent aspects of differential game theory. Dyn. Games Appl. 1, 74–114 (2010).
- Buckdahn, R, Djehiche, B, Li, J: A general stochastic maximum principle for SDEs of mean-field type. Appl. Math. Optim. 64, 197–216 (2011).
- Buckdahn, R, Djehiche, B, Li, J, Peng, S: Mean-field backward stochastic differential equations: a limit approach. Ann. Probab. 37, 1524–1565 (2009a).
- Buckdahn, R, Li, J, Peng, S: Mean-field backward stochastic differential equations and related partial differential equations. Stoch. Process. Appl. 119, 3133–3154 (2009b).
- Buckdahn, R, Li, J, Peng, S: Nonlinear stochastic differential games involving a major player and a large number of collectively acting minor agents. SIAM J. Control Optim. 52, 451–492 (2014).
- Carmona, R, Delarue, F: Probabilistic analysis of mean-field games. SIAM J. Control Optim. 51, 2705–2734 (2013).
- Cvitanić, J, Ma, J: Hedging options for a large investor and forward-backward SDE’s. Ann. Appl. Probab. 6, 370–398 (1996).
- Duffie, D, Epstein, L: Stochastic differential utility. Econometrica. 60, 353–394 (1992).
- El Karoui, N, Peng, S, Quenez, M: Backward stochastic differential equations in finance. Math. Finance. 7, 1–71 (1997).
- Espinosa, G, Touzi, N: Optimal investment under relative performance concerns. Math. Finance. 25, 221–257 (2015).
- Guéant, O, Lasry, J-M, Lions, P-L: Mean field games and applications. Paris-Princeton Lectures on Mathematical Finance. Springer, Berlin (2010).
- Huang, M: Large-population LQG games involving a major player: the Nash certainty equivalence principle. SIAM J. Control Optim. 48, 3318–3353 (2010).
- Huang, M, Caines, P, Malhamé, R: Large-population cost-coupled LQG problems with non-uniform agents: individual-mass behavior and decentralized *ε*-Nash equilibria. IEEE Trans. Autom. Control. 52, 1560–1571 (2007).
- Huang, M, Caines, P, Malhamé, R: Social optima in mean field LQG control: centralized and decentralized strategies. IEEE Trans. Autom. Control. 57, 1736–1751 (2012).
- Huang, M, Malhamé, R, Caines, P: Large population stochastic dynamic games: closed-loop McKean-Vlasov systems and the Nash certainty equivalence principle. Commun. Inf. Syst. 6, 221–251 (2006).
- Hu, Y, Peng, S: Solution of forward-backward stochastic differential equations. Proba. Theory Rel. Fields. 103, 273–283 (1995).
- Lasry, J-M, Lions, P-L: Mean field games. Japan J. Math. 2, 229–260 (2007).
- Li, T, Zhang, J: Asymptotically optimal decentralized control for large population stochastic multiagent systems. IEEE Trans. Autom. Control. 53, 1643–1660 (2008).
- Lim, E, Zhou, XY: Linear-quadratic control of backward stochastic differential equations. SIAM J. Control Optim. 40, 450–474 (2001).
- Ma, J, Protter, P, Yong, J: Solving forward-backward stochastic differential equations explicitly-a four step scheme. Proba. Theory Rel. Fields. 98, 339–359 (1994).
- Ma, J, Wu, Z, Zhang, D, Zhang, J: On well-posedness of forward-backward SDEs-a unified approach. Ann. Appl. Probab. 25, 2168–2214 (2015).
- Ma, J, Yong, J: Forward-Backward Stochastic Differential Equations and Their Applications. Springer-Verlag, Berlin Heidelberg (1999).
- Nguyen, S, Huang, M: Linear-quadratic-Gaussian mixed games with continuum-parametrized minor players. SIAM J. Control Optim. 50, 2907–2937 (2012).
- Nourian, M, Caines, P: *ε*-Nash mean field game theory for nonlinear stochastic dynamical systems with major and minor agents. SIAM J. Control Optim. 51, 3302–3331 (2013).
- Pardoux, E, Peng, S: Adapted solution of backward stochastic equation. Syst. Control Lett. 14, 55–61 (1990).
- Peng, S, Wu, Z: Fully coupled forward-backward stochastic differential equations and applications to optimal control. SIAM J. Control Optim. 37, 825–843 (1999).
- Wang, G, Wu, Z: The maximum principles for stochastic recursive optimal control problems under partial information. IEEE Trans. Autom. Control. 54, 1230–1242 (2009).
- Wu, Z: A general maximum principle for optimal control of forward-backward stochastic systems. Automatica. 49, 1473–1480 (2013).
- Yong, J: Finding adapted solutions of forward-backward stochastic differential equations: method of continuation. Proba. Theory Rel. Fields. 107, 537–572 (1997).
- Yong, J: Optimality variational principle for controlled forward-backward stochastic differential equations with mixed initial-terminal conditions. SIAM J. Control Optim. 48, 4119–4156 (2010).
- Yong, J, Zhou, XY: Stochastic Controls: Hamiltonian Systems and HJB Equations. Springer-Verlag, New York (1999).
- Yu, Z: Linear-quadratic optimal control and nonzero-sum differential game of forward-backward stochastic system. Asian J. Control. 14, 173–185 (2012).