Mixed Deterministic and Random Optimal Control of Linear Stochastic Systems with Quadratic Costs

In this paper, we consider the mixed optimal control of a linear stochastic system with a quadratic cost functional and two controllers: one, called the deterministic controller, can choose only deterministic functions of time, while the other, called the random controller, can choose adapted random processes. The optimal control is shown to exist under suitable assumptions and is characterized via a system of fully coupled forward-backward stochastic differential equations (FBSDEs) of mean-field type. We solve the FBSDEs via solutions of two (but decoupled) Riccati equations, and give the respective optimal feedback laws for both the deterministic and the random controllers, using solutions of both Riccati equations. The optimal state satisfies a linear stochastic differential equation (SDE) of mean-field type. Both the singular and the infinite time-horizon cases are also addressed.


Introduction and formulation of the problem
Let $T > 0$ be given and fixed. Denote by $\mathbb{S}^n$ the set of all $n \times n$ symmetric matrices, and by $\mathbb{S}^n_+$ its subset of all $n \times n$ nonnegative definite matrices. For an $n \times n$ matrix $S$, we write $S \ge 0$ if $S \in \mathbb{S}^n_+$, and $S > 0$ if $S$ is positive definite. For a matrix-valued function $R : [0,T] \to \mathbb{S}^n$, we write $R \gg 0$ if $R(t)$ is uniformly positive, i.e., there is a positive real number $\alpha$ such that $R(t) \ge \alpha I$ for any $t \in [0,T]$.
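The condition $R \gg 0$ can be probed numerically on a time grid. The following is a minimal sketch, not part of the paper: the function name `uniform_positivity_margin` and the example matrix `R_example` are hypothetical, and a positive margin on a finite grid is only numerical evidence of uniform positivity, not a proof.

```python
import numpy as np

def uniform_positivity_margin(R, T, num_points=201):
    """Smallest eigenvalue of the symmetric matrix R(t) over a uniform grid on [0, T].

    A strictly positive return value alpha suggests R(t) >= alpha * I on the grid.
    """
    grid = np.linspace(0.0, T, num_points)
    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order
    return min(np.linalg.eigvalsh(R(t))[0] for t in grid)

# Hypothetical example (an assumption for illustration): R(t) = [[2 + t, 1], [1, 2]]
def R_example(t):
    return np.array([[2.0 + t, 1.0], [1.0, 2.0]])

alpha = uniform_positivity_margin(R_example, T=1.0)
```

By contrast, a function such as $R(t) = tI$ is positive definite for every $t > 0$ but not uniformly positive on $[0, T]$, and the margin computed above is then zero.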
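In this paper, we consider the following linear controlled stochastic differential equation
$$dX_s = \Big(A(s) X_s + B_1(s) u_{1,s} + B_2(s) u_{2,s}\Big)\,ds + \sum_{j=1}^d \Big(C_j(s) X_s + D_{1j}(s) u_{1,s} + D_{2j}(s) u_{2,s}\Big)\,dW^j_s, \qquad X_0 = x_0,$$
with the following quadratic cost functional
$$J(u) = E\Big[\int_0^T \Big(\langle Q(s) X_s, X_s\rangle + \langle R_1(s) u_{1,s}, u_{1,s}\rangle + \langle R_2(s) u_{2,s}, u_{2,s}\rangle\Big)\,ds + \langle G X_T, X_T\rangle\Big].$$
Here, $(W_t)_{0\le t\le T} = (W^1_t, \dots, W^d_t)_{0\le t\le T}$ is a $d$-dimensional Brownian motion on a probability space $(\Omega, \mathcal{F}, P)$. Denote by $(\mathcal{F}_t)$ the augmented filtration generated by $(W_t)$.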
$A$, $B_1$, $B_2$, $C_j$, $D_{1j}$ and $D_{2j}$ are all bounded Borel measurable functions from $[0,T]$ to $\mathbb{R}^{n\times n}$, $\mathbb{R}^{n\times l_1}$, $\mathbb{R}^{n\times l_2}$, $\mathbb{R}^{n\times n}$, $\mathbb{R}^{n\times l_1}$, and $\mathbb{R}^{n\times l_2}$, respectively. $Q$, $R_1$, and $R_2$ are nonnegative definite, and they are all essentially bounded measurable functions on $[0,T]$ with values in $\mathbb{S}^n$, $\mathbb{S}^{l_1}$, and $\mathbb{S}^{l_2}$, respectively. In the first four sections, $R_1$ and $R_2$ are further assumed to be positive definite. $G \in \mathbb{S}^n$ is positive semi-definite. The process $u \in L^2_{\mathcal{F}}(0,T;\mathbb{R}^l)$ is the control, and $X \in L^2_{\mathcal{F}}(\Omega; C(0,T;\mathbb{R}^n))$ is the corresponding state process with initial value $x_0 \in \mathbb{R}^n$.
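To make the roles of the two controllers concrete, here is a minimal Euler-Maruyama sketch for a scalar system ($n = d = l_1 = l_2 = 1$). It assumes the state equation has drift $A X + B_1 u_1 + B_2 u_2$ and diffusion $C X + D_1 u_1 + D_2 u_2$, and that the cost is $E[\int_0^T (Q X^2 + R_1 u_1^2 + R_2 u_2^2)\,ds + G X_T^2]$. The function name `simulate_cost`, all coefficient values, and the linear feedback `u2 = k2 * X` are hypothetical choices for illustration; the feedback used here is not the optimal law derived in the paper.

```python
import numpy as np

def simulate_cost(u1, k2, x0=1.0, T=1.0, n_steps=200, n_paths=5000, seed=0,
                  a=-0.5, b1=1.0, b2=1.0, c=0.2, d1=0.1, d2=0.1,
                  q=1.0, r1=1.0, r2=1.0, g=1.0):
    """Monte Carlo estimate of the quadratic cost for a scalar controlled SDE.

    u1 : function t -> float, deterministic control (same value on every path)
    k2 : float, gain of an illustrative adapted control u2 = k2 * X
    """
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    x = np.full(n_paths, float(x0))
    running = np.zeros(n_paths)
    for i in range(n_steps):
        v1 = u1(i * dt)   # deterministic controller: a number, not a random variable
        v2 = k2 * x       # random controller: depends on the current state of each path
        running += (q * x**2 + r1 * v1**2 + r2 * v2**2) * dt
        dw = rng.normal(0.0, np.sqrt(dt), size=n_paths)
        x = x + (a * x + b1 * v1 + b2 * v2) * dt + (c * x + d1 * v1 + d2 * v2) * dw
    return float(np.mean(running + g * x**2))

cost_no_control = simulate_cost(lambda t: 0.0, k2=0.0)
cost_feedback = simulate_cost(lambda t: 0.0, k2=-0.5)
```

The point of the sketch is the information structure: `v1` is the same on all sample paths, whereas `v2` may differ from path to path.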
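We will use the following notation:
$\mathbb{S}^l$: the set of symmetric $l\times l$ real matrices.
$L^2_{\mathcal{G}}(\Omega;\mathbb{R}^l)$: the set of random variables $\xi : (\Omega, \mathcal{G}) \to (\mathbb{R}^l, \mathcal{B}(\mathbb{R}^l))$ with $E[|\xi|^2] < +\infty$.
$L^\infty_{\mathcal{G}}(\Omega;\mathbb{R}^l)$: the set of essentially bounded random variables $\xi : (\Omega, \mathcal{G}) \to (\mathbb{R}^l, \mathcal{B}(\mathbb{R}^l))$.
$L^2_{\mathcal{F}}(t,T;\mathbb{R}^l)$: the set of $\{\mathcal{F}_s\}_{s\in[t,T]}$-adapted processes $f$ with $E\int_t^T |f_s|^2\,ds < +\infty$.
$L^\infty_{\mathcal{F}}(t,T;\mathbb{R}^l)$: the set of essentially bounded $\{\mathcal{F}_s\}_{s\in[t,T]}$-adapted processes.
$L^2_{\mathcal{F}}(\Omega; C(t,T;\mathbb{R}^l))$: the set of continuous $\{\mathcal{F}_s\}_{s\in[t,T]}$-adapted processes $f$ with $E[\sup_{s\in[t,T]} |f_s|^2] < +\infty$.
We will often use vectors and matrices in this paper, where all vectors are column vectors.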
For a matrix $M$, $M'$ is its transpose, and $|M| = \sqrt{\sum_{i,j} m_{ij}^2}$ is the Frobenius norm. Define and for a matrix $K$ with suitable dimensions and (t,

If both $u_1$ and $u_2$ are adapted to the natural filtration of the underlying Brownian motion, it is well known that the optimal control exists and can be synthesized into the following feedback of the state: Here $K$ solves the following Riccati equation: See Wonham [10], Haussmann [5], Bismut [2,3], Peng [7], and Tang [8] for more details on the general Riccati equation arising from linear quadratic optimal stochastic control with both state- and control-dependent noises and deterministic coefficients.

In this paper, we consider the following situation: there are two controllers, called the deterministic controller and the random controller. The former can impose only a deterministic action $u_1$, i.e., $u_1 \in \mathcal{U}^1_{ad} = L^2(0,T;\mathbb{R}^{l_1})$, and the latter can impose a random action $u_2$, more precisely $u_2 \in \mathcal{U}^2_{ad} = L^2_{\mathcal{F}}(0,T;\mathbb{R}^{l_2})$. First, we apply the conventional variational technique to characterize the optimal control via a system of fully coupled forward-backward stochastic differential equations (FBSDEs) of mean-field type. Then we solve the FBSDEs with two (but decoupled) Riccati equations, and derive the respective optimal feedback laws for both the deterministic and the random controllers, using solutions of both Riccati equations. Existence and uniqueness are established for both Riccati equations. The optimal state is shown to satisfy a linear stochastic differential equation (SDE) of mean-field type. Both the singular and the infinite time-horizon cases are also addressed.
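For the classical case recalled above, in which every control is adapted, the Riccati equation can be integrated backward from its terminal condition. The sketch below is an illustration under stated assumptions, not the paper's equation: it uses the standard scalar form from the stochastic LQ literature with a single combined control, $\dot K + (2a + c^2) K + q - (b + cd)^2 K^2 / (r + d^2 K) = 0$, $K(T) = g$, with feedback gain $-(b + cd) K / (r + d^2 K)$. The function names and all parameter values are hypothetical.

```python
import numpy as np

def riccati_rhs(K, a, b, c, d, q, r):
    """Right-hand side of dK/dtau in the time-to-go variable tau = T - t."""
    return (2 * a + c**2) * K + q - ((b + c * d) * K) ** 2 / (r + d**2 * K)

def solve_riccati(a, b, c, d, q, r, g, T, n_steps=2000):
    """Classical RK4 integration backward from K(T) = g; returns K on a uniform grid."""
    h = T / n_steps
    K = np.empty(n_steps + 1)
    K[-1] = g
    args = (a, b, c, d, q, r)
    for i in range(n_steps, 0, -1):
        k = K[i]
        s1 = riccati_rhs(k, *args)
        s2 = riccati_rhs(k + 0.5 * h * s1, *args)
        s3 = riccati_rhs(k + 0.5 * h * s2, *args)
        s4 = riccati_rhs(k + h * s3, *args)
        K[i - 1] = k + h * (s1 + 2 * s2 + 2 * s3 + s4) / 6.0
    return K

def feedback_gain(K, b, c, d, r):
    """Gain k(t) of the linear state feedback u = k(t) * X for the scalar problem."""
    return -(b + c * d) * K / (r + d**2 * K)

K = solve_riccati(a=0.5, b=1.0, c=0.2, d=0.1, q=1.0, r=1.0, g=1.0, T=1.0)
gain = feedback_gain(K, b=1.0, c=0.2, d=0.1, r=1.0)
```

When $b = d = 0$ the equation is linear, which gives a closed-form value to compare against; this is a convenient sanity check for any implementation.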
The rest of the paper is organized as follows. In Section 2, we give the necessary and sufficient condition for the mixed optimal controls via a system of FBSDEs. In Section 3, we synthesize the mixed optimal control into linear feedback forms of the optimal state: we derive two (but decoupled) Riccati equations, study their solvability, and state our main result. In Section 4, we address some particular cases. In Section 5, we discuss singular linear quadratic control cases. Finally, in Section 6, we discuss the infinite time-horizon case.

Necessary and sufficient condition of mixed optimal controls
Let $u^*$ be a fixed control and $X^*$ be the corresponding state process. For any $t \in [0,T)$, define the processes $(p(\cdot), (k_j(\cdot))_{j=1,\dots,d}) \in L^2_{\mathcal{F}}(0,T;\mathbb{R}^n) \times (L^2_{\mathcal{F}}(0,T;\mathbb{R}^n))^d$ as the unique solution to
The following necessary and sufficient condition can be proved in a straightforward way.
Theorem 2.1 Let $u^*$ be the optimal control, and $X^*$ be the corresponding solution. Then there exists a pair of adjoint processes $(p, k)$ satisfying the BSDE (2.1). Moreover, the following optimality conditions hold true:
and they are also sufficient for $u^*$ to be optimal.
Proof. Using a convex perturbation, we obtain in a straightforward way the equivalent condition for the optimality of the control $u^*$:
The sufficiency can be proved in a standard way.
Synthesis of the mixed optimal control

Ansatz
We expect a feedback of the following form
Applying Itô's formula, we have
Define for $i = 1, 2$, and
Plugging equations (3.2) and (3.4) into the optimality conditions (2.2) and (2.3):
From the last equality, we have
and consequently
In view of (3.8), we have
and therefore,
We have
In view of (3.3) and (2.1), we have
We expect the following system for $(P_1, P_2)$:
The last equation can be rewritten into the following one:
where for $S \in \mathbb{S}^n_+$,
We have the following representation for $M_1$ and $M_2$:
Proof. First, we show that $U(S) \ge 0$. In fact, we have (setting $D_2 := S^{1/2} D_2$)
Here we have used the following well-known matrix inequality:
for $D \in \mathbb{R}^{n\times m}$, and positive matrices $F \in \mathbb{S}^n$ and $R \in \mathbb{S}^m$.
Using again the inequality (3.27), we have (setting
The proof is complete.

Existence and uniqueness of optimal control
Theorem 3.2 Assume that $R_1 \gg 0$ and $R_2 \gg 0$. Then Riccati equations (3.21) and (3.23) have unique nonnegative solutions $P_1$ and $P_2$. The optimal control is unique and has the following feedback form:
The optimal feedback system is given by
It is a mean-field stochastic differential equation. The expected optimal state $\bar{X}^*_t$ is governed by the following ordinary differential equation:
and $\tilde{X}^*_t$ is governed by the following stochastic differential equation:
The optimal value is given by
We can check that $(X^*, u^*, p^*, k^*)$ is the solution to the FBSDE and satisfies the optimality condition. Hence, $u^*$ is optimal.
which is a linear Liapunov equation. Riccati equation (3.23) takes the following form: The optimal control takes the following feedback form:

Some solvable singular cases
In this section, we study the possibility of $R_1 = 0$ or $R_2 = 0$. We have
Theorem 5.1 Assume that $R_1 \gg 0$ and
Then Riccati equations (3.21) and (3.23) have unique nonnegative solutions $P_1 \gg 0$ and $P_2$, respectively. The optimal control is unique and has the following feedback form:
The optimal feedback system and the optimal value take identical forms to those of Theorem 3.2.
Proof. In view of the conditions (5.1), the existence and uniqueness of the solution $P_1 \gg 0$ to Riccati equation (3.21) can be found in Kohlmann and Tang [6, Theorem 3.13, page 1140], and those of the solution $P_2 \ge 0$ to Riccati equation (3.23) come from the fact that $\Lambda(P_1) \gg 0$ as a consequence of the condition that $R_1 \gg 0$.
Other assertions can be proved in the same manner as in Theorem 3.2.
Theorem 5.2 Assume that $R_2 \gg 0$ and
Then Riccati equations (3.21) and (3.23) have unique nonnegative solutions $P_1 \gg 0$ and $P_2$, respectively. The optimal control is unique and has the following feedback form:
The optimal feedback system and the optimal value take identical forms to those of Theorem 3.2.
Proof. The existence and uniqueness of the solution $P_1$ to Riccati equation (3.21) are well known. In view of the condition $G > 0$, we have $P_1 \gg 0$. We now prove those of the solution $P_2 \ge 0$ to Riccati equation (3.23).
In view of the well-known matrix inverse formula:
for $B \in \mathbb{R}^{n\times m}$, $C \in \mathbb{R}^{m\times n}$ and invertible matrices $A \in \mathbb{R}^{n\times n}$, $D \in \mathbb{R}^{m\times m}$ such that $A + BD^{-1}C$ and $D + CA^{-1}B$ are invertible, we have the following identity:
Noting the condition $(D_1)'D_1 \gg 0$, we have $\Lambda(P_1) \gg 0$.
Other assertions can be proved in the same manner as in Theorem 3.2.
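The matrix inverse formula invoked in the proof can be checked numerically. The sketch below assumes it is the standard Woodbury identity, $(A + BD^{-1}C)^{-1} = A^{-1} - A^{-1}B(D + CA^{-1}B)^{-1}CA^{-1}$, whose invertibility hypotheses match those stated in the proof; the exact form displayed in the paper is not available here. The function name `woodbury_inverse` and the random test matrices are hypothetical.

```python
import numpy as np

def woodbury_inverse(A, B, C, D):
    """Inverse of A + B D^{-1} C via the Woodbury identity."""
    A_inv = np.linalg.inv(A)
    inner = D + C @ A_inv @ B          # must be invertible, as assumed in the text
    return A_inv - A_inv @ B @ np.linalg.solve(inner, C @ A_inv)

rng = np.random.default_rng(0)
n, m = 4, 2
M = rng.normal(size=(n, n))
N = rng.normal(size=(m, m))
A = M @ M.T + n * np.eye(n)            # symmetric positive definite, hence invertible
D = N @ N.T + m * np.eye(m)            # symmetric positive definite, hence invertible
B = rng.normal(size=(n, m))
C = B.T                                # this choice keeps both sums positive definite

direct = np.linalg.inv(A + B @ np.linalg.inv(D) @ C)
via_identity = woodbury_inverse(A, B, C, D)
identity_holds = np.allclose(direct, via_identity)
```

Taking $C = B'$ with positive definite $A$ and $D$ is only a convenient way to guarantee the invertibility hypotheses in the test; the identity itself does not require symmetry.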

The infinite time-horizon case
In this section, we consider the time-invariant situation of all the coefficients $A$, $B$, $C$, $D$, $Q$ and $R$ in the linear control stochastic differential equation (SDE)
and the quadratic cost functional
The admissible class of controls for the deterministic controller $u_1$ is $L^2(0,\infty;\mathbb{R}^{l_1})$ and for the random controller $u_2$ is $L^2_{\mathcal{F}}(0,\infty;\mathbb{R}^{l_2})$. For simplicity of subsequent exposition, we assume that $Q > 0$.
Assumption 6.1 There is $K \in \mathbb{R}^{l_2\times n}$ such that the unique solution $X$ to the following linear matrix stochastic differential equation
That is, our linear control system (6.1) is stabilizable using only control $u_2$.
We have
Lemma 6.2 Assume that $Q > 0$ and Assumption 6.1 is satisfied. Then the algebraic Riccati equations
and
$$P_2 A(P_1) + A'(P_1) P_2 + Q(P_1) - P_2 N(P_1) P_2 = 0 \tag{6.4}$$
have positive solutions $P_1$ and $P_2$. Here for $S \in \mathbb{S}^n_+$,
Proof. Existence and uniqueness of the positive solution $P_1$ to the algebraic Riccati equation (6.3) are well known, and are referred to Wu
For any $T > 0$, let $P^T_1$ and $P^T_2$ be the unique solutions to Riccati equations (3.21) and (3.23), with $G = 0$. It is well known that $P^T_1$ converges to the constant matrix $P_1$ as $T \to \infty$. We now show the convergence of $P^T_2$. Firstly, $P^T_2(t)$ is nondecreasing in $T$ for any $t \ge 0$ due to the following representation formula:
for $(t, x) \in [0,T] \times \mathbb{R}^n$, whose proof is identical to that of the formula (3.33). From Assumption 6.1, it is straightforward to show that there is $C_t > 0$ such that $|P^T_2(t)| \le C_t$. Then $P^T_2(t)$ converges to $P_2(t)$ as $T \to \infty$. Furthermore, since all the coefficients are time-invariant and $(P^T_1(T), P^T_2(T)) = 0$ for any $T > 0$, we have
$$\big(P^{T+s}_1(t+s),\ P^{T+s}_2(t+s)\big) = \big(P^T_1(t),\ P^T_2(t)\big). \tag{6.6}$$
Taking the limit $T \to \infty$ yields that $P_2(t+s) = P_2(t)$. Therefore, $P_2$ is a constant matrix.
Taking the limit $T \to \infty$ in the integral form of Riccati equation (3.23), we show that $P_2$ solves the algebraic Riccati equation (6.4).
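The limiting argument in this proof (monotonicity in $T$, boundedness, convergence to a constant solution of an algebraic equation) can be illustrated on a scalar example. The sketch below is an assumption-laden stand-in: it uses the standard scalar Riccati equation for a single adapted control, $\dot P + (2a + c^2)P + q - (b + cd)^2 P^2/(r + d^2 P) = 0$ with $P^T(T) = 0$, rather than the paper's equations (3.21) and (3.23), and the function names and parameter values are hypothetical.

```python
import numpy as np

def rhs(P, a, b, c, d, q, r):
    """Riccati right-hand side in the time-to-go variable."""
    return (2 * a + c**2) * P + q - ((b + c * d) * P) ** 2 / (r + d**2 * P)

def finite_horizon_value(T, a, b, c, d, q, r, h=1e-3):
    """P^T(0) for terminal condition P^T(T) = 0, computed with RK4 steps of size h."""
    P = 0.0
    args = (a, b, c, d, q, r)
    for _ in range(int(round(T / h))):
        s1 = rhs(P, *args)
        s2 = rhs(P + 0.5 * h * s1, *args)
        s3 = rhs(P + 0.5 * h * s2, *args)
        s4 = rhs(P + h * s3, *args)
        P += h * (s1 + 2 * s2 + 2 * s3 + s4) / 6.0
    return P

def algebraic_riccati_root(a, b, c, d, q, r):
    """Positive root of the scalar algebraic equation rhs(P) = 0."""
    lam, beta = 2 * a + c**2, b + c * d
    # rhs(P) * (r + d^2 P) = 0 is a quadratic in P
    roots = np.roots([lam * d**2 - beta**2, lam * r + q * d**2, q * r])
    real = roots[np.abs(roots.imag) < 1e-12].real
    return float(real[real > 0].max())

params = dict(a=0.5, b=1.0, c=0.2, d=0.1, q=1.0, r=1.0)
horizons = [1, 2, 5, 10, 20]
values = [finite_horizon_value(T, **params) for T in horizons]
P_inf = algebraic_riccati_root(**params)
```

In this example `values` increases with the horizon and approaches `P_inf`, mirroring the monotone convergence used in the proof; the uncontrolled system is unstable here ($a > 0$), so the convergence relies on the control channel.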
Then the optimal control is unique and has the following feedback form:
Define $\bar{X}^*_t := E[X^*_t]$ and $\tilde{X}^*_t := X^*_t - \bar{X}^*_t$. The optimal feedback system is given by
It is a mean-field stochastic differential equation. The expected optimal state $\bar{X}^*_t$ is governed by the following ordinary differential equation:
and $\tilde{X}^*_t$ is governed by the following stochastic differential equation:
The optimal value is given by
$$J(u^*) = \langle P_2 X(0), X(0)\rangle. \tag{6.11}$$
Proof. The uniqueness of the optimal control is an immediate consequence of the strict convexity of the cost functional in both control variables $u_1$ and $u_2$. We now show that $u^*$ is optimal.
For $0 \le s \le T < \infty$, let $(u^{*,T}, X^{*,T})$ be the optimal pair corresponding to the time horizon $T > 0$, and denote the associated adjoint process by $p^T$. Using Itô's formula to compute the inner product $\langle p^T, X^{*,T}\rangle$, and noting that $p^T_s = P^T_1(s)\tilde{X}^{*,T}_s + P^T_2(s)\bar{X}^{*,T}_s$, we have
$$E\big\langle P^T_1(s)\tilde{X}^{*,T}_s + P^T_2(s)\bar{X}^{*,T}_s,\ X^{*,T}_s\big\rangle + J_s(u^{*,T}) = \langle P^T_2(0)x, x\rangle. \tag{6.12}$$