We are given the function f(t,x,y,z) defined on \(\left [0,T\right ]\times {\mathcal {R}}^{k}\times {\mathcal {R}}\times {\mathcal {R}}^{k}\), function \(\Phi (x), x\in {\mathcal {R}}^{k}\) and k-dimensional diffusion process (forward)
$$\begin{array}{@{}rcl@{}} \mathrm{d}X_{t}=S\left(\vartheta,t,X_{t}\right)\mathrm{d}t+\varepsilon \sigma \left(t,X_{t}\right)\,\mathrm{d}W_{t}, \quad X_{0}, \; 0\leq t\leq T. \end{array} $$
(9)
Here \(\vartheta \in \Theta \subset {\mathcal {R}}^{d}\), Θ is an open bounded set and \(W_{t}=\left ({W^{1}_{t}},\ldots,{W^{k}_{t}}\right),0\leq t\leq T\) is a standard k-dimensional Wiener process.
Introduce the condition \({\mathfrak L}\).
The functions
f(t,x,y,z), Φ(x), vector
S(𝜗,t,x)=(S
l
(𝜗,t,x), l=1,…,k)and
k×k
matrix
σ(t,x)=(σ
lm
(t,x))are smooth
$$\begin{array}{@{}rcl@{}} &&\left|S\left(\vartheta,t,x\right)-S\left(\vartheta,t,y\right) \right|+\left|\sigma \left(t,x\right)-\sigma \left(t,y\right) \right|\leq L\left|x-y\right|,\\ &&\left|f\left(t,x,y_{1},z_{1}\right)-f\left(t,x,y_{2},z_{2}\right) \right|\leq C \left[\left|y_{1}-y_{2}\right|+ \left|z_{1}-z_{2}\right|\right] \end{array} $$
and satisfy (p>0)
$$\begin{array}{@{}rcl@{}} &&\left| S \left(\vartheta,t,x\right)\right|+\left| \sigma \left(t,x\right)\right|\leq C\left(1+\left|x\right|\right),\\ &&\left|f\left(t,x,y,z\right)\right|+\left|\Phi (x)\right|\leq C\left(1+ \left|x\right|^{p}\right). \end{array} $$
We must find a couple of stochastic processes \(\left (X_{t,\varepsilon }^{\star }, Z_{t,\varepsilon }^{\star },0\leq t\leq T\right)\) which approximate well the solution of the BSDE
$$ \mathrm{d}Y_{t}=-f\left(t,X_{t},Y_{t},Z_{t}\right)\,\mathrm{d}t+Z_{t}\,\mathrm{d}W_{t},\qquad Y_{0},\quad 0\leq t\leq T $$
(10)
satisfying the condition Y
T
=Φ(X
T
).
Let us denote \( x_{t}\left (\vartheta \right)=\left (x_{t}^{(1)}(\vartheta),\ldots, x_{t}^{(k)}(\vartheta)\right),0\leq t\leq T\) the solution of the system of ordinary differential equations
$$\begin{array}{@{}rcl@{}} \frac{\mathrm{d}x_{t}(\vartheta) }{\mathrm{d}t}=S\left(\vartheta,t,x_{t}(\vartheta)\right),\qquad x_{0},\qquad 0\leq t\leq T. \end{array} $$
The true value is 𝜗
0 and we let x
t
=x
t
(𝜗
0). We have the estimates: with probability 1
$$\begin{array}{@{}rcl@{}} \sup\limits_{0\leq t\leq T}\left|X_{t}-x_{t}\right|\leq C\varepsilon \sup\limits_{0\leq t\leq T} \left|W_{t}\right| \end{array} $$
(11)
and for any p>0
$$\begin{array}{@{}rcl@{}} &\sup\limits_{0\leq t\leq T}{\mathbf{E}}_{\vartheta_{0}}\left|X_{t}-x_{t}\right|^{p}\leq C\varepsilon^{p}. \end{array} $$
(12)
For the proof see, e.g., Kutoyants (1994).
We have a family of problems of parameter estimation by observations X
t=(X
s
,0≤s≤t), where t∈(0,T] and therefore we need a family of estimators
\({\bar {\vartheta }}_{t,\varepsilon }, 0<t\leq T\). Let \(\left ({\mathcal {C}}^{k}\left (\left [0,t\right ]\right),{\mathfrak B}_{t}\right)\) be a measurable space of continuous vector-functions on [0,t] with Borelian σ-algebra \({\mathfrak B}_{t} \). Denote by \(\left \{\mathbf {P}_{\vartheta }^{\left (\varepsilon,t \right)},\vartheta \in \Theta \right \} \) the family of measures induced in this space by the solutions of (9) with different 𝜗∈Θ. Note that these measures are equivalent (see Liptser and Shiryaev (2001)) and the likelihood ratio function is
$$\begin{array}{@{}rcl@{}} L\left(\vartheta,X^{t}\right)&=&\frac{\mathrm{d} \mathbf{P}_{\vartheta }^{\left(\varepsilon,t \right)}}{\mathrm{d} \mathbf{P}_{0 }^{\left(\varepsilon,t \right)}}\left(X^{t}\right)=\exp\left\{\frac{1}{\varepsilon^{2}}{{\int\nolimits}_{0}^{t}} S\left(\vartheta,s,X_{s}\right) \mathbb{A} \left(s,X_{s}\right)^{-1}\mathrm{d}X_{t}\right.\\ && \left. -\frac{1}{2\varepsilon^{2}}{{\int\nolimits}_{0}^{t}} S\left(\vartheta,s,X_{s}\right) \mathbb{A} \left(s,X_{s}\right)^{-1}S\left(\vartheta,s,X_{s}\right)\mathrm{d}s\right\},\quad \vartheta \in \Theta. \end{array} $$
Here \(\mathbf {P}_{0 }^{\left (\varepsilon,t \right)} \) is the measure, which corresponds to the observations (9) with S(𝜗,t,X
t
)≡0. The matrix \(\mathbb {A}\left (s,x\right)\) is
$$\begin{array}{@{}rcl@{}} \mathbb{A}_{lm}\left(s,x\right)=\left[\sigma \left(s,x\right)^{*}\sigma \left(s,x\right)\right]_{lm},\qquad l,m=1,\ldots,k. \end{array} $$
Recall that the MLE \({\hat {\vartheta }}_{\varepsilon,t}\) is defined by the equation
$$\begin{array}{@{}rcl@{}} L\left({\hat{\vartheta}}_{\varepsilon,t},X^{t}\right)=\sup\limits_{\vartheta \in \Theta }L\left(\vartheta,X^{t}\right). \end{array} $$
(13)
Introduce the Regularity conditions
\({\mathfrak R}\).
-
1.
The function
S(𝜗,t,x)is two-times continuously differentiable w.r.t.
𝜗
and the derivatives are Lipschitz in x.
-
2.
We suppose that there exists a positive constant m such that for any real
\(\lambda \in {\mathcal {R}}^{k}\)
we have
$$\begin{array}{@{}rcl@{}} m^{-1}\left\|\lambda \right\|^{2}\leq \lambda^{*} \mathbb{A}\left(s,x\right)\lambda \leq m\left\|\lambda \right\|^{2}. \end{array} $$
(14)
-
3.
The Fisher information matrix
$$\begin{array}{@{}rcl@{}} \mathbb{I}_{t}(\vartheta)={{\int\nolimits}_{0}^{t}}\dot{\mathbb{S}}\left(\vartheta,s,x_{s}(\vartheta)\right)^{*}\mathbb{A}\left(s,x_{s}(\vartheta)\right)^{-1}\dot{\mathbb{S}}\left(\vartheta,s,x_{s}(\vartheta)\right)\,\mathrm{d}s, \end{array} $$
is uniformly nondegenerate:
$$\begin{array}{@{}rcl@{}} \inf\limits_{\vartheta \in\Theta }\inf\limits_{\left|\lambda \right|=1} \lambda^{*}\mathbb{I}_{t}(\vartheta)\lambda >0 \end{array} $$
Here
\(\lambda \in {\mathcal {R}}^{d}\)
dot means derivation w.r.t.
𝜗 and \(\dot {\mathbb {S}}\left (\vartheta,s,x\right) \) is k×d-matrix.
-
4.
Identifiability condition : for any
ν>0and any
t∈(0,T]the estimate
$$\begin{array}{@{}rcl@{}} \inf\limits_{\vartheta_{0}\in\Theta }\inf\limits_{\left|\vartheta -\vartheta_{0}\right|>\nu }{{\int\nolimits}_{0}^{t}}\delta \left(s,x_{s},\vartheta,\vartheta_{0}\right)^{*}\mathbb{A}\left(s,x_{s}\right)^{-1}\delta \left(s,x_{s},\vartheta,\vartheta_{0}\right)\mathrm{d}s>0 \end{array} $$
holds. Here δ(s,x
s
,𝜗,𝜗
0)=S(𝜗,s,x
s
)−S(𝜗
0,s,x
s
).
The Regularity conditions allow us to prove the folowing properties of the MLE \({\hat {\vartheta }}_{\varepsilon,t}, t\in (0,T]\).
-
1.
It is uniformly consistent: for any
ν>0and any compact
K⊂Θ
$$\begin{array}{@{}rcl@{}} \lim\limits_{\varepsilon \rightarrow 0}\sup\limits_{\vartheta_{0}\in \textbf{K}} \mathbf{P}_{\vartheta_{0} }^{\left(\varepsilon,t \right)}\left(\left|{\hat{\vartheta}}_{\varepsilon,t}-\vartheta_{0} \right|>\nu \right)=0. \end{array} $$
-
2.
Uniformly on compacts
K⊂Θ
asymptotically normal
$$\begin{array}{@{}rcl@{}} \varepsilon^{-1}\left({\hat{\vartheta}}_{\varepsilon,t}-\vartheta_{0} \right)\Longrightarrow {\mathcal{N}}\left(0,\mathbb{I}_{t}\left(\vartheta_{0} \right)^{-1}\right). \end{array} $$
-
3.
The polynomial moments converge and it is asymptotically efficient.
These properties were established in Kutoyants (1994) in the case of the one-dimensional diffusion processes (9). There is no essential dificulties to apply the same proof in our case. The presented below multi-step MLE-processes have exactly the same asymptotic properties, but can be calculated more easily.
Introduce the family of functions
$$\begin{array}{@{}rcl@{}} {\mathcal{U}}=\left\{\left(u(t,x,\vartheta), t\in \left[0,T\right], x\in {\mathcal{R}}^{k}\right), \vartheta \in \Theta \right\} \end{array} $$
such that for all 𝜗∈Θ the function u(t,x,𝜗) satisfies the PDE
$$\begin{array}{@{}rcl@{}} &&\frac{\partial u}{\partial t}+\sum\limits_{l=1}^{k}S_{l}(\vartheta,t,x)\frac{\partial u}{\partial x_{l}} +\frac{\varepsilon^{2}}{2}\sum\limits_{l,m=1}^{k}\mathbb{A}_{l,m}\left(t,x\right) \frac{\partial^{2} u}{\partial x_{l}\partial x_{m}}\\ &&\qquad \qquad \qquad =-f\left(t,x,u,\varepsilon\sum\limits_{l=1}^{k} \sigma_{lm} (t,x)\frac{\partial u}{\partial x_{m}}\right) \end{array} $$
and the condition \(u(T,x,\vartheta)=\Phi (x), x\in {\mathcal {R}}^{k}\).
The limit of the function u(t,x,𝜗) as ε→0 we denote as u
∘(t,x,𝜗). The function u
∘(t,x,𝜗) satisfies the equation
$$\begin{array}{@{}rcl@{}} &\frac{\partial u_{\circ}}{\partial t}+\sum\limits_{l=1}^{k}S_{l}(\vartheta,t,x)\frac{\partial u_{\circ}}{\partial x_{l}} =-f\left(t,x,u_{\circ},\varepsilon\sum\limits_{l=1}^{k} \sigma_{lm} (t,x)\frac{\partial u_{\circ}}{\partial x_{m}}\right) \end{array} $$
with the final value u
∘(T,x,𝜗)=Φ(x). Below \(\dot u_{\circ }\left (t,x,\vartheta \right) \) and \(\ddot u_{\circ }\left (t,x,\vartheta \right) \) means the derivative of this function w.r.t. 𝜗.
Introduce the condition \({\mathfrak U}\)
-
1.
The function
u(t,x,𝜗)is two-times continuously differentiable w.r.t.
𝜗
and the derivatives
\(\dot u\left (t,x,\vartheta \right)\)
and
\(\ddot u\left (t,x,\vartheta \right)\)
are Lipschitz w.r.t. x unifoormly in
𝜗∈Θ.
-
2.
The function
u(t,x,𝜗)and its derivatives
\(\dot u\left (t,x,\vartheta \right) \)
and
\(\dot u'_{x}\left (t,x,\vartheta \right) \)
converge uniformly in
t∈[0,T]to
\(u_{\circ }\left (t,x,\vartheta \right),\dot u_{\circ }\left (t,x,\vartheta \right),\dot u_{\circ,x}'\left (t,x,\vartheta \right)\)
respectively.
The sufficient conditions providing these properties of u(t,x,𝜗) can be found in Freidlin and Wentzell (1998), Theorem 2.3.1. Note that the derivatives \(\dot u\left (t,x,\vartheta \right)\) and \(\ddot u\left (t,x,\vartheta \right)\) satisfy the linear PDE of the same type.
If we let Y
t
=u(t,X
t
,𝜗), then by Itô’s formula we obtain BSDE (10) with
$$Z_{t}=\left({Z_{t}^{1}},\ldots, {Z_{t}^{k}}\right),\qquad {Z_{t}^{m}}=\varepsilon \sum\limits_{l=1}^{k}\sigma_{ml} \left(t,X_{t}\right) u'_{x_{l}} \left(t,X_{t},\vartheta \right). $$
Recall that our goal is to construct an asymptotically efficient approximation of the couple (Y
t
,Z
t
). To compare all possible estimators we introduce the lower bounds on the mean-square risks. This is a version of the well-known Hajek-Le Cam minimax risk bound (see, e.g., Ibragimov and Has’minskii (1981), Theorem 2.12.1).
Theorem 1
Suppose that the conditions \({\mathfrak L}, {\mathfrak R}\) and \( {\mathfrak U}\) are fulfilled. Then for all estimators \({\bar {Y}}_{t}\) and \({\bar {Z}}_{t}\) and all t∈(0,T] we have the relations
$$\begin{array}{@{}rcl@{}} && {\lim}_{\overline{\nu \rightarrow 0}}{\lim}_{\overline{\varepsilon \rightarrow 0}} \sup\limits_{\left|\vartheta -\vartheta_{0}\right|\leq \nu} \varepsilon^{-2}{\mathbf{E}}_{\vartheta} \left| {\bar{Y}}_{t}-Y_{t}\right|^{2}\geq \dot u_{\circ}\left(t,x_{t},\vartheta_{0}\right)^{*}\mathbb{I}_{t}\left(\vartheta_{0}\right)^{-1}\dot u_{0}\left(t,x_{t},\vartheta_{0}\right), \end{array} $$
(15)
$$\begin{array}{@{}rcl@{}} &&{\lim}_{\overline{\nu \rightarrow 0}}{\lim}_{\overline{\varepsilon \rightarrow 0}} \sup_{\left|\vartheta -\vartheta _{0}\right|\leq \nu} \varepsilon^{-4}{\mathbf{E}}_{\vartheta} \left| {\bar{Z}}_{t}-Z_{t}\right|^{2} \geq \left|{\left(\dot u_{\circ}\right)'_{x}\left(t,x_{t},\vartheta _{0}\right)^{*}{\mathbb{I}_{t}}\left(\vartheta_{0}\right)^{-\frac{1}{2}} \sigma \left(t,x_{t}\right)}\right|^{2}. \end{array} $$
(16)
Proof
We first verify that the family of measures is locally asymptotically normal (LAN) and then we apply the proof of the Hajek-Le Cam lower bound (Ibragimov and Has’minskii 1981), which provides us (15), (16). We present here the necessary modification of the proof given in Ibragimov and Has’minskii (1981). Usually this inequality is considered for the risk like \({\mathbf {E}}_{\vartheta } \left |{\bar {\vartheta }}_{\varepsilon } -\vartheta \right |^{2}\) and we are interested by the risk \({\mathbf {E}}_{\vartheta } \left |{\bar {Y}}_{t,\varepsilon } -Y_{t} \right |^{2}\), where Y
t
is a random process. Another point, the random vector Δ
t
(see below), in general, is asymptotically normal and, in our case, it has Gaussian distribution, that is why the proof is slightly simplified. □
Let us denote \(\varphi _{\varepsilon } =\varepsilon {\mathbb {I}_{t}}^{-\frac {1}{2}} \) where \({\mathbb {I}_{t}}={\mathbb {I}_{t}}\left (\vartheta _{0}\right)\) and introduce the normalized likelihood ratio
$$\begin{array}{@{}rcl@{}} Z_{t,\varepsilon }(v)=\frac{L\left(\vartheta_{0}+\varphi_{\varepsilon} v,X^{t}\right)}{L\left(\vartheta_{0},X^{t}\right)},\qquad v\in V_{\varepsilon} =\left\{v: \vartheta_{0}+\varphi_{\varepsilon} v\in \Theta \right\}. \end{array} $$
We can write
$$\begin{array}{@{}rcl@{}} \ln Z_{t,\varepsilon }(v)&=&\frac{1}{\varepsilon }{{\int\nolimits}_{0}^{t}}\left[S\left(\vartheta_{0}+\varphi_{\varepsilon} v,s,X_{s}\right)-S\left(\vartheta_{0},s,X_{s}\right)\right]\sigma \left(s,X_{s}\right)^{-1}\mathrm{d}W_{s}\\ &&-\frac{1}{2\varepsilon^{2} }{{\int\nolimits}_{0}^{t}}\left|\left[S\left(\vartheta_{0}+\varphi_{\varepsilon} v,s,X_{s}\right)-S\left(\vartheta_{0},s,X_{s}\right)\right]^{*}\sigma \left(s,X_{s}\right)^{-1}\right|^{2}\mathrm{d}s\\ &=&v^{*}\Delta_{t }-\frac{1}{2}\left|v\right|^{2} +r_{\varepsilon}, \end{array} $$
where r
ε
→0 and the vector
$$\begin{array}{@{}rcl@{}} \Delta_{t}={\mathbb{I}_{t}}\left(\vartheta_{0}\right)^{-\frac{1}{2}} {{\int\nolimits}_{0}^{t}}\dot{\mathbb{S}}\left(\vartheta_{0},s,x_{s}\right) \sigma \left(s,x_{s}\right)^{-1}\mathrm{d}W_{s} \sim {\mathcal{N}} \left(0, \mathbb{J} \right). \end{array} $$
Here \(\mathbb {J}\) is a unit d×d matrix.
Hence, the family of measures \(\left \{\mathbf {P}_{\vartheta }^{\left (\varepsilon,t \right)},\vartheta \in \Theta \right \} \) is LAN in Θ (Ibragimov and Has’minskii 1981; Kutoyants 1994).
Below, M>0, \({\mathcal {K}}_{M}\) is a cube in \({\mathcal {R}}^{d}\) whose vertices have coordinates ±M, so that its volume is (2M)d and 𝜗
v
=𝜗
0+φ
ε
v.
$$\begin{array}{@{}rcl@{}} &&\sup\limits_{\left|\vartheta -\vartheta_{0}\right|\leq \nu} {\mathbf{E}}_{\vartheta} \left| {\bar{Y}}_{t,\varepsilon }-Y_{t}\right|^{2}=\sup\limits_{\left|\vartheta -\vartheta_{0}\right|\leq \nu } {\mathbf{E}}_{\vartheta} \left| {\bar{Y}}_{t}-u\left(t,X_{t},\vartheta \right)\right|^{2}\\ &&\qquad \quad =\sup\limits_{\left|\varphi_{\varepsilon} v \right|\leq \nu} {\mathbf{E}}_{\vartheta_{v}} \left| {\bar{Y}}_{t}-u\left(t,X_{t},\vartheta_{0}+\varphi_{\varepsilon} v \right)\right|^{2}\\ &&\qquad \quad\geq \sup\limits_{v\in {\mathcal{K}}_{M}} {\mathbf{E}}_{\vartheta_{v}} \left| {\bar{Y}}_{t}-u\left(t,X_{t},\vartheta_{0}+\varphi_{\varepsilon} v \right)\right|^{2}\\ &&\qquad \quad\geq \frac{1}{\left(2M\right)^{d}}{\int\nolimits}_{{\mathcal{K}}_{M}}^{}{\mathbf{E}}_{\vartheta_{v}} \left| {\bar{Y}}_{t}-u\left(t,X_{t},\vartheta_{0}+\varphi_{\varepsilon} v \right)\right|^{2}\mathrm{d}v\\ &&\qquad \quad =\frac{1}{\left(2M\right)^{d}}{\int\nolimits}_{{\mathcal{K}}_{M}}^{}{\mathbf{E}}_{\vartheta_{0}}Z_{t,\varepsilon }(v) \left| {\bar{Y}}_{t}-u\left(t,X_{t},\vartheta_{0}+\varphi_{\varepsilon} v \right)\right|^{2}\mathrm{d}v. \end{array} $$
We have
$$\begin{array}{@{}rcl@{}} u\left(t,X_{t},\vartheta_{0}+\varphi_{\varepsilon} v \right)&=&u\left(t,X_{t},\vartheta_{0} \right)+ \dot u\left(t,X_{t},\vartheta_{0} \right)^{*}\varphi_{\varepsilon} v\\ &&+\varepsilon {{\int\nolimits}_{0}^{1}}\left[\dot u\left(t,X_{t},\vartheta_{0}+r\varphi_{\varepsilon} v \right)-\dot u\left(t,X_{t},\vartheta_{0} \right)\right]^{*}{\mathbb{I}}_{t}^{-\frac{1}{2}} v\,\mathrm{d}r\\ & =& u\left(t,X_{t},\vartheta_{0} \right)+ \dot u\left(t,X_{t},\vartheta_{0} \right)^{*}\varphi_{\varepsilon} v+\varphi_{\varepsilon} h_{\varepsilon}, \end{array} $$
where |h
ε
|≤C
ε. Hence, if we denote
$${\bar{b}}_{\varepsilon} =\varepsilon^{-1}\left({\bar{Y}}_{t}- u\left(t,X_{t},\vartheta_{0} \right) \right),\qquad \dot u=\dot u\left(t,X_{t},\vartheta_{0} \right)^{*} {\mathbb{I}}_{t}^{-\frac{1}{2}} $$
and introduce such vector \({\bar {v}}_{\varepsilon } \) that \({\bar {b}}_{\varepsilon } = \dot u\left (t,X_{t},\vartheta _{0} \right)^{*}{\mathbb {I}}_{t}^{-\frac {1}{2}}{\bar {v}}_{\varepsilon } \), then we can write
$$\begin{array}{@{}rcl@{}} &&\varepsilon^{-2}{\mathbf{E}}_{\vartheta_{0}}Z_{t,\varepsilon }(v) \left| {\bar{Y}}_{t}-u\left(t,X_{t},\vartheta_{0}+\varphi_{\varepsilon} v \right)\right|^{2}\\ &&\qquad\qquad \qquad\;={\mathbf{E}}_{\vartheta_{0}}Z_{t,\varepsilon }(v) \left| {\bar{b}}_{t}-\dot u\left(t,X_{t},\vartheta_{0} \right)^{*}{\mathbb{I}}_{t}^{-\frac{1}{2}} v\right|^{2}\left(1+O(\varepsilon)\right)\\ &&\qquad\qquad \qquad\; ={\mathbf{E}}_{\vartheta_{0}}Z_{t,\varepsilon }(v) \left| \dot u^{*}\left({\bar{v}}_{\varepsilon}- v\right)\right|^{2}\left(1+O(\varepsilon)\right). \end{array} $$
Further, we use the following result known as Scheffé’s lemma
Lemma 1
Let the random variables Z
ε
≥0,ε∈(0,1] converge in probability to the random variable Z≥0 as ε→0 and E
Z
ε
=E
Z=1, then
$$\begin{array}{@{}rcl@{}} \lim\limits_{\varepsilon \rightarrow 0} {\mathbf{E}} \left|Z_{\varepsilon}- Z\right|=0. \end{array} $$
For the proof see, e.g., Theorem A.4 in Ibragimov and Has’minskii (1981).
Recall that \({\mathbf {E}}_{\vartheta _{0}} Z_{t,\varepsilon }(v)={\mathbf {E}}_{\vartheta _{0}} Z_{t }(v)=1\), where \(\ln Z_{t }(v)=v^{*}\Delta _{t}-\frac {1}{2}\left |v\right |^{2} \). Hence for any K>0
$$\begin{array}{@{}rcl@{}} {\mathbf{E}}_{\vartheta_{0}}Z_{t,\varepsilon }(v) \left| \dot u^{*}\left({\bar{v}}_{\varepsilon}- v\right)\right|_{K}^{2}={\mathbf{E}}_{\vartheta_{0}}Z_{t }(v) \left| \dot u^{*}\left({\bar{v}}_{\varepsilon}- v\right)\right|_{K}^{2}\left(1+o(1)\right). \end{array} $$
Here we denoted \(\left |D\right |^{2}_{K}=\left |D\right |^{2}\wedge K\). These allow us to write
$$\begin{array}{@{}rcl@{}} &&\frac{1}{\left(2M\right)^{d}}{\int\nolimits}_{{\mathcal{K}}_{M}}^{}{\mathbf{E}}_{\vartheta_{0}}Z_{t,\varepsilon }(v) \left| {\bar{Y}}_{t}-u\left(t,X_{t},\vartheta_{0}+\varepsilon v \right)\right|^{2}\mathrm{d}v\\ &&\qquad\geq \frac{1}{\left(2M\right)^{d}}{\int\nolimits}_{{\mathcal{K}}_{M}}^{}{\mathbf{E}}_{\vartheta_{0}}Z_{t,\varepsilon }(v) \left| {\bar{Y}}_{t}-u\left(t,X_{t},\vartheta_{0}+\varepsilon v \right)\right|_{K}^{2}\mathrm{d}v\\ &&\qquad=\frac{1}{\left(2M\right)^{d}}{\int\nolimits}_{{\mathcal{K}}_{M}}^{}{\mathbf{E}}_{\vartheta_{0}}Z_{t }(v)\left| \dot u^{*}\left({\bar{v}}_{\varepsilon}- v\right)\right|_{K}^{2}\mathrm{d}v\left(1+o(1)\right). \end{array} $$
Then
$$\begin{array}{@{}rcl@{}} Z_{t }(v)=\exp\left\{v^{*}\Delta_{t}-\frac{1}{2}\left|v\right|^{2}\right\}=\exp\left\{-\frac{1}{2} \left|v-\Delta_{t} \right|^{2}\right\} \exp\left\{ \frac{1}{2}\left|\Delta_{t}\right|^{2}\right\} \end{array} $$
and
$$\begin{array}{@{}rcl@{}} &&{\mathbf{E}}_{\vartheta_{0}}Z_{t }(v)\left| \dot u^{*}\left({\bar{v}}_{\varepsilon}- v\right)\right|_{K}^{2}={\mathbf{E}}_{\vartheta_{0}}e^{-\frac{1}{2}\left|v-\Delta_{t} \right|^{2}}\left| \dot u^{*} \left(v_{\varepsilon}-v\right)\right|_{K}^{2}e^{\frac{1}{2} \left|\Delta_{t}\right|^{2}}\\ &&\qquad ={\mathbf{E}}_{\vartheta_{0}}e^{-\frac{1}{2}\left| w \right|^{2}}\left| \dot u^{*} \left({\bar{v}}_{\varepsilon}-\Delta_{t}- w\right)\right|_{K}^{2}e^{\frac{1}{2} \left|{\bar{\Delta}}_{t}\right|^{2}}\\ &&\qquad ={\mathbf{E}}_{\vartheta_{0}}e^{-\frac{1}{2}\left| w \right|^{2}}\left| \dot u^{*} \left(\tilde w_{\varepsilon}- w\right)\right|_{K}^{2}e^{\frac{1}{2} \left|{\bar{\Delta}}_{t}\right|^{2}} \end{array} $$
where w=v−Δ
t
and \(\tilde w_{\varepsilon } = {\bar {v}}_{\varepsilon } -\Delta _{t}\). Introduce the set \({\mathcal {C}}_{M}\) such that each coordinate of \(\Delta _{t}=\left (\Delta _{t}^{(1)},\ldots,\Delta _{t}^{(d)} \right)\) is less than \(M-\sqrt {M}\), i.e., \(\left |\Delta _{t}^{(l)} \right |\leq M-\sqrt {M}\). Then
because \({\mathcal {K}}_{\sqrt {M}}\subset {\mathcal {C}}_{{M}} \). By Andersen’s Lemma (see, e.g., Ibragimov and Has’minskii (1981), Lemma 2.10.2)
$$\begin{array}{@{}rcl@{}} {\int\nolimits}_{{\mathcal{K}}_{\sqrt{M}}}^{} \left| \dot u^{*} \left(\tilde w_{\varepsilon}- w\right)\right|_{K}^{2}e^{-\frac{1}{2}\left| w \right|^{2}}\mathrm{d}v\geq {\int\nolimits}_{{\mathcal{K}}_{\sqrt{M}}}^{} \left| \dot u^{*} w\right|_{K}^{2}e^{-\frac{1}{2}\left| w \right|^{2}}\mathrm{d}v. \end{array} $$
Note that as M→∞ we obtain the limits
and
$$\begin{array}{@{}rcl@{}} \frac{1}{\left(2\pi \right)^{\frac{d}{2}}}{\int\nolimits}_{{\mathcal{K}}_{\sqrt{M}}}^{} \left| \dot u^{*} w\right|_{K}^{2}e^{-\frac{1}{2}\left| w \right|^{2}}\mathrm{d}v&\longrightarrow {\mathbf{E}}_{\vartheta_{0}} \left|\dot u^{*}\Delta_{t}\right|_{K}^{2}. \end{array} $$
The last steps are ε→0 and K→∞
$$\begin{array}{@{}rcl@{}} {\mathbf{E}}_{\vartheta_{0}} \left|\dot u^{*}\Delta_{t}\right|_{K}^{2}\longrightarrow \dot u_{\circ}\left(t,x_{t},\vartheta_{0}\right)^{*}\mathbb{I}_{t}\left(\vartheta_{0}\right)^{-1}\dot u_{\circ}\left(t,x_{t},\vartheta_{0}\right). \end{array} $$
The detailed proof can be found in Ibragimov and Has’minskii (1981), Theorem 2.12.1.
Therefore the bound (15) is verified. The bound (16) is proved in a similar way. Note that Z
t
=ε
u
x′(t,X
t
,𝜗)σ(t,X
t
). An arbitrary estimator \({\bar {Z}}_{t}\) of Z
t
we write as \({\bar {Z}}_{t}=\varepsilon \tilde Z_{t}\). Then, for \(\varepsilon ^{-1}\left ({\bar {Z}}_{t}-Z_{t}\right)\) we follow the proof given above.
Definition
Suppose that the conditions \({\mathfrak L}, {\mathfrak R}, {\mathfrak U}\) are fulfilled. Then we call the estimator-processes \( Y_{t}^{*}, Z_{t}^{*}, 0<t\leq T\) asymptotically efficient if for all 𝜗
0∈Θ and all t∈(0,T] we have the equalities
$$\begin{array}{@{}rcl@{}} && \lim\limits_{\nu \rightarrow 0}\lim\limits_{\varepsilon \rightarrow 0} \sup\limits_{\left|\vartheta -\vartheta_{0}\right|\leq \nu} \varepsilon^{-2}{\mathbf{E}}_{\vartheta} \left| Y_{t}^{*}-Y_{t}\right|^{2}= \dot u_{\circ}\left(t,x_{t},\vartheta_{0}\right)^{*}\mathbb{I}_{t}\left(\vartheta_{0}\right)^{-1}\dot u_{\circ}\left(t,x_{t},\vartheta_{0}\right), \end{array} $$
(17)
$$\begin{array}{@{}rcl@{}} &&\lim\limits_{\nu \rightarrow 0}\lim\limits_{\varepsilon \rightarrow 0} \sup\limits_{\left|\vartheta -\vartheta _{0}\right|\leq \nu} \varepsilon^{-4}{\mathbf{E}}_{\vartheta} \left| Z_{t}^{*}-Z_{t}\right|^{2} = \left|{\left(\dot u_{\circ}\right)'_{x}\left(t,x_{t},\vartheta _{0}\right)^{*}{\mathbb{I}_{t}}\left(\vartheta_{0}\right)^{-\frac{1}{2}} \sigma \left(t,x_{t}\right)}\right|^{2}. \end{array} $$
(18)
As we do not know the value 𝜗 we propose first to estimate it using some estimator-process 𝜗
ε,t⋆,0<t≤T and then to put
$$Y_{t}^{\star}=u\left(t,X_{t},\vartheta_{\varepsilon} ^{\star} \right),\qquad Z_{t}^{\star}=\varepsilon \sum\limits_{l=1}^{k} u'_{x_{l}} \left(t,X_{t},\vartheta_{\varepsilon}^{\star} \right)\sigma_{l} \left(t,X_{t}\right). $$
Recall that formally the MLE-process \({\hat {\vartheta }}_{\varepsilon,t}, 0<t\leq T \) “solves” the problem and it can be shown that under the supposed regularity conditions the estimator-processes \({\hat {Y}}_{t,\varepsilon }=u(t,X_{t},{\hat {\vartheta }}_{\varepsilon,t}) \) and \({\hat {Z}}_{t,\varepsilon }=u'_{x}(t,X_{t},{\hat {\vartheta }}_{\varepsilon,t})\sigma \left (t,X_{t}\right) \) are asymptotically efficient in the sense of the relations (17) and (18), respectively, but this solution can not be called acceptable because the calculation of \({\hat {\vartheta }}_{\varepsilon,t}\) for all t∈(0,T], in the general case, is a computationally difficult problem. That is why we propose to use the so-called multi-step MLE-process (Kutoyants 2015), which is introduced as follows. First we construct a preliminary estimator \( {\bar {\vartheta }}_{\tau _{\varepsilon } }\) by the observations \(X^{\tau _{\varepsilon } }=\left (X_{s},0\leq s\leq \tau _{\varepsilon } \right)\) on some learning interval [0,τ
ε
], where τ
ε
=ε
δ with 0<δ<1 and then we propose an estimator-process \(\vartheta _{t,\varepsilon }^{\star }, \tau _{\varepsilon } \leq t\leq T\) based on this preliminary estimator. Finally we show that the corresponding estimators, say, \(Y_{t,\varepsilon }^{\star }=u\left (t,X_{t},\vartheta _{t,\varepsilon }^{\star }\right), \tau _{\varepsilon } \leq t\leq T\) are asymptotically efficient.
As a preliminary we propose the minimum distance estimator (MDE) \( {\bar {\vartheta }}_{\tau _{\varepsilon } }\) defined by the relation
$$\begin{array}{@{}rcl@{}} \left\|X-{\hat{X}}\left({\bar{\vartheta}}_{\tau_{\varepsilon} }\right)\right\|_{\tau_{\varepsilon} }^{2}=\inf_{\vartheta \in \Theta }\left\|X-{\hat{X}}\left(\vartheta \right)\right\|_{\tau_{\varepsilon} }^{2}=\inf_{\vartheta \in \Theta }{\int\nolimits}_{0}^{\tau_{\varepsilon} }\left[X_{t}-{\hat{X}}_{t}(\vartheta)\right]^{2}\,\mathrm{d} t. \end{array} $$
Here the family of random processes \(\left \{\left ({\hat {X}}_{t}\left (\vartheta \right),0\leq t\leq \tau _{\varepsilon } \right),\vartheta \in \Theta \right \}\) is defined as follows
$$\begin{array}{@{}rcl@{}} {\hat{X}}_{t}(\vartheta)=x_{0}+{{\int\nolimits}_{0}^{t}}S\left(\vartheta,s,X_{s}\right)\mathrm{d}s,\qquad 0\leq t\leq \tau_{\varepsilon},\qquad \vartheta \in \Theta. \end{array} $$
These estimators were studied in Kutoyants (1994) in the case of fixed τ
ε
=τ and are called the trajectory fitting estimators as well, because we choose an estimator \( {\bar {\vartheta }}_{\tau _{\varepsilon } } \), which provides a trajectory \({\hat {X}}_{t}\left ({\bar {\vartheta }}_{\tau _{\varepsilon } }\right),0\leq t\leq \tau _{\varepsilon } \) closest to the observations X
t
,0≤t≤τ
ε
. It was shown that if the conditions of regularity and the condition of identifiability: for any ν>0
$$\begin{array}{@{}rcl@{}} \inf\limits_{\vartheta_{0}\in \Theta }\inf\limits_{\left|\vartheta -\vartheta_{0}\right|>\nu }{\int\nolimits}_{0}^{\tau }\left| {{\int\nolimits}_{0}^{t}} \left[S\left(\vartheta,s,x_{s}\right)- S\left(\vartheta_{0},s,x_{s}\right)\right]\mathrm{d}s\right|^{2}\mathrm{d}t>0 \end{array} $$
(19)
hold and the matrix
$$\begin{array}{@{}rcl@{}} \mathbb{J}_{\tau} \left(\vartheta_{0}\right)={\int\nolimits}_{0}^{\tau} {{\int\nolimits}_{0}^{t}}\dot{\mathbb{S}}\left(\vartheta_{0},s,x_{s} \right)^{*}\mathrm{d}s {{\int\nolimits}_{0}^{t}}\dot{\mathbb{S}}\left(\vartheta_{0},s,x_{s} \right)\mathrm{d}s\,\mathrm{d}t \end{array} $$
is uniformly nondegenerate (below \(\lambda \in {\mathcal {R}}^{d}\))
$$\begin{array}{@{}rcl@{}} \inf\limits_{\vartheta_{0}\in \Theta }\inf\limits_{\left|\lambda \right|=1}\lambda^{*}\mathbb{J}_{\tau} \left(\vartheta_{0}\right)\lambda >0, \end{array} $$
(20)
then the MDE is asymptotically normal
$$\begin{array}{@{}rcl@{}} \varepsilon^{-1}\left(\vartheta_{\tau} -\vartheta_{0}\right)\Longrightarrow \mathbb{J}_{\tau} \left(\vartheta_{0}\right)^{-1} {\int\nolimits}_{0}^{\tau} {{\int\nolimits}_{0}^{t}}\dot{\mathbb{S}}\left(\vartheta_{0},s,x_{s} \right)^{*}\mathrm{d}s\;{{\int\nolimits}_{0}^{t}}\sigma \left(s,x_{s} \right)^{*}\mathrm{d}W_{s}\;\mathrm{d}t. \end{array} $$
Note that if we have the Regularity condition 3 (identifiability) with T=τ, then the identifiability condition (19) is also fulfilled. Indeed, suppose that there exists 𝜗
1≠𝜗
0 such that
$$\begin{array}{@{}rcl@{}} {\int\nolimits}_{0}^{\tau }\left| {{\int\nolimits}_{0}^{t}} \left[S\left(\vartheta_{1},s,x_{s}\right)- S\left(\vartheta_{0},s,x_{s}\right)\right]\mathrm{d}s\right|^{2}\mathrm{d}t=0. \end{array} $$
Then for all t∈[0,τ]
$$\begin{array}{@{}rcl@{}} {{\int\nolimits}_{0}^{t}} S\left(\vartheta_{1},s,x_{s}\right)\mathrm{d}s= {{\int\nolimits}_{0}^{t}}S\left(\vartheta_{0},s,x_{s}\right)\mathrm{d}s, \end{array} $$
which implies
$$\begin{array}{@{}rcl@{}} S\left(\vartheta_{1},s,x_{s}\right)= S\left(\vartheta_{0},s,x_{s}\right),\qquad 0\leq s\leq \tau. \end{array} $$
The last equality, of course, contradicts Regularity condition 3.
Now suppose that τ
ε
=ε
δ with δ<1 and the matrix
$$\begin{array}{@{}rcl@{}} \mathbb{C}\left(\vartheta_{0}\right)=\dot{\mathbb{S}}\left(\vartheta_{0},0,x_{0} \right)^{*}\dot{\mathbb{S}}\left(\vartheta_{0},0,x_{0} \right) \end{array} $$
is uniformly nondegenerate in 𝜗
0∈Θ (below \(\lambda \in {\mathcal {R}}^{d}\))
$$\begin{array}{@{}rcl@{}} \inf\limits_{\vartheta \in\Theta }\inf\limits_{\left|\lambda \right|=1}\lambda^{*}\mathbb{C}\left(\vartheta_{0}\right)\lambda >0 \end{array} $$
(21)
Then, we can obtain the asymptotics
$$\begin{array}{@{}rcl@{}} \varepsilon^{-1}\left(\vartheta_{\tau} -\vartheta_{0}\right)=\frac{3}{2 \sqrt{\tau_{\varepsilon} }} \mathbb{C}\left(\vartheta_{0}\right)^{-1}\sigma \left(0,x_{0}\right){{\int\nolimits}_{0}^{1}}\left[1-r^{2}\right]\mathrm{d}W(r)\left(1+o(1)\right). \end{array} $$
Note that
$$\begin{array}{@{}rcl@{}} \mathbb{J}_{\tau_{\varepsilon} }\left(\vartheta_{0}\right)&=&{\int\nolimits}_{0}^{\tau_{\varepsilon} }t^{2} \dot{\mathbb{S}}\left(\vartheta_{0},0,x_{0} \right)^{*}\dot{\mathbb{S}}\left(\vartheta_{0},0,x_{0} \right)\mathrm{d}t \left(1+o(1)\right)\\ &=&\frac{\tau_{\varepsilon} ^{3}}{3} \dot{\mathbb{S}}\left(\vartheta_{0},0,x_{0} \right)^{*}\dot{\mathbb{S}}\left(\vartheta_{0},0,x_{0} \right) \left(1+o(1)\right) \end{array} $$
and
$$\begin{array}{@{}rcl@{}} &&{\int\nolimits}_{0}^{\tau_{\varepsilon}} {{\int\nolimits}_{0}^{t}}\dot{\mathbb{S}}\left(\vartheta_{0},s,x_{s} \right)^{*}\mathrm{d}s\;{{\int\nolimits}_{0}^{t}}\sigma \left(s,x_{s} \right)^{*}\mathrm{d}W_{s}\;\mathrm{d}t\\ &&\qquad \quad =\dot{\mathbb{S}}\left(\vartheta_{0},0,x_{0} \right)^{*}\sigma \left(0,x_{0}\right){\int\nolimits}_{0}^{\tau_{\varepsilon}} t W_{t}\;\mathrm{d}t\left(1+o(1)\right)\\ &&\qquad \quad =\frac{\tau_{\varepsilon} ^{\frac{5}{2}}}{2}\dot{\mathbb{S}}\left(\vartheta_{0},0,x_{0} \right)^{*}\sigma \left(0,x_{0}\right){{\int\nolimits}_{0}^{1}}\left(1-r^{2}\right)\; \mathrm{d}W(r)\left(1+o(1)\right). \end{array} $$
Therefore, the family of random vectors \(\varepsilon ^{-1+\frac {\delta }{2}}\left ({\bar {\vartheta }}_{\tau _{\varepsilon } }-\vartheta _{0}\right) \) is asymptotically normal. Moreover, following Kutoyants (1994) it can be shown that the moments are bounded, i.e.,
$$\begin{array}{@{}rcl@{}} \sup\limits_{\vartheta_{0}\in \textbf{K}}{\mathbf{E}}_{\vartheta_{0}}\left| \varepsilon^{-1+\frac{\delta }{2}}\left({\bar{\vartheta}}_{\tau_{\varepsilon} }-\vartheta_{0}\right) \right|^{p}<C, \end{array} $$
(22)
where the constant C=C(p)>0 does not depend on ε for all p>0.
Let us introduce the one-step MLE-process \(\vartheta _{t,\varepsilon }^{\star },\tau _{\varepsilon } \leq t\leq T \)
$$\begin{array}{@{}rcl@{}} &&\vartheta_{t,\varepsilon }^{\star}={\bar{\vartheta}}_{\tau_{\varepsilon} }\\ &&\quad +\mathbb{I}_{t}\left({\bar{\vartheta}}_{\tau_{\varepsilon} }\right)^{-1}{\int\nolimits}_{\tau _{\varepsilon} }^{t}\dot{\mathbb{S}}\left({\bar{\vartheta}}_{\tau_{\varepsilon} },s,X_{s}\right)^{*}\mathbb{A}\left(s,X_{s}\right)^{-1}\left[\mathrm{d}X_{s}-S\left({\bar{\vartheta}}_{\tau_{\varepsilon} },s,X_{s}\right)\mathrm{d}s\right]. \end{array} $$
(23)
Its properties are described in the following proposition.
Proposition 1
Let the conditions \( {\mathfrak L}, {\mathfrak R}\) be fulfilled and δ∈(0,1), then for all t∈(0,T]
$$\begin{array}{@{}rcl@{}} \varepsilon^{-1}\left(\vartheta_{t,\varepsilon }^{\star}-\vartheta_{0} \right)\Rightarrow {\mathcal{N}}\left(0,\mathbb{I}_{t}\left(\vartheta_{0 }\right)^{-1} \right) \end{array} $$
and this estimator-process is asymptotically efficient. Moreover, we have the uniform consistency, i.e., for any ν>0
$$\begin{array}{@{}rcl@{}} \lim\limits_{\varepsilon \rightarrow 0}\sup\limits_{\vartheta_{0}\in \textbf{K}}\mathbf{P}_{\vartheta _{0}}^{(\varepsilon)}\left(\sup\limits_{\tau_{\varepsilon} \leq t\leq T}\left|\vartheta_{t,\varepsilon }^{\star}-\vartheta_{0} \right|>\nu \right)=0. \end{array} $$
Proof
Note that the estimator \(\vartheta _{t,\varepsilon }^{\star } \) is defined for t∈[τ
ε
,T], but as τ
ε
→0 we obtain for any positive t the relation t>τ
ε
. □
The substitution of the observations (9) provides us the equality
$$\begin{array}{@{}rcl@{}} &&\vartheta_{t,\varepsilon }^{\star}-\vartheta_{0}={\bar{\vartheta}}_{\tau_{\varepsilon} }-\vartheta_{0}+\varepsilon \mathbb{I}_{t}\left({\bar{\vartheta}}_{\tau_{\varepsilon} }\right)^{-1}{\int\nolimits}_{\tau _{\varepsilon} }^{t}\dot{\mathbb{S}}\left({\bar{\vartheta}}_{\tau_{\varepsilon} },s,X_{s}\right)^{*}\sigma \left(s,X_{s}\right)^{-1}\mathrm{d}W_{s}\\ &&\; +\mathbb{I}_{t}\left({\bar{\vartheta}}_{\tau_{\varepsilon} }\right)^{-1}{\int\nolimits}_{\tau _{\varepsilon} }^{t}\dot{\mathbb{S}}\left({\bar{\vartheta}}_{\tau_{\varepsilon} },s,X_{s}\right)^{*}\mathbb{A}\left(s,X_{s}\right)^{-1}\left[S\left(\vartheta_{0 },s,X_{s}\right) -S\left({\bar{\vartheta}}_{\tau_{\varepsilon} },s,X_{s}\right) \right] \mathrm{d}s. \end{array} $$
Recall that the vector-process (X
s
,0≤s≤T) converges uniformly in s to the deterministic vector-function (x
s
,0≤s≤T) and the estimator \({\bar {\vartheta }}_{\tau _{\varepsilon }} \) is consistent. Therefore, we have the convergence in probability
$$\begin{array}{@{}rcl@{}} &&\mathbb{I}_{t}\left({\bar{\vartheta}}_{\tau_{\varepsilon} }\right)^{-1}{\int\nolimits}_{\tau _{\varepsilon} }^{t}\dot{\mathbb{S}}\left({\bar{\vartheta}}_{\tau_{\varepsilon} },s,X_{s}\right)^{*}\sigma \left(s,X_{s}\right)^{-1}\mathrm{d}W_{s}\\ &&\qquad \quad \longrightarrow \mathbb{I}_{t}\left(\vartheta _{0}\right)^{-1}{{\int\nolimits}_{0}^{t}}\dot{\mathbb{S}}\left(\vartheta_{0},s,x_{s}\right)^{*}\sigma \left(s,x_{s}\right)^{-1}\mathrm{d}W_{s}\sim {\mathcal{N}}\left(0,\mathbb{I}_{t}\left(\vartheta _{0 }\right)^{-1} \right). \end{array} $$
For the other terms, we first write the Taylor expansion
$$\begin{array}{@{}rcl@{}} S\left(\vartheta_{0 },s,X_{s}\right) -S\left({\bar{\vartheta}}_{\tau_{\varepsilon} },s,X_{s}\right)=\dot{\mathbb{S}}\left(\vartheta_{0},s,X_{s}\right)^{*}\left(\vartheta_{0}- {\bar{\vartheta}}_{\tau_{\varepsilon} }\right)+ O\left(\varepsilon^{2-\delta }\right) \end{array} $$
because \(\vartheta _{0}- {\bar {\vartheta }}_{\tau _{\varepsilon } }=O\left (\varepsilon ^{1-\frac {\delta }{2}}\right) \). Then, we denote
$$\begin{array}{@{}rcl@{}} \mathbb{D}_{\varepsilon} =\mathbb{I}_{t}\left({\bar{\vartheta}}_{\tau_{\varepsilon} }\right)-{\int\nolimits}_{\tau_{\varepsilon} }^{t}\dot{\mathbb{S}}\left({\bar{\vartheta}}_{\tau_{\varepsilon} },s,X_{s}\right)^{*}\mathbb{A} \left(s,X_{s}\right)^{-1}\dot{\mathbb{S}}\left(\vartheta_{0},s,X_{s}\right)^{*}\mathrm{d}s \end{array} $$
and write
$$\begin{array}{@{}rcl@{}} && {\bar{\vartheta}}_{\tau_{\varepsilon} }-\vartheta_{0} \,+\,\mathbb{I}_{t}\!\left({\bar{\vartheta}}_{\tau_{\varepsilon} }\right)^{-1}\!\!{\int\nolimits}_{\tau _{\varepsilon} }^{t}\!\dot{\mathbb{S}}\!\left({\bar{\vartheta}}_{\tau_{\varepsilon} },s,X_{s}\right)^{*}\!\mathbb{A}\left(s,X_{s}\right)^{-1}\left[S\left(\vartheta_{0 },s,X_{s}\right) -S\left({\bar{\vartheta}}_{\tau_{\varepsilon} },s,X_{s}\right) \right] \mathrm{d}s\\ &&\quad =\left({\bar{\vartheta}}_{\tau_{\varepsilon} }-\vartheta_{0}\right)\mathbb{I}_{t}\left({\bar{\vartheta}}_{\tau_{\varepsilon} }\right)^{-1}\mathbb{D}_{\varepsilon} +O\left(\varepsilon^{2-\delta }\right) \end{array} $$
The following estimate can be easily verified
$$\begin{array}{@{}rcl@{}} \mathbb{D}_{\varepsilon}=O\left(\varepsilon^{\delta} \right)+O\left(\varepsilon^{1-\frac{\delta }{2}}\right)+O(\varepsilon) \end{array} $$
because X
s
−x
s
=O(ε), \({\bar {\vartheta }}_{\tau _{\varepsilon } }-\vartheta _{0}= O\left (\varepsilon ^{1-\frac {\delta }{2}}\right)\) and
$$\begin{array}{@{}rcl@{}} \mathbb{I}_{t}\left(\vartheta_{0}\right){-}\int_{\tau_{\varepsilon} }^{t}\dot{\mathbb{S}}\left(\vartheta_{0},s,x_{s}\right)^{*}\mathbb{A}\left(s,x_{s}\right)^{-1}\dot{\mathbb{S}}\left(\vartheta_{0},s,x_{s}\right) \mathrm{d}s=O\left({\varepsilon^{\delta} }\right). \end{array} $$
Hence
$$\begin{array}{@{}rcl@{}} &&\varepsilon^{-1}\left(\vartheta_{t,\varepsilon }^{\star}-\vartheta_{0}\right)-\mathbb{I}_{t}\left(\vartheta_{0}\right)^{-1}{{\int\nolimits}_{0}^{t}}\dot{\mathbb{S}}\left(\vartheta_{0},s,x_{s}\right)^{*}\sigma \left(s,x_{s}\right)^{-1}\mathrm{d}W_{s}\\ &&\qquad \qquad =\varepsilon^{-1}\left({\bar{\vartheta}}_{\tau_{\varepsilon} }-\vartheta_{0} \right)O\left(\varepsilon^{1-\frac{\delta }{2}}\right)=O\left(\varepsilon^{1-\delta }\right) \longrightarrow 0. \end{array} $$
(24)
The uniform consistency can be shown following the proof of such uniform consistency presented in Kutoyants (2015), Theorem 1.
Let us define the estimator-processes \(Y^{\star }_{\varepsilon } =\left (Y_{t,\varepsilon }^{\star },\tau _{\varepsilon } \leq t\leq T \right)\) and \(Z^{\star }_{\varepsilon } =\left (Z_{t,\varepsilon }^{\star },\tau _{\varepsilon } \leq t\leq T \right)\) as follows
$$\begin{array}{@{}rcl@{}} Y_{t,\varepsilon }^{\star}=u\left(t,X_{t},\vartheta_{t,\varepsilon }^{\star}\right),\qquad \quad Z_{t,\varepsilon }^{\star}=\varepsilon u'_{x}\left(t,X_{t},\vartheta_{t,\varepsilon }^{\star}\right)\sigma \left(t,X_{t}\right). \end{array} $$
Theorem 2
Suppose the conditions \({\mathfrak L},{\mathfrak R},{\mathfrak U}\) and (21) hold, then the esti-mator-processes \(Y_{\varepsilon }^{\star },Z_{\varepsilon }^{\star }\) admit the representations
$$\begin{array}{@{}rcl@{}} Y_{t,\varepsilon }^{\star}&=&Y_{t}+\varepsilon \dot u_{\circ}\left(t,x_{t},\vartheta_{0}\right)^{*}\,\xi_{t}\left(\vartheta_{0}\right)\left(1+o(1) \right), \end{array} $$
(25)
$$\begin{array}{@{}rcl@{}} Z_{t,\varepsilon }^{\star}&=&Z_{t}+\varepsilon^{2}\dot u'_{\circ,x}\left(t,x_{t},\vartheta_{0}\right)^{*}\,\xi_{t}\left(\vartheta_{0}\right)\sigma \left(t,x_{t}\right)\left(1+o(1)\right), \end{array} $$
(26)
where the Gaussian process
$$\xi_{t}\left(\vartheta_{0}\right)={\mathbb{I}_{t}} \left(\vartheta_{0}\right)^{-1} {{\int\nolimits}_{0}^{t}}{\dot{\mathbb{S}}\left(\vartheta_{0},s,x_{s}\right)^{*}}{\sigma \left(s,x_{s}\right)^{-1}}\mathrm{d}W_{s},\qquad \tau _{\varepsilon} \leq t\leq T. $$
The random processes
$$\begin{array}{@{}rcl@{}} \eta_{t,\varepsilon }&=&\varepsilon^{-1} \left(Y_{t,\varepsilon }^{\star}-Y_{t}\right),\tau \leq t\leq T,\\ \zeta_{t,\varepsilon }&=&\varepsilon^{-2} \left(Z_{t,\varepsilon }^{\star}-Z_{t}\right),\tau \leq t\leq T \end{array} $$
for any τ∈(0,T] converge in probability to the processes
$$\begin{array}{@{}rcl@{}} \eta_{t}&=&\dot u_{\circ}\left(t,x_{t},\vartheta_{0}\right)^{*}\,\xi_{t}\left(\vartheta_{0}\right),\qquad \tau \leq t\leq T,\\ \zeta_{t}&=&\dot u'_{\circ,x}\left(t,x_{t},\vartheta_{0}\right)^{*}\,\xi_{t}\left(\vartheta_{0}\right)\sigma \left(t,x_{t}\right),\qquad \tau \leq t\leq T, \end{array} $$
respectively, uniformly in t∈[τ,T]. Moreover, these approximations are asymptotically efficient in the sense of (17), (18).
Proof
By the condition \({\mathfrak U}\), we obtain the representation
$$\begin{array}{@{}rcl@{}} Y_{t,\varepsilon }^{\star}-Y_{t}&=&u\left(t,X_{t},\vartheta_{t,\varepsilon }^{\star} \right)-u\left(t,X_{t},\vartheta_{0} \right)=\dot u(t,X_{t}, \vartheta_{0})^{*}\left(\vartheta_{t,\varepsilon }^{\star}-\vartheta_{0} \right)\left(1+o(1)\right),\\ Z_{t,\varepsilon}^{\star}-Z_{t}&=&\varepsilon \left[u'_{x}\left(t,X_{t},\vartheta_{t,\varepsilon }^{\star} \right)-u'_{x}\left(t,X_{t},\vartheta_{0} \right)\right]\sigma \left(t,X_{t}\right)\\ &=&\varepsilon\dot u'_{x}\left(t,X_{t}, \vartheta_{0}\right)^{*}\left(\vartheta_{t,\varepsilon }^{\star}-\vartheta_{0} \right)\sigma \left(t,X_{t}\right)\left(1+o(1)\right), \end{array} $$
and for any τ∈(0,T] we have the convergence in probability
$$\begin{array}{@{}rcl@{}} &&\sup\limits_{\tau \leq t\leq T}\left|\dot u(t,X_{t},\vartheta_{0})-\dot u_{\circ}(t,x_{t}, \vartheta_{0}) \right|\leq \sup\limits_{\tau \leq t\leq T}\left|\dot u(t,X_{t},\vartheta_{0})-\dot u(t,x_{t}, \vartheta_{0}) \right|\\ &&\qquad \qquad +\sup\limits_{\tau \leq t\leq T}\left|\dot u(t,x_{t},\vartheta_{0})-\dot u_{\circ}(t,x_{t}, \vartheta_{0}) \right|\longrightarrow 0, \end{array} $$
(27)
$$\begin{array}{@{}rcl@{}} &&\sup\limits_{\tau \leq t\leq T}\left|\dot u'_{x}(t,X_{t},\vartheta_{0})-\dot u'_{x}(t,x_{t}, \vartheta_{0}) \right|\leq \sup\limits_{\tau \leq t\leq T}\left|\dot u'_{x}(t,X_{t},\vartheta_{0})-\dot u'_{x}(t,x_{t}, \vartheta_{0}) \right|\\ &&\qquad \qquad +\sup\limits_{\tau \leq t\leq T}\left|\dot u'_{x}(t,x_{t},\vartheta_{0})-\dot u'_{\circ,x}(t,x_{t}, \vartheta_{0}) \right|\longrightarrow 0. \end{array} $$
(28)
□
Therefore, the representations (25),(26) follow now from (24).
More detailed analysis shows that the convergences O(1) in (24),(25) are uniform in t∈[τ,T] due to (11). Moreover, we have the convergence of moments uniform on compacts 𝜗
0∈K as well, because we have (12) and the moments of the preliminary estimator are bounded (22). Therefore, the estimates used above can be also written for the moments. This convergence of moments provides the asymptotic efficiency of the estimators \(Y^{\star }_{\varepsilon },Z^{\star }_{\varepsilon } \).
The estimators \(Y^{\star }_{t,\varepsilon },Z^{\star }_{t,\varepsilon },\tau _{\varepsilon } \leq t\leq T \) are given for the values t>τ
ε
=ε
δ with δ∈(0,1). It is interesting to have a shorter learning interval and, therefore, longer estimation period for Y
t
,Z
t
. That is why we propose the two-step MLE-process which uses the preliminary estimator with the worse rate of convergence. Let us take \(\delta \in [1,\frac {4}{3})\), introduce the second preliminary estimator-process
$$\begin{array}{@{}rcl@{}} {\bar{\vartheta}}_{t,\varepsilon }={\bar{\vartheta}}_{\tau_{\varepsilon} }+\mathbb{I}_{t}\left({\bar{\vartheta}}_{\tau_{\varepsilon} }\right)^{-1}\int_{\tau_{\varepsilon} }^{t}\dot{\mathbb{S}}\left({\bar{\vartheta}}_{\tau_{\varepsilon} },s,X_{s}\right)^{*}\mathbb{A}\left(s,X_{s}\right)^{-1}\left[\mathrm{d}X_{s}-S\left({\bar{\vartheta}}_{\tau_{\varepsilon} },s,X_{s}\right)\mathrm{d}s\right] \end{array} $$
and the two-step MLE-process \(\vartheta _{t,\varepsilon }^{\star \star },\tau _{\varepsilon } \leq t\leq T \)
$$\begin{array}{@{}rcl@{}} \vartheta_{t,\varepsilon }^{\star\star}&={\bar{\vartheta}}_{t,\varepsilon } +\mathbb{I}_{t}\left({\bar{\vartheta}}_{t,\varepsilon}\right)^{-1}\int_{\tau_{\varepsilon} }^{t}\dot{\mathbb{S}}\left({\bar{\vartheta}}_{\tau_{\varepsilon} },s,X_{s}\right)^{*}\mathbb{A}\left(s,X_{s}\right)^{-1}\left[\mathrm{d}X_{s}-S\left({\bar{\vartheta}}_{t,\varepsilon },s,X_{s}\right)\mathrm{d}s\right]. \end{array} $$
For the preliminary estimator we obtain the same estimate (22), but with different τ
ε
. Further, for the first preliminary estimator similar calculations as above provide us the estimates
$$\begin{array}{@{}rcl@{}} \varepsilon^{-\gamma }\left({\bar{\vartheta}}_{t,\varepsilon }-\vartheta_{0}\right)=\varepsilon^{-\gamma }\left|{\bar{\vartheta}}_{\tau_{\varepsilon} }-\vartheta_{0} \right|^{2}O(1)+o(1)=\varepsilon^{-\gamma+2-\delta }O(1) +o(1). \end{array} $$
For the two-step MLE-process we have
$$\begin{array}{@{}rcl@{}} &&\varepsilon^{-1}\left(\vartheta_{t,\varepsilon }^{\star\star}-\vartheta_{0}\right)=\varepsilon^{\gamma -\frac{\delta }{2}} \left(\varepsilon^{-\gamma }\left|{\bar{\vartheta}}_{t,\varepsilon }-\vartheta_{0}\right|\right)\;(\varepsilon^{-1+\frac{\delta }{2} }\left|{\bar{\vartheta}}_{\tau_{\varepsilon} }-\vartheta_{0} \right|)O(1)\\ &&\qquad \qquad \qquad\quad +\mathbb{I}_{t}\left(\vartheta_{0 }\right)^{-1}{\int\nolimits}_{\tau_{\varepsilon} }^{t}\dot{\mathbb{S}}\left(\vartheta_{0 },s,x_{s}\right)^{*}\mathbb{A}\left(s,x_{s}\right)^{-1}\mathrm{d}W_{s}+o(1). \end{array} $$
Therefore if we take γ such that γ+δ<2 and \(\gamma -\frac {\delta }{2}>0\), say, \(\gamma <\frac {2}{3}\), then we obtain
$$\begin{array}{@{}rcl@{}} \varepsilon^{-1}\left(\vartheta_{t,\varepsilon }^{\star\star}-\vartheta_{0}\right)\Longrightarrow {\mathcal{N}}\left(0,\mathbb{I}_{t}\left(\vartheta_{0 }\right)^{-1} \right). \end{array} $$
Now the estimator-processes \(Y^{\star \star }_{\varepsilon },Z^{\star \star }_{\varepsilon }\) defined with the help of two-step MLE-process
$$\begin{array}{@{}rcl@{}} Y_{t,\varepsilon }^{\star\star}&=&u\left(t,X_{t},\vartheta_{t,\varepsilon }^{\star\star}\right),\qquad \tau_{\varepsilon} \leq t\leq T,\\ Z_{t,\varepsilon }^{\star\star}&=&u'_{x}\left(t,X_{t},\vartheta_{t,\varepsilon }^{\star\star}\right)\sigma \left(t,X_{t}\right),\qquad \tau_{\varepsilon} \leq t\leq T \end{array} $$
are known for the larger time interval [τ
ε
,T].
Of course, we can continue this process and to reduce the learning interval once more by introducing the three-step MLE-process \(\vartheta _{t,\varepsilon }^{\star \star \star }\) as follows. The learning interval is [0,τ
ε
], τ
ε
=ε
δ, where \(\delta \in [\frac {4}{3}, \frac {3}{2})\). The first preliminary estimator-process is
$$\begin{array}{@{}rcl@{}} {\bar{\vartheta}}_{t,\varepsilon }&={\bar{\vartheta}}_{\tau_{\varepsilon} }+\mathbb{I}_{t}\left({\bar{\vartheta}}_{\tau_{\varepsilon} }\right)^{-1}{\int\nolimits}_{\tau _{\varepsilon} }^{t}\dot{\mathbb{S}}\left({\bar{\vartheta}}_{\tau_{\varepsilon} },s,X_{s}\right)^{*}\mathbb{A}\left(s,X_{s}\right)^{-1}\left[\mathrm{d}X_{s}-S\left({\bar{\vartheta}}_{\tau_{\varepsilon} },s,X_{s}\right)\mathrm{d}s\right], \end{array} $$
the second is
$$\begin{array}{@{}rcl@{}} \bar{{\bar{\vartheta}}}_{t,\varepsilon }&={\bar{\vartheta}}_{t,\varepsilon }+\mathbb{I}_{t}\left({\bar{\vartheta}}_{t,\varepsilon }\right)^{-1}{\int\nolimits}_{\tau _{\varepsilon} }^{t}\dot{\mathbb{S}}\left({\bar{\vartheta}}_{\tau_{\varepsilon} },s,X_{s}\right)^{*}\mathbb{A}\left(s,X_{s}\right)^{-1}\left[\mathrm{d}X_{s}-S\left({\bar{\vartheta}}_{t,\varepsilon },s,X_{s}\right)\mathrm{d}s\right], \end{array} $$
and the three-step MLE-process
$$\begin{array}{@{}rcl@{}} {\vartheta}_{t,\varepsilon }^{\star\star\star}&=\bar{{\bar{\vartheta}}}_{t,\varepsilon }+\mathbb{I}_{t}(\bar{{\bar{\vartheta}}}_{t,\varepsilon })^{-1}\int_{\tau _{\varepsilon} }^{t}\dot{\mathbb{S}}\left({\bar{\vartheta}}_{\tau_{\varepsilon} },s,X_{s}\right)^{*}\mathbb{A}\left(s,X_{s}\right)^{-1}\left[\mathrm{d}X_{s}-S(\bar{{\bar{\vartheta}}}_{t,\varepsilon },s,X_{s})\mathrm{d}s\right]. \end{array} $$
The similar calculations will provide us the relations : \({\bar {\vartheta }}_{\tau _{\varepsilon } }-\vartheta _{0}=\varepsilon ^{1-\frac {\delta }{2}}O(1) \),
$$\begin{array}{@{}rcl@{}} \varepsilon^{-\gamma_{1}}\left({\bar{\vartheta}}_{t,\varepsilon }-\vartheta_{0}\right)=\varepsilon^{2-\gamma_{1}-{\delta }} O(1),\qquad \quad \varepsilon^{-\gamma_{2}}\left(\bar{{\bar{\vartheta}}}_{t,\varepsilon }-\vartheta_{0}\right)=\varepsilon^{-\gamma_{2}+\gamma_{1}+1-\frac{\delta }{2}} O(1), \end{array} $$
and
$$\begin{array}{@{}rcl@{}} &&\varepsilon^{-1}\left(\vartheta_{t,\varepsilon }^{\star\star\star}-\vartheta_{0}\right)=\varepsilon^{\gamma_{2} -\frac{\delta }{2}} \left(\varepsilon^{-\gamma_{2} }\left|\bar{{\bar{\vartheta}}}_{t,\varepsilon }-\vartheta_{0}\right|\right)\;(\varepsilon^{-1+\frac{\delta }{2} }\left|{\bar{\vartheta}}_{\tau_{\varepsilon} }-\vartheta_{0} \right|)O(1)\\ &&\qquad \qquad \qquad +\mathbb{I}_{t}\left(\vartheta_{0 }\right)^{-1}{\int\nolimits}_{\tau_{\varepsilon} }^{t}\dot{\mathbb{S}}\left(\vartheta_{0 },s,x_{s}\right)^{*}\mathbb{A}\left(s,x_{s}\right)^{-1}\mathrm{d}W_{s}+o(1). \end{array} $$
Hence if we chose δ, γ
1 and γ
2 such that
$$\begin{array}{@{}rcl@{}} \delta <2,\qquad \gamma_{1}+\delta <2,\qquad \gamma_{1}-\gamma_{2}+1-\frac{\delta }{2}>0,\quad \gamma_{2}>\frac{\delta }{2}, \end{array} $$
then once more we obtain asymptotically efficient MLE-process
$$\begin{array}{@{}rcl@{}} \varepsilon^{-1}\left(\vartheta_{t,\varepsilon }^{\star\star\star}-\vartheta_{0}\right)\Longrightarrow {\mathcal{N}}\left(0,\mathbb{I}_{t}\left(\vartheta_{0 }\right)^{-1} \right). \end{array} $$
Therefore, we obtain the corresponding approximations \(Y_{t,\varepsilon }^{\star \star \star },Z_{t,\varepsilon }^{\star \star \star }\) for the values t∈[τ
ε
,T] with essentially smaller τ
ε
than in the case of one-step MLE-process.