
Moderate deviation for maximum likelihood estimators from single server queues


Consider a single server queueing model which is observed over a continuous time interval (0,T], where T is determined by a suitable stopping rule. Let θ be the unknown parameter of the arrival process and \(\hat {\theta }_{T}\) be the maximum likelihood estimator of θ. The main goal of this paper is to obtain a moderate deviation result for the maximum likelihood estimator in the single server queueing model under certain regularity conditions.


Statistical inference for queueing models has come a long way in the past sixty years. A key tool for estimating queueing parameters is maximum likelihood estimation, which has been discussed extensively in the literature. The first theoretical treatment of the estimation problem was given by Clarke (1957), who derived maximum likelihood estimates of arrival and service rates in an M/M/1 queueing system. Billingsley’s (1961) treatment of inference in Markov processes in general and Wolff’s (1965) derivation of likelihood ratio tests and maximum likelihood estimates for queues that can be modeled as birth and death processes are other significant advances in this area. The papers by Cox (1965) and Goyal and Harris (1972) are also worth mentioning. Since then, significant progress has occurred in adapting statistical procedures to various systems. Basawa and Prabhu (1981, 1988) studied asymptotic inference for single server queues, proving the consistency and asymptotic normality of maximum likelihood estimators in the GI/G/1 queue, and Acharya (1999) found their rate of convergence. Recently, Acharya and Singh (2019) studied the asymptotic properties of the maximum likelihood estimator from single server queues using the martingale technique. Singh and Acharya (2019) discussed the bound for the equivalence of the Bayes and maximum likelihood estimators and also obtained bounds on the difference between the Bayes estimators and the true values of the arrival and service rate parameters in an M/M/1 queue.

There has been recent interest in studying the rate of convergence of the maximum likelihood estimator. Gao (2001) proved moderate deviation results for the maximum likelihood estimator in the case of independent and identically distributed observations, and Xiao and Liu (2006) did so for independent but not identically distributed observations. Miao and Chen (2010) gave a simpler proof of these results under weaker conditions using the Gärtner-Ellis theorem (cf. Dembo and Zeitouni (1998), Theorem 2.3.6, page 44). Miao and Wang (2014) improved the result in Miao and Chen (2010) by weakening the exponential integrability condition.

Our main aim in this paper is to study the problem of moderate deviations for the maximum likelihood estimator in the single server GI/G/1 queueing model. Section 2 describes the model and some properties of the maximum likelihood estimator. The main results are given in Section 3. In Section 4, we provide examples to illustrate our results.

GI/G/1 queueing model

Consider a single server queueing system in which the interarrival times \(\{u_{k}, k\geq 1\}\) and the service times \(\{v_{k}, k\geq 1\}\) are two independent sequences of independent and identically distributed nonnegative random variables with densities f(u;θ) and g(v;ϕ), respectively, where θ and ϕ are unknown parameters. Let us assume that f and g belong to the continuous exponential families given by

$$ f(u; \theta)= a_{1}(u) \text{exp} \{\theta h_{1}(u)- k_{1}(\theta)\}, $$
$$ g(v; \phi) = a_{2}(v) \text{exp}\{\phi h_{2}(v)- k_{2}(\phi)\}. $$


with

$$f(u; \theta)= g(v; \phi)=0 \quad \text{on} \quad (-\infty, 0), $$

where \(\Theta_{1}=\{\theta>0: k_{1}(\theta)<\infty\}\) and \(\Theta_{2}=\{\phi>0: k_{2}(\phi)<\infty\}\) are open subsets of \(\mathbb {R}\). It is easy to see that \(k_{1}^{\prime }(\theta)=E_{\theta}(h_{1}(u))\), \(\sigma _{1}^{2}=\sigma _{1}^{2}(\theta)=k_{1}^{\prime \prime }(\theta)={var}_{\theta }(h_{1}(u))\), \(k_{2}^{\prime }(\phi)=E_{\phi}(h_{2}(v))\), and \(\sigma _{2}^{2}=\sigma _{2}^{2}(\phi)=k_{2}^{\prime \prime }(\phi)={var}_{\phi }(h_{2}(v))\) are, respectively, the means and variances of \(h_{1}(u)\) and \(h_{2}(v)\), which are assumed to be finite. Here, primes denote derivatives with respect to the parameters.
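As an illustration of this notation (a worked example added here, not part of the original derivation), the exponential interarrival density with rate θ can be written in the exponential family form above by taking \(a_{1}(u)=1\), \(h_{1}(u)=-u\) and \(k_{1}(\theta)=-\text{log}~\theta\). The moment identities can then be checked directly:

$$ \begin{aligned} f(u;\theta) &= \theta e^{-\theta u} = \text{exp}\{\theta (-u) - (-\text{log}~\theta)\}, \\ k_{1}^{\prime}(\theta) &= -\frac{1}{\theta} = E_{\theta}(-u), \qquad k_{1}^{\prime\prime}(\theta) = \frac{1}{\theta^{2}} = {var}_{\theta}(-u). \end{aligned} $$

The same choice of \(h_{1}\) with \(a_{1}(u)=u^{k-1}/(k-1)!\) and \(k_{1}(\theta)=-k~\text{log}~\theta\) covers Erlang interarrival times with k phases.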

For simplicity, we assume that the initial customer arrives at time t=0. Our sampling scheme is to observe the system over a continuous time interval [0,T], where T is a suitable stopping time. The sample data consists of

$$ \{A(T), D(T), u_{1}, u_{2},\cdots \cdots, u_{A(T)}, v_{1}, v_{2},\cdots \cdots, v_{D(T)} \}, $$

where A(T) is the number of arrivals and D(T) is the number of departures during (0,T]. Obviously, no arrivals occur during \(\left [\sum _{i=1}^{A(T)} u_{i}, T\right ]\) and no departures during \(\left [\gamma (T)+\sum _{i=1}^{D(T)}v_{i}, T\right ]\), where γ(T) is the total idle period in (0,T].

Some possible stopping rules to determine T are given below:

  1. Rule 1.

    Observe the system until a fixed time t. Here, T=t with probability one and A(T) and D(T) are both random variables.

  2. Rule 2.

    Observe the system until d departures have occurred so that D(T)=d. Here, \(T=\gamma(T)+v_{1}+v_{2}+\cdots+v_{d}\) and A(T) are random variables.

  3. Rule 3.

    Observe the system until m arrivals take place so that A(T)=m. Here, \(T=u_{1}+u_{2}+u_{3}+\cdots+u_{m}\) and D(T) are random variables.

  4. Rule 4.

    Stop at the nth transition epoch. Here, T,A(T) and D(T) are all random variables and A(T)+D(T)=n.

Under Rule 4, we stop either with an arrival or with a departure. If we stop with an arrival, then \(\sum _{i=1}^{A(T)} u_{i}=T\) and there are no departures during \(\left [\gamma (T)+\sum _{i=1}^{D(T)}v_{i}, T\right ]\). Similarly, if we stop with a departure, then \(\gamma (T)+\sum _{i=1}^{D(T)}v_{i}=T\) and there are no arrivals during \(\left [\sum _{i=1}^{A(T)} u_{i}, T\right ]\).
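To make the sampling scheme concrete, the following minimal sketch simulates a first-come-first-served single server queue under Rule 1 and collects the sample data. It assumes exponential interarrival and service times (the M/M/1 special case); the function name `simulate_queue_rule1` and the returned dictionary keys are hypothetical choices made for this illustration.

```python
import random


def simulate_queue_rule1(theta, phi, t_end, seed=0):
    """Simulate a FIFO single server queue on (0, t_end] and return the sample data.

    Assumptions of this sketch: exponential interarrival times (rate theta),
    exponential service times (rate phi), and an initial customer at time 0.
    """
    rng = random.Random(seed)

    # Arrival epochs: the initial customer at 0, then u_1, u_1 + u_2, ...
    arrivals = [0.0]
    u = []  # fully observed interarrival times u_1, ..., u_{A(T)}
    while True:
        gap = rng.expovariate(theta)
        if arrivals[-1] + gap > t_end:
            break  # this interarrival time is censored at t_end
        u.append(gap)
        arrivals.append(arrivals[-1] + gap)

    v = []          # completed service times v_1, ..., v_{D(T)}
    idle = 0.0      # total idle period gamma(T)
    free_at = 0.0   # epoch at which the server next becomes free
    for a in arrivals:
        start = max(a, free_at)
        idle += start - free_at  # positive only if the server waited for this arrival
        service = rng.expovariate(phi)
        if start + service > t_end:
            break  # service still in progress at t_end, so no further idle time
        v.append(service)
        free_at = start + service
    else:
        idle += t_end - free_at  # every customer was served before t_end

    return {"A": len(u), "D": len(v), "u": u, "v": v, "idle": idle}


if __name__ == "__main__":
    data = simulate_queue_rule1(theta=1.0, phi=1.5, t_end=50.0, seed=1)
    print(data["A"], data["D"], round(data["idle"], 3))
```

In this sketch the quantity `t_end - idle - sum(v)` is the elapsed part of any service still in progress at the end of the observation period, which is why no departure can occur in the final stretch of the interval.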

The likelihood function based on data (3) is given by

$$ \begin{aligned} L_{T}(\theta, \phi) & = \prod_{i=1}^{A(T)} f(u_{i},\theta)\prod_{i=1}^{D(T)} g(v_{i},\phi)\\ & \times\left[1-F_{\theta}[T-\sum_{i=1}^{A(T)} u_{i}]\right]\left[1-G_{\phi}[T-\gamma(T)-\sum_{i=1}^{D(T)}v_{i}]\right], \end{aligned} $$

where \(F_{\theta}\) and \(G_{\phi}\) are the distribution functions corresponding to the densities f and g, respectively. The likelihood function \(L_{T}(\theta,\phi)\) remains valid under all the stopping rules.

The approximate likelihood \(L_{T}^{a}(\theta,\phi)\) is defined as

$$ L_{T}^{a}(\theta,\phi) = \prod_{i=1}^{A(T)} f(u_{i},\theta)\prod_{i=1}^{D(T)} g(v_{i},\phi). $$

The maximum likelihood estimates obtained from (5) are asymptotically equivalent to those obtained from (4) provided that the following two conditions are satisfied for T:

$$ \left(A(T)\right)^{-1/2} \frac{\partial}{\partial \theta} \text{log} \left[ 1-F_{\theta}\left(T- \sum_{i=1}^{A(T)}u_{i}\right)\right] \stackrel {p} \longrightarrow 0 $$

and

$$ \left(D(T)\right)^{-1/2} \frac{\partial}{\partial \phi} \text{log} \left[ 1-G_{\phi}\left(T - \gamma(T)- \sum_{i=1}^{D(T)}v_{i}\right)\right] \stackrel {p} \longrightarrow 0. $$

The implications of these conditions have been explained by Basawa and Prabhu (1988). From (5), we have the log-likelihood function

$$ \begin{aligned} \ell_{T}(\theta, \phi)=\text{log}L_{T}^{a}(\theta,\phi)=& \ell_{T}(u_{1},u_{2},...,u_{A(T)}; \theta) + \ell_{T}(v_{1}, v_{2},...,v_{D(T)}; \phi)\\ =& \sum_{i=1}^{A(T)}\ell(u_{i}; \theta)+ \sum_{i=1}^{D(T)} \ell(v_{i}; \phi) \end{aligned} $$


where

$$\ell(u_{i}; \theta)=\text{log}f(u_{i}; \theta)= \text{log}a_{1}(u_{i}) + \theta h_{1}(u_{i}) - k_{1}(\theta) $$

and

$$\ell(v_{i}; \phi)= \text{log}g(v_{i}; \phi)= \text{log} a_{2}(v_{i}) + \phi h_{2}(v_{i}) - k_{2}(\phi). $$

We denote

$$\ell_{T}^{(1)}(u_{1}, u_{2},...,u_{A(T)}; \theta) = \frac{\partial \ell_{T}(u_{1}, u_{2},...,u_{A(T)}; \theta)}{\partial \theta}, \quad \ell^{(1)}(u_{i}; \theta)=\frac{\partial \text{log}f(u_{i}; \theta)}{\partial \theta} $$

and

$$\ell_{T}^{(1)}(v_{1}, v_{2},...,v_{D(T)}; \phi) = \frac{\partial \ell_{T}(v_{1}, v_{2},...,v_{D(T)}; \phi)}{\partial \phi}, \quad \ell^{(1)}(v_{i}; \phi)=\frac{\partial \text{log}g(v_{i}; \phi)}{\partial \phi}. $$

The maximum likelihood estimators \(\hat {\theta }_{T}=\hat {\theta }_{T}(u_{1}, u_{2},...,u_{A(T)})\) and \(\hat {\phi }_{T}=\hat {\phi }_{T}(v_{1}, v_{2},...,v_{D(T)})\) of θ and ϕ are given by

$$\begin{array}{*{20}l} \hat{\theta}_{T}= \eta_{1}^{-1} \left[(A(T))^{-1}\sum_{i=1}^{A(T)}h_{1}(u_{i})\right], \end{array} $$
$$\begin{array}{*{20}l} \hat{\phi}_{T}= \eta_{2}^{-1}\left[(D(T))^{-1}\sum_{i=1}^{D(T)}h_{2}(v_{i})\right], \end{array} $$

respectively, where \(\eta _{i}^{-1}(\cdot)\) denotes the inverse function of \(\eta_{i}(\cdot)\) for i=1,2, with

$$\eta_{1}(\theta)=E(h_{1}(u)) = k_{1}^{'}(\theta) $$

and

$$\eta_{2}(\phi)=E(h_{2}(v)) = k_{2}^{'}(\phi). $$
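The inversion of \(\eta_{1}\) can be carried out numerically when no closed form is available. The sketch below is one possible way to do this, assuming that \(k_{1}^{\prime}\) is increasing on a bracketing interval (which holds when \(k_{1}^{\prime\prime}>0\)); the helper names `invert_increasing` and `mle_natural_parameter` are hypothetical. The example uses the exponential case with \(h_{1}(u)=-u\) and \(k_{1}^{\prime}(\theta)=-1/\theta\), for which the answer is known to be the reciprocal of the sample mean.

```python
def invert_increasing(eta, target, lo, hi, tol=1e-12, max_iter=200):
    """Solve eta(theta) = target by bisection, assuming eta is increasing on [lo, hi]."""
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if eta(mid) < target:
            lo = mid  # root lies to the right of mid
        else:
            hi = mid  # root lies at or to the left of mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def mle_natural_parameter(h_values, k_prime, lo, hi):
    """Return eta^{-1}(mean of h), i.e. the root of k_prime(theta) = mean(h_values)."""
    mean_h = sum(h_values) / len(h_values)
    return invert_increasing(k_prime, mean_h, lo, hi)


# Exponential interarrival times: h_1(u) = -u and k_1'(theta) = -1 / theta.
u = [0.5, 1.5, 1.0, 2.0]
theta_hat = mle_natural_parameter([-x for x in u], lambda th: -1.0 / th, 1e-6, 1e6)
print(theta_hat)  # close to len(u) / sum(u)
```

Bisection is used here only because it needs nothing beyond monotonicity; any standard root finder would serve equally well.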

Since \(\ell_{T}(v_{1}, v_{2},...,v_{D(T)}; \phi)\), \(\ell(v_{i}; \phi)\), and \(\hat {\phi }_{T}\) are of the same form as their counterparts for θ, hereafter we deal only with the arrival process (that is, with the parameter θ). The same arguments apply to the departure process.


Let

$$\underline{\theta}_{T}=\underline{\theta}_{T}(u_{1}, u_{2},..,u_{A(T)}) = \inf\left\{ \theta \in \Theta: \ell^{(1)}_{T}(u_{1}, u_{2},..,u_{A(T)}; \theta) \leq 0\right\} $$

and

$$\bar{\theta}_{T}=\bar{\theta}_{T}(u_{1}, u_{2},..,u_{A(T)}) = \sup\left\{ \theta \in \Theta: \ell^{(1)}_{T}(u_{1}, u_{2},..,u_{A(T)}; \theta) \geq 0\right\}. $$

Then

$$\underline{\theta}_{T}(u_{1}, u_{2},..,u_{A(T)}) \leq \hat{\theta}_{T}(u_{1}, u_{2},..,u_{A(T)}) \leq \bar{\theta}_{T}(u_{1}, u_{2},..,u_{A(T)}) $$

and, for every ε>0,

$$P_{\theta}(\underline{\theta}_{T} \geq \theta +\varepsilon) \leq P_{\theta}\left(\ell_{T}^{(1)}(u_{1}, u_{2},..,u_{A(T)}; \theta+\varepsilon) \geq 0\right) \leq P_{\theta}(\bar{\theta}_{T} \geq \theta +\varepsilon) $$

and

$$P_{\theta}(\bar{\theta}_{T} \leq \theta - \varepsilon) \leq P_{\theta}\left(\ell_{T}^{(1)}(u_{1}, u_{2},..,u_{A(T)}; \theta-\varepsilon) \leq 0\right) \leq P_{\theta}(\underline{\theta}_{T} \leq \theta - \varepsilon). $$

We assume that the following conditions hold:

  1. (C1)

    For each \(\theta \in \Theta\), the derivatives

    $$\ell^{(i)}(u; \theta)= \frac{\partial^{i} \text{log} f(u; \theta)}{\partial \theta^{i}}, ~i=1, 2, 3 $$

    exist for all \(u \in R\).

  2. (C2)

    For each \(\theta \in \Theta\), there exist a neighbourhood N(θ,δ) of θ for some δ>0 and non-negative measurable functions \(A_{i}(u;\theta)\), i=1,2,3, such that

    $$\underset{u\in R}{\sup} {\int_{R}} {A_{i}^{3}(u; \theta) f(u; \theta) du} < \infty, ~i=1,2,3 $$

    and

    $$\underset{{\theta^{\prime}} \in N(\theta, \delta)}{\sup} |\ell^{(i)}(u; {\theta^{\prime}})| \leq A_{i}(u; \theta), ~i=1,2,3. $$
  3. (C3)

    For each \(\theta \in \Theta\), the probability density function f(u;θ) has finite Fisher information, that is,

    $$0 \leq I(\theta)=E_{\theta}\left[ \left(\frac{\partial \text{log} f(u; \theta)}{\partial \theta}\right)^{2} \right] < \infty $$

    for all \(u \in R\).

  4. (C4)

    For each \(\theta \in \Theta\), there exist μ=μ(θ)>0 and ν=ν(θ)>0 such that

    $$\underset{(x, \varepsilon) \in [-\mu, \mu] \times [-\nu, \nu]}{\sup} \phi(x; \theta, \varepsilon) < \infty, $$

    where

    $$\phi(x; \theta, \varepsilon)= \underset{u}{\sup} E_{\theta} \left[ \text{exp} (x \ell^{(1)} (u_{1}; \theta+\varepsilon)) \right]. $$

Under conditions (C1)−(C3), it can be easily seen that, for all i≥1,

$$ E_{\theta}(\ell^{(1)}(u_{i}; \theta))=0 $$

and

$$ E_{\theta}(\ell^{(2)}(u_{i}; \theta))= -E_{\theta}[(\ell^{(1)}(u_{i}; \theta))^{2}]= -I(\theta). $$

Main results

In this section, we study the problem of moderate deviation, i.e., the rate of convergence of the probability \(P_{\theta }(\lambda (T)|\hat {\theta }_{T} - \theta | \geq \varepsilon)\), where

$$\lambda(T)\uparrow +\infty, \quad \frac{\lambda(T)}{\sqrt{A(T)}} \downarrow 0 \quad \text{as} \quad T \rightarrow \infty. $$

Theorem 1

If conditions (C1) to (C4) hold, then

$$ \underset{T \rightarrow \infty}{\liminf} \frac{\lambda^{2}(T)}{A(T)} \text{log}P_{\theta} \left(\lambda(T)(\bar{\theta}_{T} - \theta) \geq \varepsilon \right) \geq - \frac{1}{2} I(\theta)\varepsilon^{2}, $$
$$ \underset{T \rightarrow \infty}{\liminf} \frac{\lambda^{2}(T)}{A(T)} \text{log}P_{\theta} \left(\lambda(T)(\underline{\theta}_{T} - \theta) \leq -\varepsilon \right) \geq - \frac{1}{2} I(\theta)\varepsilon^{2}, $$
$$ \underset{T \rightarrow \infty}{\limsup} \frac{\lambda^{2}(T)}{A(T)} \text{log}P_{\theta} \left(\lambda(T)(\underline{\theta}_{T} - \theta) \geq \varepsilon \right) \leq - \frac{1}{2} I(\theta)\varepsilon^{2}, $$


$$ \underset{T \rightarrow \infty}{\limsup} \frac{\lambda^{2}(T)}{A(T)} \text{log}P_{\theta} \left(\lambda(T)(\bar{\theta}_{T} - \theta) \leq -\varepsilon \right) \leq - \frac{1}{2} I(\theta)\varepsilon^{2}. $$


Moreover,

$$ \underset{T \rightarrow \infty}{\lim} \frac{\lambda^{2}(T)}{A(T)} \text{log}P_{\theta} \left(\lambda(T)| \hat{\theta}_{T} - \theta | \geq \varepsilon \right) = - \frac{1}{2} I(\theta)\varepsilon^{2}. $$

Theorem 2

If conditions (C1) to (C4) hold, then, for any closed subset \(\mathcal {F} \subset \Theta \), we have

$$ \underset{T \rightarrow \infty}{\limsup} \frac{\lambda^{2}(T)}{A(T)} \text{log}P_{\theta} \left(\lambda(T)(\hat{\theta}_{T} - \theta) \in \mathcal{F} \right) \leq - \frac{1}{2} I(\theta) \inf\limits_{\omega\in \mathcal{F}} \omega^{2}, $$

and, for any open subset \( \mathcal {G} \subset \Theta \),

$$ \underset{T \rightarrow \infty}{\liminf} \frac{\lambda^{2}(T)}{A(T)} \text{log}P_{\theta} \left(\lambda(T)(\hat{\theta}_{T} - \theta) \in \mathcal{G} \right) \geq - \frac{1}{2} I(\theta) \underset{\omega\in \mathcal{G}}{\inf} \omega^{2}, $$

and, for any ε>0,

$$ \underset{T \rightarrow \infty}{\lim} \frac{\lambda^{2}(T)}{A(T)} \text{log}P_{\theta} \left(\lambda(T)| \hat{\theta}_{T} - \theta | \geq \varepsilon \right) = - \frac{1}{2} I(\theta) \varepsilon^{2}. $$
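A numerical sanity check of this kind of limit is possible in a special case where the probability is available exactly. The sketch below assumes exponential interarrival times with rate θ observed under Rule 3, so that A(T)=m is fixed, \(\hat{\theta}_{T}=m/\sum u_{i}\), and \(\sum u_{i}\) has a gamma distribution whose distribution function can be written as a Poisson tail. It evaluates the one-sided quantity \(\frac{\lambda^{2}}{m}\text{log}P_{\theta}(\lambda(\hat{\theta}_{T}-\theta)\geq \varepsilon)\) for the assumed speed \(\lambda=m^{1/4}\). The function names are hypothetical, and the agreement with \(-\frac{1}{2}I(\theta)\varepsilon^{2}\) at finite m is only approximate.

```python
import math


def log_poisson_upper_tail(mean, m, extra_terms=5000):
    """Return log P(N >= m) for N ~ Poisson(mean), summing the terms in log space.

    The series is truncated after `extra_terms` terms, which is adequate when
    mean < m because the terms then decay geometrically.
    """
    logs = [-mean + j * math.log(mean) - math.lgamma(j + 1)
            for j in range(m, m + extra_terms)]
    peak = max(logs)
    return peak + math.log(sum(math.exp(v - peak) for v in logs))


def scaled_log_probability(theta, eps, m, lam):
    """(lam^2 / m) * log P(lam * (theta_hat - theta) >= eps) under Rule 3."""
    # theta_hat = m / S with S ~ Gamma(m, rate theta), so the event is
    # S <= m / (theta + eps / lam), and P(S <= s) = P(Poisson(theta * s) >= m).
    s = m / (theta + eps / lam)
    return (lam ** 2 / m) * log_poisson_upper_tail(theta * s, m)


if __name__ == "__main__":
    theta, eps = 1.0, 1.0
    limit = -0.5 * eps ** 2 / theta ** 2  # -(1/2) I(theta) eps^2 with I = 1 / theta^2
    for m in (10 ** 3, 10 ** 4, 10 ** 5):
        lam = m ** 0.25  # assumed speed: lam -> infinity while lam / sqrt(m) -> 0
        print(m, round(scaled_log_probability(theta, eps, m, lam), 4), limit)
```

Working with the exact gamma/Poisson identity avoids Monte Carlo simulation, which would be impractical here because the probabilities involved are extremely small.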

To prove the above theorems, we need the following key lemma.

Lemma 1

$$ \underset{T \rightarrow \infty}{\lim} \frac{\lambda^{2}(T)}{A(T)} \text{log}P_{\theta} \left(\ell_{T}^{(1)}(u_{1}, u_{2},...,u_{A(T)}; \theta + \frac{\varepsilon}{\lambda(T)}) \geq 0 \right) = - \frac{1}{2} I(\theta)\varepsilon^{2} $$

and

$$ \underset{T \rightarrow \infty}{\lim} \frac{\lambda^{2}(T)}{A(T)} \text{log}P_{\theta} \left(\ell_{T}^{(1)}(u_{1}, u_{2},...,u_{A(T)}; \theta - \frac{\varepsilon}{\lambda(T)}) \leq 0 \right) = - \frac{1}{2} I(\theta)\varepsilon^{2}. $$


Proof From Taylor’s expansion of \(\ell^{(1)}(u;\cdot)\) about θ within the neighbourhood N(θ,δ), we have, for γ∈N(θ,δ),

$$\sup_{u \in R} \left|\ell^{(1)}(u; \gamma) - \ell^{(1)}(u; \theta) - (\gamma -\theta)\ell^{(2)}(u; \theta) \right| \leq \frac{1}{2} (\gamma -\theta)^{2} A_{3}(u; \theta). $$

Hence, for any i≥1,

$$\left |\ell^{(1)} \left(u_{i}; \theta + \frac{\varepsilon}{\lambda(T)} \right) - \ell^{(1)} (u_{i}; \theta) - \frac{\varepsilon}{\lambda(T)} \ell^{(2)}(u_{i}; \theta) \right| \leq \frac{\varepsilon^{2}}{2\lambda^{2}(T)} A_{3}(u_{i}; \theta). $$

Thus, by the condition (C2) and the Eqs. (9) and (10), it follows that

$$ \begin{aligned} {E_{\theta}} \left[{\ell^{(1)}} \left(u_{i}; \theta + \frac{\varepsilon}{\lambda(T)} \right) \right] &\,=\, E_{\theta}\left[{\ell^{(1)}} (u_{i}; \theta)\right] \,+\, \frac{\varepsilon}{\lambda(T)} E_{\theta}\left[{\ell^{(2)}}(u_{i}; \theta)\right] +o\left(\frac{1}{\lambda(T)}\right)\\ &= -I(\theta) \frac{\varepsilon}{\lambda(T)}+o\left(\frac{1}{\lambda(T)}\right). \end{aligned} $$

Therefore, it follows that

$$ \begin{aligned} P_{\theta} &\left(\ell_{T}^{(1)}\left(u_{1}, u_{2},...,u_{A(T)}; \theta + \frac{\varepsilon}{\lambda(T)} \right) \geq 0 \right) \\ &=P_{\theta} \left[ \frac{\lambda(T)}{A(T)} \sum_{i=1}^{A(T)} \left(\ell^{(1)}\left(u_{i}; \theta + \frac{\varepsilon}{\lambda(T)}\right) - E_{\theta} \left(\ell^{(1)} \left(u_{i}; \theta + \frac{\varepsilon}{\lambda(T)} \right) \right) \right) \right.\\ & \quad \quad \geq \left.- \frac{\lambda(T)}{A(T)} \sum_{i=1}^{A(T)} E_{\theta} \left(\ell^{(1)} \left(u_{i}; \theta + \frac{\varepsilon}{\lambda(T)}\right) \right) \right] \\ &=P_{\theta} \left[\frac{\lambda(T)}{A(T)} \sum_{i=1}^{A(T)} \left(\ell^{(1)} \left(u_{i}; \theta + \frac{\varepsilon}{\lambda(T)} \right) - E_{\theta} \left(\ell^{(1)} \left(u_{i}; \theta + \frac{\varepsilon}{\lambda(T)} \right) \right) \right) \right. \\ & \quad \quad \left.{\phantom{\sum_{i=1}^{A(T)}}}\geq I(\theta)\varepsilon +o(1) \right] \quad \quad (\text{using equation}\ (21)). \end{aligned} $$

We now compute the Cramér functional

$${\begin{aligned} &{\lim}_{T \rightarrow \infty} \frac{\lambda^{2}(T)}{A(T)} \text{log}E_{\theta} \left\{\text{exp} \left(\frac{x}{\lambda(T)} {\sum_{i=1}^{A(T)}} \left[{\ell^{(1)}}\left(u_{i}; \theta + \frac{\varepsilon}{\lambda(T)} \right)\right.\right.\right. \\&\left. \left. \left. \quad \quad \quad- {E_{\theta}}\left({\ell^{(1)}} \left(u_{i}; \theta + \frac{\varepsilon}{\lambda(T)} \right) \right) \right] {\vphantom{\frac{x}{\lambda(T)} {\sum_{i=1}^{A(T)}}}}\right) \right\} \end{aligned}} $$

for any \(x \in R\). Applying Taylor’s expansion, the condition (C4), and Eqs. (9), (10), (21), it follows that, for every i≥1 and \(x \in R\),

$$ \begin{aligned} E_{\theta}&\left\{\text{exp} \left(\frac{x}{\lambda(T)} \ell^{(1)} \left(u_{i}; \theta + \frac{\varepsilon}{\lambda(T)}\right)\right)\right\} \\ = &1+ \frac{x}{\lambda(T)} E_{\theta} \left[\ell^{(1)} \left(u_{i}; \theta + \frac{\varepsilon}{\lambda(T)} \right) \right] \\ & \quad + \frac{x^{2}}{2\lambda^{2}(T)} E_{\theta}\left(\left[ \ell^{(1)}\left(u_{i}; \theta + \frac{\varepsilon}{\lambda(T)}\right) \right]^{2} \right) +o \left(\frac{1}{\lambda^{2}(T)} \right) \\ =& 1 + \frac{1}{\lambda^{2}(T)} \left(\frac{x^{2}}{2} I(\theta) - x \varepsilon I(\theta) \right) + o\left(\frac{1}{\lambda^{2}(T)} \right). \end{aligned} $$

Thus, it follows that

$$ \begin{aligned} \text{log}& E_{\theta} \text{exp} \left(\frac{x}{\lambda(T)} \left[\sum_{i=1}^{A(T)} \ell^{(1)} \left(u_{i}; \theta +\frac{\varepsilon}{\lambda(T)} \right) \,-\, \sum_{i=1}^{A(T)} E_{\theta} \left(\ell^{(1)} \left(u_{i}; \theta +\frac{\varepsilon}{\lambda(T)}\right)\right) \right] \right) \\ =& \text{log} \left[E_{\theta} \left\{\text{exp} \left(\frac{x}{\lambda(T)} \sum_{i=1}^{A(T)} \ell^{(1)}\left(u_{i}; \theta+\frac{\varepsilon}{\lambda(T)}\right) \right) \right\} \right] \\ & \quad \quad \quad + \left(-\frac{x}{\lambda(T)} \sum_{i=1}^{A(T)} E_{\theta} \left(\ell^{(1)}\left(u_{i}; \theta+\frac{\varepsilon}{\lambda(T)}\right) \right) \right) \\ =& \text{log} \left[\prod_{i=1}^{A(T)}E_{\theta} \left\{\text{exp} \left(\frac{x}{\lambda(T)} \ell^{(1)}\left(u_{i}; \theta+\frac{\varepsilon}{\lambda(T)}\right) \right) \right\} \right] \\ & \quad \quad \quad + \left(-\frac{x}{\lambda(T)} \sum_{i=1}^{A(T)} E_{\theta} \left(\ell^{(1)}\left(u_{i}; \theta+\frac{\varepsilon}{\lambda(T)}\right) \right) \right) \\ =& \sum_{i=1}^{A(T)} \text{log} \left[E_{\theta} \left\{\text{exp} \left(\frac{x}{\lambda(T)} \ell^{(1)}\left(u_{i}; \theta+\frac{\varepsilon}{\lambda(T)}\right) \right) \right\} \right] \\ & \quad \quad \quad + \left(-\frac{x}{\lambda(T)} \sum_{i=1}^{A(T)} E_{\theta} \left(\ell^{(1)}\left(u_{i}; \theta+\frac{\varepsilon}{\lambda(T)}\right) \right) \right) \end{aligned} $$
$$ \begin{aligned} =& A(T)\text{log} \left[1 + \frac{1}{\lambda^{2}(T)} \left(\frac{x^{2}}{2} I(\theta) - x \varepsilon I(\theta) \right) + o\left(\frac{1}{\lambda^{2}(T)} \right) \right] \\ & \quad \quad + \left(A(T) I(\theta) \frac{x \varepsilon} {\lambda^{2}(T)} + o\left(\frac{A(T)}{\lambda^{2}(T)}\right) \right) \\ =& A(T) \left[ \frac{1}{\lambda^{2}(T)} \left(\frac{x^{2}}{2} I(\theta) - x \varepsilon I(\theta) \right) + o\left(\frac{1}{\lambda^{2}(T)} \right) \right] \\&\quad+ \left(A(T) I(\theta) \frac{x \varepsilon} {\lambda^{2}(T)} + o\left(\frac{A(T)}{\lambda^{2}(T)}\right) \right) \\ =& \frac{A(T)}{\lambda^{2}(T)} \frac{x^{2}}{2} I(\theta) + o\left(\frac{A(T)}{\lambda^{2}(T)}\right). \end{aligned} $$

Thus, we get that

$${\begin{aligned} &{\lim}_{T \rightarrow \infty} \frac{\lambda^{2}(T)}{A(T)} \text{log}E_{\theta} \left\{\text{exp} \left(\frac{x}{\lambda(T)} \sum_{i=1}^{A(T)} \left[\ell^{(1)}\left(u_{i}; \theta + \frac{\varepsilon}{\lambda(T)} \right) \right. \right. \right.\\&\qquad \left. \left. \left. - E_{\theta}\left(\ell^{(1)} \left(u_{i}; \theta + \frac{\varepsilon}{\lambda(T)} \right) \right) \right] \right) \right\}= \frac{I(\theta)}{2} x^{2}. \end{aligned}} $$

Hence, by the Gärtner-Ellis theorem (cf. Gärtner (1977); Ellis (1984); Dembo and Zeitouni (1998), Theorem 2.3.6, page 44), we obtain the result in Eq. (19) of Lemma 1. The result in Eq. (20) can be proved in a similar manner. □
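The limit of the scaled cumulant generating function can be checked numerically in a case where it is available in closed form. The sketch below assumes exponential interarrival times with rate θ and a deterministic number of arrivals (as under Rule 3). In that case the centred score \(\ell^{(1)}(u;\theta^{\prime})-E_{\theta}\ell^{(1)}(u;\theta^{\prime})=1/\theta-u\) does not depend on \(\theta^{\prime}\), and \(\text{log}E_{\theta}\text{exp}\{s(1/\theta-u)\}=s/\theta-\text{log}(1+s/\theta)\) for s>−θ. The function name `scaled_cumulant` is hypothetical.

```python
import math


def scaled_cumulant(theta, x, lam):
    """lam^2 * log E_theta exp((x / lam) * (1/theta - u)) for u ~ Exp(rate theta).

    With a fixed number n of arrivals, (lam^2 / n) times the log moment
    generating function of the sum of n centred scores reduces to this value.
    """
    t = x / (lam * theta)  # requires t > -1, i.e. x / lam > -theta
    return lam ** 2 * (t - math.log1p(t))


theta, x = 1.5, 2.0
target = 0.5 * x ** 2 / theta ** 2  # I(theta) x^2 / 2 with I(theta) = 1 / theta^2
for lam in (10.0, 100.0, 1000.0):
    print(lam, scaled_cumulant(theta, x, lam), target)
```

`math.log1p` is used instead of `math.log(1 + t)` to limit cancellation error when t is small.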

Proof of Theorem 1 Note that

$$P_{\theta}\left(\lambda(T) (\bar{\theta}_{T} - \theta) \geq \varepsilon\right) \geq P_{\theta}\left(\ell_{T}^{(1)} \left(u_{1}, u_{2},...,u_{A(T)}; \theta +\frac{\varepsilon}{\lambda(T)}\right) \geq 0 \right) $$

and

$$P_{\theta}\left(\lambda(T) (\underline{\theta}_{T} - \theta) \geq \varepsilon\right) \leq P_{\theta}\left(\ell_{T}^{(1)} \left(u_{1}, u_{2},...,u_{A(T)}; \theta +\frac{\varepsilon}{\lambda(T)}\right) \geq 0 \right). $$

Then, from Lemma 1, relations (11) and (13) hold. Using similar arguments, (12), (14), and (15) can be proved. □

Proof of Theorem 2 For a fixed closed subset \( \mathcal {F} \subset \Theta \subset R\), define \(\omega _{1}=\inf \{ \omega >0; \omega \in \mathcal {F}\}\) and \(\omega _{2}=\sup \{ \omega < 0; \omega \in \mathcal {F}\}\). Let \(I(\omega : \theta)=\frac {1}{2} I(\theta) \omega ^{2}\). Then,

$$ \begin{aligned} \underset{T \rightarrow \infty}{\limsup} & \frac{\lambda^{2}(T)}{A(T)} \text{log}P_{\theta} \left(\lambda(T)(\hat{\theta}_{T} - \theta) \in \mathcal{F} \right) \\ \leq & \underset{T \rightarrow \infty}{\limsup} \frac{\lambda^{2}(T)}{A(T)} \text{log} \left(P_{\theta} \left(\lambda(T)(\hat{\theta}_{T} - \theta) \leq \omega_{2} \right) + P_{\theta} \left(\lambda(T)(\hat{\theta}_{T} - \theta) \geq \omega_{1} \right) \right) \\ \leq & \text{max} \{ - I(\omega_{2}: \theta), -I(\omega_{1}: \theta) \} = -\underset{\omega \in \mathcal{F}}{\inf} I(\omega: \theta)= -\frac{1}{2} I(\theta) \underset{\omega \in \mathcal{F}}{\inf} \omega^{2}. \end{aligned} $$

Next, for a given open subset \( \mathcal {G} \subset \Theta \subset R\), for all \(\omega \in \mathcal {G}\) and all \(\varepsilon >0\) such that \((\omega -\varepsilon, \omega +\varepsilon) \subset \mathcal {G}\), we have

$$ \begin{aligned} P_{\theta} \left(\lambda(T)(\hat{\theta}_{T} - \theta) \in \mathcal{G} \right) \geq & P_{\theta} \left(\omega-\varepsilon \leq \lambda(T)(\hat{\theta}_{T} - \theta) \leq \omega+\varepsilon \right) \\ = & P_{\theta} \left(\sum_{i=1}^{A(T)} \ell^{(1)} \left(u_{i}; \theta + \frac{\omega+\varepsilon}{\lambda(T)} \right)\right. \\ & \left. \quad \leq 0, \sum_{i=1}^{A(T)} \ell^{(1)} \left(u_{i}; \theta + \frac{\omega - \varepsilon}{\lambda(T)} \right) \geq 0 \right). \end{aligned} $$

From Eq. (6), we have

$$S_{A(T)}= \sum_{i=1}^{A(T)} \ell^{(1)} (u_{i}; \theta)=\sum_{i=1}^{A(T)}h_{1}(u_{i})-A(T)k_{1}^{\prime}(\theta). $$

It is easy to see that the sequence \(\{S_{A(T)}, A(T)\geq 1\}\) is a square integrable martingale with zero mean. Using the martingale central limit theorem (see Hall and Heyde (1980)), we have

$$P_{\theta} \left(\frac{1}{\sqrt{A(T)}} \sum_{i=1}^{A(T)} \ell^{(1)} (u_{i}; \theta) \geq \eta x \right) \rightarrow 1 - \Phi\left(\frac{\eta x}{\sigma_{\theta}}\right) $$

for some \(\sigma_{\theta}>0\), where Φ denotes the standard normal distribution function. Using similar arguments as in the proof of Theorem 1, we have, for x>0 and η>0,

$$ \begin{aligned} \underset{T \rightarrow \infty}{\liminf} & \frac{\lambda^{2}(T)}{A(T)} \text{log} P_{\theta} \left(\lambda(T) (\hat{\theta}_{T} - \theta) \in \mathcal{G} \right) \\ \geq &\frac{1}{x^{2}} \text{log} \left(\Phi\left(\frac{x (\eta + I(\theta) (\omega+\varepsilon))} {\sigma_{\theta}} \right) - \Phi\left(\frac{x (\eta + I(\theta) (\omega-\varepsilon))} {\sigma_{\theta}} \right) \right). \end{aligned} $$

Now, letting η→0 first, then x→∞, and finally ε→0, we get

$$\underset{T \rightarrow \infty}{\liminf} \frac{\lambda^{2}(T)}{A(T)} \text{log} P_{\theta} \left(\lambda(T)(\hat{\theta}_{T} - \theta) \in \mathcal{G} \right) \geq - \frac{1}{2} I(\theta) \omega^{2}, $$

which completes the proof, since we have chosen an arbitrary ω in \(\mathcal {G}\) in the above discussion. □


M/M/1 queue

Let us consider the simplest of the queueing models used in practice, that is, an M/M/1 queue. The arrivals are assumed to occur in a Poisson process with rate θ and the service times are exponentially distributed with mean 1/ϕ. Here,

$$ \begin{aligned} f(u, \theta)=\theta e^{-\theta u} \quad \text{and} \quad g(v, \phi)=\phi e^{-\phi v}. \end{aligned} $$

The log-likelihood function becomes

$$\ell_{T}(\theta, \phi)= A(T)\text{log}~\theta - \theta \sum_{i=1}^{A(T)}u_{i} + D(T) \text{log}~\phi - \phi\sum_{i=1}^{D(T)} v_{i} $$

and the maximum likelihood estimators are

$$ \begin{aligned} \hat{\theta}_{T}=\left[\frac{\sum_{i=1}^{A(T)} u_{i}} {A(T)}\right]^{-1}, &~ \hat{\phi}_{T}=\left[\frac{\sum_{i=1}^{D(T)} v_{i}} {D(T)} \right]^{-1}. \end{aligned} $$

For this example, the logarithm of the interarrival density function is \(\ell(u_{i}; \theta)=\text{log}~\theta - \theta u_{i}\), and the first three derivatives are \(\ell ^{(1)}(u_{i}; \theta)= \frac {1}{\theta } - u_{i}, \ell ^{(2)}(u_{i}; \theta)= - \frac {1}{\theta ^{2}}\), and \(\ell ^{(3)}(u_{i}; \theta)= \frac {2}{\theta ^{3}}\). It is easy to see that conditions (C1) to (C4) hold. Hence, the results of Section 3 hold for \(\hat {\theta }_{T}\), the maximum likelihood estimator of θ with the Fisher information \(I(\theta)=\frac {1}{\theta ^{2}}\).
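The exponential integrability condition (C4) can be verified by a direct computation in this example. The following worked step is a sketch added for illustration; it assumes the constants are chosen with 0<μ<θ and 0<ν<θ:

$$ \begin{aligned} E_{\theta}\left[\text{exp}\left\{x\left(\frac{1}{\theta+\varepsilon}-u_{1}\right)\right\}\right] &= e^{x/(\theta+\varepsilon)}\int_{0}^{\infty} e^{-xu}\, \theta e^{-\theta u}\, du = e^{x/(\theta+\varepsilon)}\, \frac{\theta}{\theta+x}, \qquad x>-\theta, \\ \underset{(x, \varepsilon) \in [-\mu, \mu] \times [-\nu, \nu]}{\sup}\ e^{x/(\theta+\varepsilon)}\, \frac{\theta}{\theta+x} &\leq e^{\mu/(\theta-\nu)}\, \frac{\theta}{\theta-\mu} < \infty. \end{aligned} $$

At ε=0 the logarithm of the first expression equals \(x/\theta-\text{log}(1+x/\theta)=\frac{x^{2}}{2\theta^{2}}+O(x^{3})\), which is consistent with \(I(\theta)=1/\theta^{2}\).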

\(E_{k}\)/M/1 queue

Here, we consider the \(E_{k}\)/M/1 queueing model, in which the interarrival times follow an Erlang distribution with k phases. The interarrival and service time densities are

$$ \begin{aligned} f(u, \theta)= \frac{\theta(\theta u)^{k-1}e^{-\theta u} }{(k-1)!} \quad \text{and} \quad g(v, \phi)=\phi e^{-\phi v}. \end{aligned} $$

The log-likelihood function is

$${\begin{aligned} \ell_{T}(\theta, \phi)= k A(T)\text{log}~\theta +(k-1)\sum_{i=1}^{A(T)}\text{log}u_{i} &- \theta \sum_{i=1}^{A(T)}u_{i} - A(T)\text{log}[(k-1)!]\\&+ D(T) \text{log}~\phi - \phi\sum_{i=1}^{D(T)} v_{i} \end{aligned}} $$

and the maximum likelihood estimators are

$$ \begin{aligned} \hat{\theta}_{T}=\left[\frac{\sum_{i=1}^{A(T)} u_{i}} {k A(T)}\right]^{-1}, &~ \hat{\phi}_{T}=\left[\frac{\sum_{i=1}^{D(T)} v_{i}} {D(T)} \right]^{-1}. \end{aligned} $$

The logarithm of interarrival density function is

$$\ell(u_{i}; \theta)= k \text{log}~\theta +(k-1) \text{log}u_{i} - \theta u_{i} - \text{log}[(k-1)!]. $$

So, \(\ell ^{(1)}(u_{i}; \theta)= \frac {k}{\theta } - u_{i}, \ell ^{(2)}(u_{i}; \theta)= - \frac {k}{\theta ^{2}}, \ell ^{(3)}(u_{i}; \theta)= \frac {2k}{\theta ^{3}}\), and the Fisher information \(I(\theta)=\frac {k}{\theta ^{2}}\). One can easily find that conditions (C1) to (C4) hold. Hence, the results of Section 3 hold for \(\hat {\theta }_{T}\), the maximum likelihood estimator of θ.
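The estimators and the rate constant in the two examples are simple enough to compute directly. The sketch below is a minimal illustration in which the function names are hypothetical; setting k=1 recovers the M/M/1 case.

```python
def erlang_rate_mle(interarrival_times, k=1):
    """MLE theta_hat = k * A(T) / sum(u_i) for E_k interarrival times."""
    n = len(interarrival_times)
    total = sum(interarrival_times)
    if n == 0 or total <= 0:
        raise ValueError("need at least one positive interarrival time")
    return k * n / total


def fisher_information(theta, k=1):
    """Fisher information I(theta) = k / theta^2 of one E_k interarrival time."""
    return k / theta ** 2


def moderate_deviation_rate(theta, eps, k=1):
    """Rate constant (1/2) I(theta) eps^2 appearing in the main results."""
    return 0.5 * fisher_information(theta, k) * eps ** 2


u = [0.5, 1.5, 1.0, 2.0]
print(erlang_rate_mle(u, k=1))  # M/M/1 arrival rate estimate
print(erlang_rate_mle(u, k=2))  # E_2/M/1 arrival rate estimate
print(moderate_deviation_rate(theta=2.0, eps=1.0, k=2))
```

The service rate estimator has the same form as the k=1 case, applied to the completed service times.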

Availability of data and materials

Data sharing is not applicable to this article as no datasets were generated or analyzed during the current study.


  1. Acharya, S. K.: On normal approximation for maximum likelihood estimation from single server queues. Queueing Syst. 31, 207–216 (1999).

  2. Acharya, S. K., Singh, S. K.: Asymptotic properties of maximum likelihood estimators from single server queues: A martingale approach. Commun. Stat. Theory Methods. 48, 3549–3557 (2019).

  3. Basawa, I. V., Prabhu, N. U.: Estimation in single server queues. Naval. Res. Logist. Quart. 28, 475–487 (1981).

  4. Basawa, I. V., Prabhu, N. U.: Large sample inference from single server queues. Queueing Syst. 3, 289–304 (1988).

  5. Billingsley, P.: Statistical Inference for Markov Processes. The University of Chicago Press, Chicago (1961).

  6. Clarke, A. B.: Maximum likelihood estimates in a simple queue. Ann. Math. Statist. 28, 1036–1040 (1957).

  7. Cox, D. R.: Some problems of statistical analysis connected with congestion. In: Smith, W. L., Wilkinson, W. B. (eds.) Proceedings of the Symposium on Congestion Theory, pp. 289–316. University of North Carolina Press, Chapel Hill (1965).

  8. Dembo, A., Zeitouni, O.: Large Deviations Techniques and Applications. 2nd edn. Springer, New York (1998).

  9. Ellis, R. S.: Large deviations for a general class of random vectors. Ann. Probab. 12, 1–12 (1984).

  10. Gärtner, J.: On large deviations from the invariant measure. Theory Probab. Appl. 22, 24–39 (1977).

  11. Gao, F.: Moderate deviations for the maximum likelihood estimator. Stat. Probab. Lett. 55, 345–352 (2001).

  12. Goyal, T. L., Harris, C. M.: Maximum likelihood estimation for queues with state dependent service. Sankhya Ser. A. 34, 65–80 (1972).

  13. Hall, P., Heyde, C. C.: Martingale Limit Theory and Its Application. Academic Press, New York (1980).

  14. Miao, Y., Chen, Y. -X.: Note on moderate deviations for the maximum likelihood estimator. Acta Appl. Math. 110, 863–869 (2010).

  15. Miao, Y., Wang, Y.: Moderate deviation principle for maximum likelihood estimator. Statistics. 48, 766–777 (2014).

  16. Singh, S. K., Acharya, S. K.: Equivalence between Bayes and the maximum likelihood estimator in M/M/1 queue. Commun. Stat. Theory Methods. 48, 4780–4793 (2019).

  17. Wolff, R. W.: Problems of statistical inference for birth and death queueing models. Oper. Res. 13, 343–357 (1965).

  18. Xiao, Z., Liu, L.: Moderate deviations of maximum likelihood estimator for independent not identically distributed case. Stat. Probab. Lett. 76, 1056–1064 (2006).



The author would like to thank the reviewers and the editor for their helpful comments. Also, the author gives thanks to his PhD supervisor, Retired Professor S. K. Acharya, Sambalpur University, Odisha, India, for the discussions and suggestions to improve the paper.


The author received no specific funding for this work.

Author information




The author read and approved the final manuscript.

Corresponding author

Correspondence to Saroja Kumar Singh.

Ethics declarations

Competing interests

The author declares that he has no competing interest.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit


Cite this article

Singh, S.K. Moderate deviation for maximum likelihood estimators from single server queues. Probab Uncertain Quant Risk 5, 2 (2020).



  • GI/G/1 queue
  • Maximum likelihood estimator
  • Fisher information
  • Moderate deviation

Mathematics Subject Classification

  • 60K25
  • 62F12