Bapat–Beg theorem

In probability theory, the Bapat–Beg theorem gives the joint probability distribution of order statistics of independent but not necessarily identically distributed random variables in terms of the cumulative distribution functions of the random variables. Bapat and Beg published the theorem in 1989,[1] though they did not offer a proof. A simple proof was offered by Hande in 1994.[2]

Often, all elements of the sample are obtained from the same population and thus have the same probability distribution. The Bapat–Beg theorem describes the order statistics when each element of the sample is obtained from a different statistical population and therefore has its own probability distribution.[1]

Statement of theorem

Let X_1,X_2,\ldots, X_n be independent real valued random variables with cumulative distribution functions respectively F_1(x),F_2(x),\ldots,F_n(x). Write X_{(1)},X_{(2)},\ldots, X_{(n)} for the order statistics. Then the joint probability distribution of the n_1,n_2,\ldots, n_k order statistics (with n_1<n_2<\ldots < n_k and x_1<x_2<\ldots< x_k) is

\begin{align} 
F_{X_{(n_1)},\ldots, X_{(n_k)}}(x_1,\ldots,x_k)
& =\Pr ( X_{(n_1)}\leq x_1 \and X_{(n_2)}\leq x_2 \and\ldots\and X_{(n_k)} \leq x_k) \\
& =\sum_{i_k=n_k}^n \ldots\sum_{i_2=n_2}^{i_3}\,\sum _{i_1=n_1}^{i_2}\frac{P_{i_1,\ldots,i_k} (x_1,\ldots ,x_k)}{i_1! (i_2-i_1)! \ldots  (n-i_k)!}, \end{align}

where

P_{i_1,\ldots,i_k}(x_1,\ldots,x_k) =

\operatorname{per}
\begin{bmatrix}
F_1(x_1)                \ldots  F_1(x_1)               & 
F_1(x_2)-F_1(x_1) \ldots  F_1(x_2)-F_1(x_1) &  \ldots & 
1-F_1(x_k)             \ldots  1-F_1(x_k)  \\

F_2(x_1)                 \ldots  F_2(x_1)                & 
F_2(x_2)-F_2(x_1)  \ldots  F_2(x_2)-F_2(x_1)  &  \ldots & 
1-F_2(x_k)             \ldots  1-F_1(x_k  )\\

\vdots                                                            &    
\vdots                                                            &             &  
\vdots                                                  \\

\underbrace{F_n(x_1)               \ldots  F_n(x_1)              }_{i_1}   & 
\underbrace{F_n(x_2)-F_n(x_1) \ldots F_n(x_2)-F_n(x_1)}_{i_2-i_1} &  \ldots & 
\underbrace{1-F_n(x_k)            \ldots 1-F_n(x_k)           }_{n-i_k}
\end{bmatrix}

is the permanent of the given block matrix. (The figures under the braces show the number of columns.)[1]

Independent identically distributed case

In the case when the variables X_1,X_2,\ldots, X_n are independent and identically distributed with cumulative probability distribution function F_i=F for all i the theorem reduces to


\begin{align}
& F_{X_{(n_1)},\ldots, X_{(n_k)}}(x_1,\ldots,x_k) \\[8pt]
& =\sum_{i_k=n_k}^n \cdots \sum_{i_2=n_2}^{i_3}\,\sum_{i_1=n_1}^{i_2} m! \frac{F(x_1)^{i_1}}{i_1!} \frac{(1-F(x_k))^{m-i_k}}{(m-i_k)!} \prod\limits_{j=2}^k \frac{\left[  F(x_j) -F(x_{j-1}) \right]^{i_j-i_{j-1}} }{(i_j-i_{j-1})!}.
\end{align}

Remarks

Complexity

Glueck et al. note that the BapatBeg "formula is computationally intractable, because it involves an exponential number of permanents of the size of the number of random variables"[3] However, when the random variables have only two possible distributions, the complexity can be reduced to O(m2k).[3] Thus, in the case of two populations, the complexity is polynomial in m for any fixed number of statistics k.

References

  1. 1.0 1.1 1.2 1.3 Bapat, R. B.; Beg, M. I. (1989). "Order Statistics for Nonidentically Distributed Variables and Permanents". Sankhyā: The Indian Journal of Statistics, Series A (1961-2002) 51 (1): 79–93. JSTOR 25050725. MR 1065561.
  2. 2.0 2.1 Hande, Sayaji (1994). "A Note on Order Statistics for Nondentically Distributed Variables". Sankhyā: The Indian Journal of Statistics, Series A (1961-2002) 56 (2): 365–368. JSTOR 25050995. MR 1664921.
  3. 3.0 3.1 Glueck; Anis Karimpour-Fard; Jan Mandel; Larry Hunter; Muller (2008). "Fast computation by block permanents of cumulative distribution functions of order statistics from several populations". Communications in Statistics - Theory and Methods 37 (18): 2815–2824. arXiv:0705.3851. doi:10.1080/03610920802001896. PMC 2768298. PMID 19865590.