Confidence Intervals for Wishart Random Matrices

In this post we briefly describe Wishart-distributed random matrices and a result from (Chiani, 2017), which provides an algorithm for calculating the probability that a standard Wishart-distributed matrix's eigenvalues lie within a given interval. The functions described in this post are implemented in Python here, including Algorithm 1 from (Chiani, 2017). We also talk about quantile functions and non-standard Wishart-distributed matrices.

The Wishart distribution

Suppose we have the p×pp\times p random matrix

A=i=1nxixiT,\begin{equation*} \bm{A} = \sum_{i=1}^n \bm{x}_i\bm{x}_i^T, \end{equation*}

where npn\geq p and each xiRp\bm{x}_i\in\mathbb{R}^p is independently drawn from a zero-mean multivariate normal distribution

xiNp(0,Σ)\begin{equation*} \bm{x}_i \sim \mathcal{N}_p(\bm{0},\bm{\Sigma}) \end{equation*}

with positive definite covariance matrix Σ\bm{\Sigma}. Then A\bm{A} is Wishart-distributed, and we write

AWp(Σ,n),\begin{equation*} \bm{A} \sim \mathcal{W}_p(\bm{\Sigma},n), \end{equation*}

where nn is called the degrees of freedom. We refer to the special case Wp(Ip,n)\mathcal{W}_p(\bm{I}_p,n) as the standard Wishart distribution, where Ip\bm{I}_p is the p×pp\times p identity matrix.

The Wishart distribution arises as the conjugate prior to the inverse covariance matrix of a multivariate normal distribution in Bayesian statistics, among other places.

One property of the Wishart distribution that we will make use of later is that if AWp(Σ,n)\bm{A}\sim\mathcal{W}_p(\bm{\Sigma},n) and CRp×p\bm{C}\in\mathbb{R}^{p \times p} is a constant, full-rank matrix, then

CACTWp(CΣCT,n).\begin{equation}\label{1} \bm{C}\bm{A}\bm{C}^T \sim \mathcal{W}_p(\bm{C}\bm{\Sigma}\bm{C}^T,n). \end{equation}

Confidence bounds on the eigenvalues of standard Wishart matrices

Suppose we want to compute the probability that the eigenvalues of some standard Wishart-distributed matrix

SWp(Ip,n)\begin{equation*} \bm{S} \sim \mathcal{W}_p(\bm{I}_p,n) \end{equation*}

are contained in a given interval [a,b][a,b], with 0ab0\leq a\leq b. We denote this probability as

Pr[aλmin(S),λmax(S)b],\begin{equation}\label{2} \mathrm{Pr}[a\leq\lambda_{\min}(\bm{S}), \lambda_{\max}(\bm{S})\leq b], \end{equation}

where λmin(S)\lambda_{\min}(\bm{S}) and λmax(S)\lambda_{\max}(\bm{S}) are the minimum and maximum eigenvalues of S\bm{S}, respectively. The probability (2)\eqref{2} is equivalent to

Pr[aIpSbIp]\begin{equation}\label{3} \mathrm{Pr}[a\bm{I}_p \preccurlyeq \bm{S} \preccurlyeq b\bm{I}_p] \end{equation}

(see Appendix A.5.2 of (Boyd & Vandenberghe, 2004)), where the notation AB\bm{A}\preccurlyeq\bm{B} means that BA\bm{B}-\bm{A} is positive semidefinite.

Algorithm 1 of (Chiani, 2017) tells us how to compute the function ψ(a,b)\psi(a,b) such that

Pr[aIpSbIp]=ψ(a,b),\begin{equation*} \mathrm{Pr}[a\bm{I}_p \preccurlyeq \bm{S} \preccurlyeq b\bm{I}_p] = \psi(a, b), \end{equation*}

and I've implemented this algorithm in Python here. This immediately gives us the cumulative density functions (CDFs) of the minimum and maximum eigenvalues of S\bm{S}:

Pr[aλmin(S)]=ψ(a,)=1Cmin(a),Pr[λmax(S)b]=ψ(0,b)=Cmax(b),\begin{align*} \mathrm{Pr}[a\leq\lambda_{\min}(\bm{S})] &= \psi(a, \infty) = 1 - C_{\min}(a), \\ \mathrm{Pr}[\lambda_{\max}(\bm{S})\leq b] &= \psi(0, b) = C_{\max}(b), \end{align*}

where CminC_{\min} and CmaxC_{\max} are the CDFs for the minimum and maximum eigenvalues, respectively.

Quantile functions

We now have a method for computing the CDFs of the eigenvalues, but we do not have the inverse CDFs (i.e., the quantile functions), which compute the bounds required to achieve a given probability. For example, suppose we want to find the value b0b\geq0 such that λmax(S)b\lambda_{\max}(\bm{S})\leq b with a given probability ρ[0,1]\rho\in[0,1]. One way to do this is by solving the optimization problem

argminb0(ρCmax(b))2,\begin{equation*} \mathrm{argmin}_{b\geq0} (\rho - C_{\max}(b))^2, \end{equation*}

which is also implemented here, using one of scipy's built-in solvers.

Non-standard Wishart matrices

We can extend the results above to the non-standard Wishart-distributed matrix

AWp(Σ,n).\begin{equation*} \bm{A} \sim \mathcal{W}_p(\bm{\Sigma},n). \end{equation*}

In particular, generalizing (3)\eqref{3} we get

Pr[aΣAbΣ]=ψ(a,b).\begin{equation}\label{4} \mathrm{Pr}[a\bm{\Sigma} \preccurlyeq \bm{A} \preccurlyeq b\bm{\Sigma}] = \psi(a,b). \end{equation}

To see this, let L\bm{L} be the Cholesky decomposition of Σ\bm{\Sigma}, such that LLT=Σ\bm{L}\bm{L}^T=\bm{\Sigma}. Since Σ\bm{\Sigma} is positive definite, L\bm{L} is full-rank and invertible. Then from (1)\eqref{1} we know that L1ALTWp(L1ΣLT,n)=Wp(Ip,n)\bm{L}^{-1}\bm{A}\bm{L}^{-T}\sim\mathcal{W}_p(\bm{L}^{-1}\bm{\Sigma}\bm{L}^{-T},n)=\mathcal{W}_p(\bm{I}_p,n), which means that Pr[aIpL1ALTbIp]=ψ(a,b)\mathrm{Pr}[a\bm{I}_p\preccurlyeq\bm{L}^{-1}\bm{A}\bm{L}^{-T}\preccurlyeq b\bm{I}_p]=\psi(a,b). Finally, since aIpL1ALTbIpa\bm{I}_p\preccurlyeq\bm{L}^{-1}\bm{A}\bm{L}^{-T}\preccurlyeq b\bm{I}_p if and only if aΣAbΣa\bm{\Sigma}\preccurlyeq\bm{A}\preccurlyeq b\bm{\Sigma} (this follows from Observation 7.1.8 of (Horn & Johnson, 2013)), we obtain (4)\eqref{4}.

Notice that (4)\eqref{4} is no longer expressed in terms of eigenvalues of A\bm{A}, but only expresses the probability that A\bm{A} is bounded by two matrices. This could be useful for formulating chance constraints on a positive definite matrix in convex optimization.

Thanks to Abhishek Goudar for reading a draft of this.

References

  • Boyd, Stephen and Lieven Vandenberghe (2004). Convex Optimization. Cambridge University Press. (Freely available here.)
  • Chiani, M. (2017). "On the Probability That All Eigenvalues of Gaussian, Wishart, and Double Wishart Random Matrices Lie Within an Interval," in IEEE Transactions on Information Theory, vol. 63, no. 7, pp. 4521–4531, July 2017, doi: 10.1109/TIT.2017.2694846, arxiv: 1502.04189.
  • Horn, Roger A. and Charles R. Johnson (2013). Matrix Analysis. Cambridge University Press.