Sample covariance matrix estimator in the low-dimensional setting. Estimation of covariance matrices deals with the question of how to approximate the actual covariance matrix on the basis of a sample from the multivariate distribution. Estimating structured high-dimensional covariance and precision matrices. Many applications require precise estimates of high-dimensional covariance matrices. Sparse estimation of high-dimensional covariance matrices. In Section 2 the problem formulation is introduced. Spatial data are encountered in a wide range of disciplines.
Taking advantage of the connection between multivariate linear regression and entries of the inverse covariance matrix, we propose an estimating procedure that can effectively exploit such sparsity. In this paper, we consider the specific high-dimensional problem of recovering the covariance matrix of a zero-mean Gaussian random vector, under the low-dimensional structural constraint of sparsity of the inverse covariance, or concentration, matrix. The book relies heavily on regression-based ideas and interpretations to connect and unify many existing methods and. Mar 27, 2018: The following proposition lays the foundations for the analysis of high-dimensional covariance or precision matrix estimation with infinite kurtosis.
Inverse covariance estimation for high-dimensional data in linear time and space. The assumed framework allows for a large class of multivariate linear processes, including vector autoregressive moving average (VARMA) models of growing dimension and spiked covariance models. Estimating high-dimensional covariance matrices and its applications. It is common practice in high-dimensional statistical inference, including compressed sensing and covariance matrix estimation, to impose structural assumptions such as sparsity on the target in order to effectively estimate the quantity of interest. High-Dimensional Covariance Estimation focuses on methodologies based on shrinkage, thresholding, and penalized likelihood, with applications to Gaussian graphical models, prediction, and mean-variance portfolio management. An overview on the estimation of large covariance and.
In this paper, we describe and study a class of linear shrinkage estimators of the covariance matrix that is well suited for high-dimensional matrices, has a rather wide domain of applicability, and is rooted in the Gaussian conjugate framework of Chen (1979). While under the high-dimensional covariance matrix estimation framework. Abstract: The determinant of the covariance matrix for high-dimensional data plays an important role in statistical inference and decision. Covariance and precision matrices play a central role in summarizing linear relationships among variables. Estimating high-dimensional covariance matrices is intrinsically challenging. High-dimensional covariance estimation by minimizing. High-dimensional variable selection and covariance matrix estimation. High-dimensional covariance estimation provides accessible and comprehensive. Due to the statistical and computational challenges with high dimensionality, little work has been proposed in the literature for estimating the determinant of high-dimensional. Finally, we present a novel method for estimating higher moments of multivariate elliptical distributions. Fast and positive definite estimation of large covariance.
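As a rough illustration of what a linear shrinkage estimator does, the sketch below shrinks the sample covariance toward a scaled identity matrix. This is a minimal example under stated assumptions, not the Chen (1979) framework mentioned above: the function name `linear_shrinkage`, the identity target, and the fixed intensity `alpha` are all choices made here for illustration. In practice the intensity is usually chosen from the data.

```python
import numpy as np

def linear_shrinkage(X, alpha):
    """Shrink the sample covariance of X (n x p) toward a scaled identity.

    alpha in [0, 1] is a user-chosen shrinkage intensity (an assumption of
    this sketch; data-driven choices exist but are not implemented here).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    S = np.cov(X, rowvar=False)           # p x p sample covariance
    p = S.shape[0]
    mu = np.trace(S) / p                  # average sample variance
    return (1.0 - alpha) * S + alpha * mu * np.eye(p)

rng = np.random.default_rng(0)
X = rng.standard_normal((20, 50))         # n = 20 observations, p = 50 variables
S = np.cov(X, rowvar=False)
Sigma_hat = linear_shrinkage(X, alpha=0.5)

# The sample covariance is singular here (p > n); the shrunk estimate is not.
min_eig_sample = np.linalg.eigvalsh(S).min()
min_eig_shrunk = np.linalg.eigvalsh(Sigma_hat).min()
print(f"smallest eigenvalue, sample: {min_eig_sample:.2e}")
print(f"smallest eigenvalue, shrunk: {min_eig_shrunk:.2e}")
```

With this target the trace of the estimate equals the trace of the sample covariance, so total variance is preserved while the eigenvalues are pulled toward their average.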
Note that this is also equivalent to recovering the underlying graph structure of a sparse Gaussian Markov random field (GMRF). Most available methods and software cannot smooth covariance matrices of dimension J > 500. Nuclear-norm penalization and optimal rates for noisy low-rank matrix completion. Koltchinskii, Vladimir, Lounici, Karim, and Tsybakov, Alexandre B. We propose two fast covariance smoothing methods and associated software that scale up linearly with the number of observations per function. We propose a novel framework to first estimate the initial joint covariance matrix of the observed data and the factors, and then use it to recover the covariance matrix of the observed data. We examine covariance matrix estimation in the asymptotic. We propose to use a fundamental result in random matrix theory, the Marchenko-Pastur equation, to better estimate the eigenvalues of large-dimensional covariance matrices. Inverse covariance estimation for high-dimensional data in. High-dimensional covariance matrix estimation using a factor model.
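To see why sample eigenvalues need correcting in large dimensions, the following sketch simulates data whose true covariance is the identity and compares the range of sample eigenvalues with the edges of the Marchenko-Pastur support. It only illustrates the eigenvalue spreading; it does not solve the Marchenko-Pastur equation or implement any particular estimator. The sizes `n` and `p` and the seed are arbitrary choices made here.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 2000, 500                          # illustrative sizes (assumptions)
gamma = p / n                             # aspect ratio p / n

X = rng.standard_normal((n, p))           # true covariance is the identity
S = X.T @ X / n                           # sample covariance, mean known to be 0
eigs = np.linalg.eigvalsh(S)

# Edges of the Marchenko-Pastur support for identity covariance.
mp_lower = (1.0 - np.sqrt(gamma)) ** 2
mp_upper = (1.0 + np.sqrt(gamma)) ** 2

print(f"sample eigenvalue range: [{eigs.min():.3f}, {eigs.max():.3f}]")
print(f"Marchenko-Pastur edges:  [{mp_lower:.3f}, {mp_upper:.3f}]")
```

Every population eigenvalue equals 1, yet the sample eigenvalues spread over roughly the whole interval between the two edges, which is the distortion that spectrum-correction methods aim to undo.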
The abundance of high-dimensional data is one reason for the interest in the problem. Another reason is the ubiquity of the covariance matrix in data analysis tools. The standard estimator is the sample covariance matrix, which is conceptually simple, fast to compute, and has favorable properties in the limit of infinitely many observations. Estimation of covariance, correlation and precision matrices. Estimating a high-dimensional covariance matrix and. High-dimensional inverse covariance matrix estimation via. We propose a simple procedure that is computationally tractable in high dimension and does not require imputation of the missing data. With increasingly complex data models being investigated, for example in climate science, Benestad et al. Taking advantage of the connection between multivariate linear regression and entries of the inverse covariance matrix, we propose an estimating procedure that can effectively exploit such sparsity.
In statistics, sometimes the covariance matrix of a multivariate random variable is not known but has to be estimated. Estimating Covariance Structure in High Dimensions. Ashwini Maurya, Michigan State University, East Lansing, MI, USA. Thesis director. Rate-optimal estimation for high-dimensional spatial. Jun 27, 2014: We propose two fast covariance smoothing methods and associated software that scale up linearly with the number of observations per function. Classical multivariate statistics are based on the assumption that the number of parameters is fixed and the number of observations is large. Estimating high-dimensional covariance matrices and its. The ultra-high-dimensional setting, where the dimension p_n exceeds the sample size n, is important due to many contemporary applications. Existing estimators typically require a good estimate of the precision matrix, which imposes strict structural assumptions on the covariance or the precision matrix when the data are high-dimensional. High-dimensional inverse covariance matrix estimation via linear programming. Discussion of large covariance estimation by thresholding principal orthogonal complements.
High-dimensional covariance estimation by minimizing l1. High dimensionality comparable to sample size is common in many statistical problems. Dissertation, Department of Mathematics, Princeton University. The limitations of the sample covariance matrix are discussed. However, in the high-dimensional setting, including too many or irrelevant controlling variables may distort the results. Journal of the American Statistical Association, 1, 1268-1283. Battey, Department of Mathematics, Imperial College London, 545 Huxley Building. Estimating the structure of this p by p matrix usually comes at a computational cost of O(p^3) time and O(p^2) memory for solving a nonsmooth log-determinant minimization problem, thus for large p both storage and computation. We examine covariance matrix estimation in the asymptotic framework in which the dimensionality p tends to infinity as the sample size n increases.
Beijing University of Chinese Medicine. A thesis submitted for the degree of Doctor of Philosophy, Department of Statistics and Applied Probability, National University of Singapore, 2017. Supervisor. High-dimensional covariance estimation provides accessible and comprehensive coverage of. High-dimensional multilevel covariance estimation and kriging. Estimating covariance matrices is an important part of portfolio selection, risk management, and asset pricing. High-dimensional covariance estimation by minimizing l1-penalized log-determinant divergence. Pradeep Ravikumar, Martin Wainwright, Bin Yu, Garvesh Raskutti. Abstract. Software for computing a covariance shrinkage estimator is available in R. This paper studies methods for testing and estimating change-points in the covariance structure of a high-dimensional linear time series.
A state space model approach to integrated covariance matrix estimation with high-frequency data. For example, in portfolio allocation and risk management, the number of stocks p, which is. High dimensionality, graphical model, approximate factor model, principal components, sparse matrix, low-rank matrix, thresholding, heavy-tailed, elliptical distribution, rank-based methods. In this paper, we study the problem of high-dimensional covariance matrix estimation with missing observations. Spectrum estimation for large-dimensional covariance. High-dimensional covariance estimation can be classified into two main categories, one.
However, with the increasing abundance of high-dimensional datasets, the fact that the number of parameters to estimate grows with the square of the dimension suggests that it is important to have robust alternatives to the standard sample covariance matrix estimator. Popular regularization methods that directly exploit sparsity are not directly applicable to many financial problems. Sparse covariance matrix estimation in high-dimensional deconvolution. Belomestny, Denis, Trabs, Mathias, and Tsybakov, Alexandre B. Covariance estimation for high-dimensional data vectors using the sparse matrix transform. Guangzhi Cao, Charles A. Regularized estimation of high-dimensional covariance matrices. High-dimensional sparse inverse covariance estimation using. Methods for estimating sparse and large covariance matrices. Covariance and correlation matrices play fundamental roles in every aspect of the analysis of multivariate data collected from a variety of fields, including business and economics, health care, engineering, and environmental and physical sciences. Estimation of covariance, correlation and precision. Our approach transforms high-dimensional ill-conditioned covariance matrices into numerically stable multilevel covariance matrices without compromising accuracy. Estimating a high-dimensional covariance matrix and its inverse, the precision matrix, is becoming a crucial problem in many applications, including functional magnetic resonance imaging, analysis of gene expression arrays, risk management and portfolio allocation. The dissertation makes contributions in two main areas of covariance estimation. Testing and estimating change-points in the covariance matrix. Advances in high-dimensional covariance matrix estimation.
Estimating the covariance or precision matrix is more challenging in the multivariate case because of the positive-definiteness constraint on the covariance matrix and the high dimensionality, where the number of parameters now grows quadratically with the number of outcomes and time points. R^p, estimate both its covariance matrix and its inverse covariance, or concentration, matrix. High-dimensional data are often most plausibly generated from distributions with complex structure and leptokurtosis in some or all components. Another relation can be made to the method by Rütimann. Robust high-dimensional volatility matrix estimation for high-frequency factor model. Our focus is on estimating these matrices when their dimension is large relative to the number of observations. The variance-covariance matrix plays a central role in the inferential theories of high-dimensional factor models in finance and economics. Many of the classical techniques perform poorly, or are degenerate, in high-dimensional situations. Regularized estimation of precision matrix for high. This estimator shrinks S toward the covariance matrix implied by the CAPM model.
The minimax upper bound is obtained by constructing a class of tapering estimators. High-dimensional covariance matrix estimation in approximate factor models. Article (PDF available) in The Annals of Statistics 39(6). High-dimensional inverse covariance matrix estimation via linear. Minimax rates of convergence for estimating several classes of structured covariance and precision matrices, including bandable, Toeplitz, sparse, and sparse spiked covariance matrices as well as. For example, in portfolio allocation and risk management, the number of stocks p, which is typically of the same order as the sample size n, can well be in the order of hundreds. This paper considers the problem of estimating a high-dimensional inverse covariance matrix that can be well approximated by sparse matrices. Optimal rates of convergence for covariance matrix estimation.
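A tapering estimator for a bandable covariance matrix can be sketched as follows. The weights equal 1 near the diagonal and decay linearly to 0 at lag k, which is one common form of tapering; the function names `tapering_weights` and `tapering_estimator`, the requirement that k be even, and the fixed bandwidth are assumptions of this sketch. The rate-optimal choice of k depends on n, p and the decay of the true covariance and is not computed here.

```python
import numpy as np

def tapering_weights(p, k):
    """Weights w_ij: 1 for |i-j| <= k/2, linear decay to 0 at |i-j| = k."""
    if k < 2 or k % 2 != 0:
        raise ValueError("k must be an even integer >= 2")
    k_h = k // 2
    idx = np.arange(p)
    d = np.abs(np.subtract.outer(idx, idx))          # lag matrix |i - j|
    # ((k - d)_+ - (k_h - d)_+) / k_h
    return (np.clip(k - d, 0, None) - np.clip(k_h - d, 0, None)) / k_h

def tapering_estimator(X, k):
    """Entrywise product of the sample covariance with the tapering weights."""
    S = np.cov(X, rowvar=False)
    return S * tapering_weights(S.shape[0], k)

W = tapering_weights(6, 4)                # small weight matrix for inspection
rng = np.random.default_rng(3)
X = rng.standard_normal((40, 6))          # hypothetical data: n = 40, p = 6
Sigma_taper = tapering_estimator(X, k=4)
print(W[0])                               # weights along the first row
```

Entries far from the diagonal are set exactly to zero, entries near it are kept unchanged, and the band in between is downweighted rather than cut off abruptly.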
Fast covariance estimation for high-dimensional functional. High-dimensional sparse inverse covariance estimation using greedy methods. Ali Jalali, Chris Johnson, Pradeep Ravikumar. Abstract. By Jianqing Fan, Yingying Fan and Jinchi Lv, Princeton University, August 12, 2006. High dimensionality comparable to sample size is common in many statistical problems. R^p, we study the problem of estimating both its covariance matrix and its inverse covariance, or concentration, matrix.
Inspired by ideas of random matrix theory, we also suggest a change of point of view when thinking about estimation of high-dimensional vectors. Covariance and precision matrices provide a useful summary of such structure, yet the performance of popular matrix estimators typically hinges upon a sub-Gaussianity assumption. Covariance estimation for high-dimensional data vectors using. Aggregation of nonparametric estimators for volatility matrix. Problem 3 can therefore be reformulated as a linear program just like the.
Large-scale inverse covariance estimation. Center for Big. For sparsity regularization, the lasso penalty is popular and convenient due to its convexity but has a bias problem. Its eigenvalues are well behaved and good estimators of their population counterparts. For the semidefinite program (24), Yuan and Lin (2007) solved the problem using interior. The sample covariance matrix (SCM) is an unbiased and efficient estimator of the covariance matrix if the space of covariance matrices is viewed as an extrinsic convex cone in R^(p×p). It has many real applications, including statistical tests and information theory. Clifford Lam, Department of Statistics, London School of Economics and Political Science.
It is invertible and extensively used in linear models and time series analysis. The structure of a Gaussian graphical model is directly connected to the sparsity of its p × p inverse covariance matrix. Simple cases, where observations are complete, can be dealt with by using the sample covariance matrix. Regularized Estimation of High-Dimensional Covariance Matrices, by Yilun Chen. A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy, Electrical Engineering. In many cases, the number of parameters, p, exceeds the number of observations, n. However, the sample covariance matrix is an inappropriate estimator in high-dimensional settings. High-dimensional covariance matrix estimation in approximate. Matlab software for disciplined convex programming, version 2.
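The failure of the sample covariance matrix when p exceeds n can be checked directly: after centering, its rank is at most n - 1, so it is singular and cannot be inverted to obtain a precision matrix. The sizes and seed in this sketch are arbitrary choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 10, 30                             # fewer observations than variables
X = rng.standard_normal((n, p))

S = np.cov(X, rowvar=False)               # p x p sample covariance
rank = np.linalg.matrix_rank(S)           # at most n - 1 after centering
smallest = np.linalg.eigvalsh(S)[0]       # numerically zero when p > n - 1

print(f"shape of S: {S.shape}, rank of S: {rank}")
print(f"smallest eigenvalue of S: {smallest:.2e}")
```

Because S has at least p - n + 1 zero eigenvalues in this regime, any method that needs an inverse has to regularize first, for example by shrinkage, thresholding, or a sparsity penalty on the precision matrix.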
Systems) in the University of Michigan, 2011. Doctoral committee. Robust estimation of high-dimensional covariance and. In this paper, we study robust covariance estimation under the approximate factor model with observed factors. Law of log determinant of sample covariance matrix and. High-dimensional covariance matrix estimation, LSE Statistics. The minimax risk of estimating the covariance matrix over the class p.