Sparse Estimation of Conditional Graphical Models With Application to Gene Networks

J Am Stat Assoc. 2012 Jan 1;107(497):152-167. doi: 10.1080/01621459.2011.644498.

Abstract

In many applications the graph structure in a network arises from two sources: intrinsic connections and connections due to external effects. We introduce a sparse estimation procedure for graphical models that is capable of isolating the intrinsic connections by removing the external effects. Technically, this is formulated as a conditional graphical model, in which the external effects are modeled as predictors, and the graph is determined by the conditional precision matrix. We introduce two sparse estimators of this matrix using a reproducing kernel Hilbert space combined with lasso and adaptive lasso. We establish the sparsity, variable selection consistency, oracle property, and the asymptotic distributions of the proposed estimators. We also derive their convergence rate when the dimension of the conditional precision matrix goes to infinity. The methods are compared with sparse estimators for unconditional graphical models, and with the constrained maximum likelihood estimate that assumes a known graph structure. The methods are applied to a genetic data set to construct a gene network conditioning on single-nucleotide polymorphisms.

Keywords: Conditional random field; Gaussian graphical models; Lasso and adaptive lasso; Oracle property; Reproducing kernel Hilbert space; Sparsistency; Sparsity; von Mises expansion.
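
The distinction the abstract draws between intrinsic connections and connections induced by external effects can be illustrated with a small simulation. The sketch below is not the paper's estimator: it assumes a plain linear regression on the predictors in place of the reproducing kernel Hilbert space step, and it applies no lasso or adaptive lasso penalty. The function name `conditional_precision` and the simulated data are hypothetical and chosen only for illustration.

```python
import numpy as np

def conditional_precision(Y, X):
    """Inverse residual covariance of Y after linearly regressing out X.

    Illustrative stand-in only: linear regression replaces the RKHS step,
    and no sparsity penalty (lasso / adaptive lasso) is applied.
    """
    Xc = X - X.mean(axis=0)                      # center predictors
    Yc = Y - Y.mean(axis=0)                      # center responses
    B, *_ = np.linalg.lstsq(Xc, Yc, rcond=None)  # least-squares coefficients
    R = Yc - Xc @ B                              # residuals of Y given X
    S = R.T @ R / (len(Y) - X.shape[1] - 1)      # residual covariance
    return np.linalg.inv(S)

rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=(n, 1))       # one external effect (hypothetical predictor)
noise = rng.normal(size=(n, 3))

# Y1 and Y2 both depend on x but have no intrinsic connection; Y3 is unrelated.
Y = np.column_stack([
    2.0 * x[:, 0] + noise[:, 0],
    2.0 * x[:, 0] + noise[:, 1],
    noise[:, 2],
])

theta_uncond = np.linalg.inv(np.cov(Y, rowvar=False))  # ignores x
theta_cond = conditional_precision(Y, x)               # conditions on x

print("unconditional (Y1, Y2) entry:", round(theta_uncond[0, 1], 3))
print("conditional   (Y1, Y2) entry:", round(theta_cond[0, 1], 3))
```

In this simulated setting the unconditional precision matrix shows a clearly nonzero entry between Y1 and Y2, purely because both respond to the same predictor, while the entry in the conditional precision matrix is close to zero. A sparse estimator of the kind described in the abstract would aim to set such entries exactly to zero.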