Nonparametric Regularized Regression for Phenotype-Associated Taxa Selection and Network Construction with Metagenomic Count Data

Abstract

We use a metagenomic approach and network analysis to investigate the relationships between phenotypes across taxa under different environmental conditions. The network structure of taxa can be affected by disease-associated environmental conditions. In addition, taxa abundance differs across conditions. Therefore, knowing how the correlation or relative abundance changes with these factors would be of great interest to researchers. We develop a nonparametric regularized regression method to construct taxa association networks under different clinical conditions. We let the coefficients be unknown functions of the environmental variable and estimate these varying coefficients with regression splines. The proposed method is regularized with concave penalties, and an efficient group descent algorithm is developed for computation. We also apply the varying coefficient model to estimate taxa abundance and to see how it changes across different environmental conditions. Moreover, for inference, we propose a bootstrap method to construct simultaneous confidence bands for the corresponding coefficients. We use different simulated designs and a real data set to demonstrate that our method can identify the network structures successfully under different environmental conditions. As such, the proposed method has potential applications for researchers who wish to construct differential networks and identify taxa.

1. Introduction

The microbiota live and interact with a host, and provide a wide range of metabolic functions. In the human body, the change of microbial community structure may affect our health and cause disease. The interactions among disease-associated taxa play critical roles in determining metabolic functions. Hence, identifying disease-associated taxa and exploring or predicting interactive network structures for disease-associated taxa are essential parts of studies in metagenomic research. To identify differentially abundant features, two popular statistical approaches are usually considered, including statistical testing (Singleton et al., 2001; Schloss and Handelsman, 2005; White et al., 2009; Alekseyenko et al., 2013) and sparse learning. Relevant examples of the latter include the regularized multinomial logistic regression model (Tanaseichuk et al., 2013) and the sparse weighted distance learning method (Liu et al., 2011).

One popular method for constructing taxa association networks is to use graphical models. In the network graph, each vertex represents a taxon and two vertices are connected if the two taxa are partially correlated. One of the most commonly used graphical models is the Gaussian graphical model (Krämer et al., 2009), which is applied to continuous data by assuming that the data come from a multivariate Gaussian distribution. Besides the Gaussian graphical model, other models have been developed for specific data types. For example, the local Poisson graphical model was proposed to deal with sequencing data (Allen et al., 2013) and a generalized linear model was used to fit discrete gene expression data (Zhang and Mallick, 2013).

One appealing feature of the Gaussian graphical model is that the association network can be constructed by estimating coefficients in a linear regression model; that is, a regularized regression model can be established by using one gene as the response and the other genes as the predictors. Meinshausen and Bühlmann (2006) proposed a neighborhood selection method with an $l_1$ penalty, which is computationally fast in the high-dimensional setting. For the Gaussian graphical model, the inverse of the covariance matrix can be used to represent the graphical structure, where the estimation is achieved by maximizing the regularized log-likelihood function (Yuan and Lin, 2007). Friedman et al. (2008) developed the graphical lasso (glasso) algorithm, which uses coordinate descent to estimate the sparse inverse covariance matrix.

In practice, the relationship between taxa may change under different clinical conditions. Determining the network structure under environmental changes is a major challenge. Recently, Liu et al. (2015) developed a multilevel graphical model for the construction of the association network, which can simultaneously build the network and select differentiated taxa under different clinical conditions. In that model, the clinical conditions are coded as a binary dummy variable. However, in many studies, the clinical conditions are continuously measured.

In this article, we propose a nonparametric regression method in which the parameters are unknown functions of the phenotypic variable that represents the clinical conditions, so that the network structure or relative abundance is allowed to change under different conditions. Regression splines are employed to estimate the unknown functions. For network construction, we also use regularized regression. Instead of the $l_1$ penalty, we use concave penalties, namely the smoothly clipped absolute deviation (SCAD) penalty (Fan and Li, 2001) and the minimax concave penalty (MCP) (Zhang, 2010), to reduce estimation bias. Group SCAD and group MCP (Huang et al., 2012) are therefore applied to the spline coefficients. A group descent algorithm is developed for fast and efficient computation. After gene selection, we propose a simultaneous confidence band for testing the shape of each unknown function, both to explore how the association network changes with the environmental conditions and to select disease-associated taxa. Through simulation studies and a real metagenomic count data set, we demonstrate that the proposed methods can successfully identify the biologically important taxa and the dynamic network structure associated with the disease.

The rest of this article is organized as follows. The model, estimation methods, and computational algorithm are presented in Section 2. Our statistical inference procedure is presented in Section 2.5. Numerical simulations for different models and a real data analysis are given in Section 3. Some concluding remarks are given in Section 4.

2. Methods

2.1. Data transformation

Let $y_i$ be the phenotype in sample $i$ for $i = 1, \ldots, n$. Our goal is to identify phenotype-associated taxa and construct the dynamic network structure of those species given different measured values of the phenotype. For each sample, we have multiple metagenomic count features, including the number of 16S rRNA clones assigned to a specific taxon, or the number of shotgun reads mapped to a specific biological pathway or subsystem.
Let $x_{ij}$ be the total number of reads of feature $j$ in sample $i$ for $i = 1, \ldots, n$ and $j = 1, \ldots, p$.
The data structure is shown as follows:
\begin{align*} \mathbb{X} = \left[ \begin{matrix} x_{11} & x_{12} & \cdots & x_{1p} \\ x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{np} \end{matrix} \right], \ \mathrm{and} \ \mathbf{y} = \left[ \begin{matrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{matrix} \right]. \end{align*}

Write $\mathbf{y} = (y_1, \ldots, y_n)^T$, $\mathbf{x}_i = (x_{i1}, \ldots, x_{ip})^T$, and $\mathbb{X} = (\mathbf{x}_1, \ldots, \mathbf{x}_n)^T$.
To adjust for the read-depth differences in sequencing, the metagenomic count matrix $\mathbb{X}$ is first transformed into a proportion matrix $\mathbb{P} = [p_{ij}]$, where $p_{ij} = x_{ij} / (\sum\nolimits_j x_{ij})$ and thus $\sum\nolimits_j p_{ij} = 1$.
To overcome variance heterogeneity with compositional data, the proportion matrix $\mathbb{P}$ is then converted into a normally distributed matrix $\mathbb{Z} = (\mathbf{z}_1, \ldots, \mathbf{z}_n)^T$, where $\mathbf{z}_i = (z_{i1}, \ldots, z_{ip})^T$, with the arcsine or log-ratio transformation. We take $z_{ij} = 2 \arcsin(\sqrt{p_{ij}})$ for the arcsine transformation. For the log-ratio transformation, there may be zeros in the proportion matrix.
Thus, we first replace
\begin{align*} p_{ij} = \begin{cases} \delta_{ij} & \mathrm{if} \; p_{ij} = 0 \\ \left( 1 - \sum\nolimits_{j' \mid p_{ij'} = 0} \delta_{ij'} \right) p_{ij} & \mathrm{if} \; p_{ij} > 0 \end{cases}, \end{align*}
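
The transformation steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the toy count matrix, and the use of a single constant `delta` in place of the entry-specific $\delta_{ij}$ are hypothetical choices made for the example.

```python
import numpy as np


def to_proportions(counts):
    """Row-normalize a count matrix so that each sample (row) sums to 1."""
    counts = np.asarray(counts, dtype=float)
    return counts / counts.sum(axis=1, keepdims=True)


def arcsine_transform(props):
    """Arcsine transformation z_ij = 2 * arcsin(sqrt(p_ij))."""
    return 2.0 * np.arcsin(np.sqrt(props))


def replace_zeros(props, delta=1e-4):
    """Multiplicative zero replacement.

    Zeros become delta; positive entries in the same row are scaled by
    (1 - total delta added in that row), so each row still sums to 1.
    Using one constant delta for every entry is an assumption of this sketch.
    """
    props = np.asarray(props, dtype=float)
    is_zero = props == 0
    added = delta * is_zero.sum(axis=1, keepdims=True)
    return np.where(is_zero, delta, (1.0 - added) * props)


# Toy count matrix: 3 samples (rows) by 4 taxa (columns); values are made up.
X = np.array([[10, 0, 30, 60],
              [5, 5, 0, 0],
              [1, 2, 3, 4]])
P = to_proportions(X)        # proportion matrix, rows sum to 1
Z = arcsine_transform(P)     # arcsine-transformed matrix
P_pos = replace_zeros(P)     # strictly positive proportions for a log-ratio transform
```

The zero replacement keeps every row on the simplex, so a log-ratio transformation can be applied to `P_pos` without taking the logarithm of zero.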

2.2. Model

Given the normalized matrix $\mathbb{Z}$ and the phenotype $\mathbf{y}$, the task is to identify taxa and the network structure of taxa given different values of the phenotype. The association network is constructed through a Gaussian graphical model. We assume that $\mathbf{z}_i$ and $y_i$ are random samples from the distributions of $\mathbf{Z} = (Z_1, \ldots, Z_p)^T$ and $Y$, respectively. Given $Y = y$, the $p$-dimensional random vector $\mathbf{Z}$ follows a multivariate normal distribution such that
\begin{align*} \mathbf{Z} \mid Y = y \sim N ( \boldsymbol{\mu}(y), \boldsymbol{\Sigma}(y) ), \end{align*}

so that the conditional mean and variance of $\mathbf{Z}$ depend on the observed value of $Y$. For an arbitrary subset $\mathcal{S} \subseteq \{ 1, \ldots, p \}$, denote by $\mathbf{Z}_{\mathcal{S}}$ the subvector of $\mathbf{Z}$ corresponding to $\mathcal{S}$. For given $Y = y$, assuming that $\boldsymbol{\Sigma}(y)$ is nonsingular, it holds for $j, j' \in \{ 1, \ldots, p \}$ and $j \ne j'$ that $Z_j$ is conditionally independent of $Z_{j'}$ given $\mathbf{Z}_{\{ 1, \ldots, p \} \setminus \{ j, j' \}}$ if and only if $k_{jj'}(y) = 0$ (Lauritzen, 1996), where $\mathbf{K}(y) = \{ k_{jj'}(y) \} = \{ \boldsymbol{\Sigma}(y) \}^{-1}$.
Hence, the conditional independence structure can be represented by a graphical model $\mathcal{G} = ( \Gamma, E )$, where $\Gamma = \{ 1, \ldots, p \}$ is the set of nodes and $E$ is the set of edges in $\Gamma \times \Gamma$.
We assume that the pair $( j, j' )$ is not contained in the edge set if $k_{jj'}(y) = 0$ for all $y \in \mathcal{D}$, where $\mathcal{D}$ is the domain of $Y$.
Without loss of generality, let $\mathcal{D} = [ a, b ]$. For each node $j \in \Gamma$, define
\begin{align*} \boldsymbol{\theta}^{j}(y) = \{ \theta_{j'}^{j}(y), j' \ne j \}^T = \operatorname*{argmin}_{\boldsymbol{\theta} \in R^{p-1}} E ( Z_j - \boldsymbol{\theta}^T \mathbf{Z}_{-j} \mid Y = y )^2, \end{align*}

Thus, to identify the association among taxa, we need to select variables from the neighbors of a node. Also, studying the change of $\boldsymbol{\mu}(y)$ can be used to identify differentiated features under different conditions.
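
As a numerical illustration of why neighborhood regression recovers the graph, the sketch below checks the standard Gaussian identity $\theta_{j'}^{j}(y) = -k_{jj'}(y) / k_{jj}(y)$, which holds for zero-mean variables. The precision function `precision(y)`, its entries, the helper `neighborhood_coefs`, and the grid over $[0, 1]$ are hypothetical and chosen only for illustration; zero conditional mean is also an assumption of the example.

```python
import numpy as np


def precision(y):
    """Hypothetical 3x3 precision matrix K(y); only the (0, 1) entry varies with y."""
    k01 = 0.4 * y  # vanishes only at y = 0
    return np.array([[1.0, k01, 0.0],
                     [k01, 1.0, 0.3],
                     [0.0, 0.3, 1.0]])


def neighborhood_coefs(sigma, j):
    """Population least-squares coefficients of Z_j on Z_{-j} (zero-mean case)."""
    rest = [m for m in range(sigma.shape[0]) if m != j]
    return np.linalg.solve(sigma[np.ix_(rest, rest)], sigma[rest, j])


y = 0.5
K = precision(y)
Sigma = np.linalg.inv(K)

# Regression of node 0 on nodes 1 and 2, computed from the covariance matrix.
theta0 = neighborhood_coefs(Sigma, j=0)
# The same coefficients read off the precision matrix: -k_{0j'} / k_{00}.
theta0_from_K = -K[0, 1:] / K[0, 0]

# An edge (j, j') is absent only if k_{jj'}(y) = 0 at every y on the grid.
grid = np.linspace(0.0, 1.0, 11)
edges = {(j, jp) for j in range(3) for jp in range(j + 1, 3)
         if any(abs(precision(v)[j, jp]) > 1e-12 for v in grid)}
```

In this toy example the pair (0, 2) never enters the edge set because its precision entry is zero for every value of $y$, while the pair (0, 1) is kept even though its entry vanishes at the single point $y = 0$.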

2.3. Estimation

Let us first consider how to select the nonzero elements of $\mathbf{K}(y)$. We assume that $\theta_{j'}^{j}(y)$ and $\mu_j(y)$ are smooth functions of $y$ for $j = 1, \ldots, p$, and approximate each $\theta_{j'}^{j}(\cdot)$ and $\mu_j(\cdot)$ by using B-spline basis functions. Let $a = t_0 < t_1 < \cdots < t_{N_n} < b = t_{N_n + 1}$ be a partition of $[a, b]$ satisfying $\max_{0 \le k \le N_n} |t_{k+1} - t_k| / \min_{0 \le k \le N_n} |t_{k+1} - t_k| \le M$ for some constant $M > 0$, where $N_n$ increases with the sample size $n$. Let $B(y) = \{B_k(y): 1 \le k \le J_n\}^T$ be the $q$th order B-spline basis functions given on page 87 of de Boor (2001), where $J_n = N_n + q$. First, we consider the estimation of $\theta_{j'}^{j}(\cdot)$. From the result by de Boor (2001), each function $\theta_{j'}^{j}(\cdot)$ can be approximated well by a spline function such that $\theta_{j'}^{j}(y) \approx B(y)^T \boldsymbol{\eta}_{j'}^{j}$ for some $\boldsymbol{\eta}_{j'}^{j} \in R^{J_n}$. Let $\mathbf{z}_{i,-j} = \{z_{ij'}, j' \ne j\}^T$ and $\mathbf{w}_i^{j} = \mathbf{z}_{i,-j} \otimes B(y_i)$, where $\otimes$ denotes the Kronecker product of two matrices. For any vector $\mathbf{a}$, denote $\|\mathbf{a}\|$ as its $\mathrm{L}_2$ norm. The estimator of $\boldsymbol{\eta}^{j} = \{(\boldsymbol{\eta}_{j'}^{j})^T, j' \ne j\}^T$ can be obtained by minimizing
\begin{align*}
Q(\boldsymbol{\eta}^{j}) &= 2^{-1} \sum_{i=1}^{n} \Big( z_{ij} - \sum_{j' \ne j} z_{ij'} B(y_i)^T \boldsymbol{\eta}_{j'}^{j} \Big)^2 + n \sum_{j' \ne j} p_\lambda ( \|\boldsymbol{\eta}_{j'}^{j}\| ) \\
&= 2^{-1} \sum_{i=1}^{n} \big( z_{ij} - (\boldsymbol{\eta}^{j})^T \mathbf{w}_i^{j} \big)^2 + n \sum_{j' \ne j} p_\lambda ( \|\boldsymbol{\eta}_{j'}^{j}\| ) \\
&= 2^{-1} \| \mathbf{Z}_j - \mathbf{W}^{j} \boldsymbol{\eta}^{j} \|^2 + n \sum_{j' \ne j} p_\lambda ( \|\boldsymbol{\eta}_{j'}^{j}\| ), \tag{2.1}
\end{align*}
where $\mathbf{Z}_j = (z_{1j}, \ldots, z_{nj})^T$, $\mathbf{W}^{j} = (\mathbf{w}_1^{j}, \ldots, \mathbf{w}_n^{j})^T$, and $p_\lambda(\cdot)$ is a concave penalty function with tuning parameter $\lambda$, such as the minimax concave penalty (MCP) or the smoothly clipped absolute deviation (SCAD) penalty.
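To illustrate the spline approximation and the Kronecker-product design vector $\mathbf{w}_i^{j}$, the following Python sketch builds a B-spline basis with the Cox-de Boor recursion and stacks the rows $\mathbf{w}_i^{j}$ into $\mathbf{W}^{j}$. The function names, the clamped boundary knots, and the simulated inputs are assumptions made for illustration; they are not taken from the authors' implementation.

```python
import numpy as np

def bspline_basis(y, interior_knots, a, b, q):
    """Evaluate the J = N + q B-spline basis functions of order q on [a, b].

    Returns an array of shape (len(y), J). Boundary knots are repeated q times
    (a clamped knot vector, assumed here for convenience).
    """
    y = np.atleast_1d(np.asarray(y, dtype=float))
    t = np.concatenate([np.repeat(a, q), interior_knots, np.repeat(b, q)])
    # Order-1 basis: indicators of the half-open intervals [t_k, t_{k+1}).
    B = np.zeros((len(y), len(t) - 1))
    for k in range(len(t) - 1):
        B[:, k] = (t[k] <= y) & (y < t[k + 1])
    B[y == b, len(t) - q - 1] = 1.0  # assign y == b to the last non-empty interval
    # Cox-de Boor recursion from order 2 up to order q (0/0 treated as 0).
    for m in range(2, q + 1):
        B_new = np.zeros((len(y), len(t) - m))
        for k in range(len(t) - m):
            left_den = t[k + m - 1] - t[k]
            right_den = t[k + m] - t[k + 1]
            if left_den > 0:
                B_new[:, k] += (y - t[k]) / left_den * B[:, k]
            if right_den > 0:
                B_new[:, k] += (t[k + m] - y) / right_den * B[:, k + 1]
        B = B_new
    return B

def design_matrix(Z, y, j, interior_knots, a, b, q):
    """Stack w_i^j = z_{i,-j} (Kronecker) B(y_i) as the rows of W^j."""
    B = bspline_basis(y, interior_knots, a, b, q)   # shape (n, J_n)
    Z_minus_j = np.delete(Z, j, axis=1)              # shape (n, p - 1)
    # Row-wise Kronecker product: entry (i, k * J_n + l) = z_{ik} * B_l(y_i).
    return np.einsum("ik,il->ikl", Z_minus_j, B).reshape(len(y), -1)

# Small simulated example (all values are illustrative).
rng = np.random.default_rng(0)
n, p, q = 50, 4, 4
y = rng.uniform(0.0, 1.0, size=n)
Z = rng.normal(size=(n, p))
knots = np.array([0.25, 0.5, 0.75])                  # N_n = 3 interior knots
B = bspline_basis(y, knots, 0.0, 1.0, q)             # J_n = N_n + q = 7 columns
W = design_matrix(Z, y, 0, knots, 0.0, 1.0, q)       # (p - 1) * J_n = 21 columns
print(B.shape, W.shape)
```

Each group of $J_n$ consecutive columns of the resulting matrix corresponds to one coefficient vector $\boldsymbol{\eta}_{j'}^{j}$, which is the grouping that the penalty in Equation (2.1) acts on.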

The MCP has the form
\begin{align*} p_\lambda(t) = \lambda \int_0^t \big( 1 - x / (\gamma\lambda) \big)_+ \, dx, \end{align*}

for some $\gamma > 1$, and the SCAD penalty is given as
\begin{align*} p_\lambda(t) = \lambda \int_0^t \min \{ 1, (\gamma - x/\lambda)_+ / (\gamma - 1) \} \, dx, \end{align*}

for some $\gamma > 2$. Like the $\ell_1$ penalty, the concave penalties produce sparse estimates: they automatically set small estimated parameters to zero. More importantly, they are nearly unbiased in the sense that they do not shrink large estimated parameters (Fan and Li, 2001).
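The two integrals have simple closed forms, which can help when checking an implementation. The sketch below evaluates the MCP and SCAD penalties in closed form and compares them with a numerical integration of the integrands given above. The function names and the midpoint-rule integrator are illustrative assumptions.

```python
def mcp(t, lam, gamma):
    """Closed form of the MCP integral for t >= 0 and gamma > 1."""
    if t <= gamma * lam:
        return lam * t - t * t / (2.0 * gamma)
    return 0.5 * gamma * lam * lam

def scad(t, lam, gamma):
    """Closed form of the SCAD integral for t >= 0 and gamma > 2."""
    if t <= lam:
        return lam * t
    if t <= gamma * lam:
        return (2.0 * gamma * lam * t - t * t - lam * lam) / (2.0 * (gamma - 1.0))
    return 0.5 * (gamma + 1.0) * lam * lam

def mcp_derivative(x, lam, gamma):
    """Integrand of the MCP: lam * (1 - x / (gamma * lam))_+."""
    return lam * max(0.0, 1.0 - x / (gamma * lam))

def scad_derivative(x, lam, gamma):
    """Integrand of SCAD: lam * min{1, (gamma - x / lam)_+ / (gamma - 1)}."""
    return lam * min(1.0, max(0.0, gamma - x / lam) / (gamma - 1.0))

def integrate(f, upper, steps=20000):
    """Midpoint rule on [0, upper]; used only as a numerical check."""
    h = upper / steps
    return h * sum(f((k + 0.5) * h) for k in range(steps))

for t in (0.3, 1.0, 2.5, 4.0):
    numeric = integrate(lambda x: mcp_derivative(x, 1.0, 3.0), t)
    print(t, mcp(t, 1.0, 3.0), numeric)
```

Both penalties become constant once $t > \gamma\lambda$, which is why large coefficients are not shrunk.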

Similarly, each smooth mean function $\mu_j(y)$ can be approximated by $\mu_j(y) \approx B(y)^T \boldsymbol{\alpha}_j$, for some unknown B-spline coefficients $\boldsymbol{\alpha}_j \in R^{J_n}$. We minimize
\begin{align*} S = \| \mathbf{Z}_j - B(y)^T \boldsymbol{\alpha}_j \|^2 \end{align*}

for $j = 1, \ldots, p$, where, in this display and the two that follow, $B(y)$ is understood as the $J_n \times n$ matrix with columns $B(y_1), \ldots, B(y_n)$. Note that the estimator of $\boldsymbol{\alpha}_j$ has an explicit solution, which is
\begin{align*} \widehat{\boldsymbol{\alpha}}_j = [ B(y) B(y)^T ]^{-1} B(y) \mathbf{Z}_j. \end{align*}

Therefore, given $\widehat{\boldsymbol{\alpha}}_j$, the estimated value of the mean function $\mu_j(\cdot)$ at a given point $y_0 \in [a, b]$ is given by
\begin{align*}
\hat{\mu}_j(y_0) &= B(y_0)^T \widehat{\boldsymbol{\alpha}}_j \\
&= B(y_0)^T [ B(y) B(y)^T ]^{-1} B(y) \mathbf{Z}_j. \tag{2.2}
\end{align*}
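The least-squares step in Equation (2.2) can be sketched in a few lines of Python. To keep the example short, it uses an order-2 (piecewise linear) B-spline basis built with `numpy.interp`, and it solves the least-squares problem with `numpy.linalg.lstsq` rather than forming the inverse explicitly. The basis order, knot placement, function names, and simulated data are all assumptions for illustration.

```python
import numpy as np

def linear_bspline_basis(y, knots):
    """Order-2 B-spline (hat function) basis on knots t_0 < ... < t_{N+1}.

    Returns an array of shape (len(y), N + 2), i.e. the matrix B(y)^T.
    """
    y = np.atleast_1d(np.asarray(y, dtype=float))
    eye = np.eye(len(knots))
    # Interpolating the k-th unit vector over the knots gives the k-th hat function.
    return np.column_stack([np.interp(y, knots, eye[k]) for k in range(len(knots))])

def fit_mean(y, z_j, knots):
    """Least-squares estimate of the spline coefficients alpha_j."""
    Bt = linear_bspline_basis(y, knots)
    alpha_hat, *_ = np.linalg.lstsq(Bt, z_j, rcond=None)
    return alpha_hat

def predict_mean(y0, alpha_hat, knots):
    """Evaluate mu_hat_j(y0) = B(y0)^T alpha_hat."""
    return linear_bspline_basis(y0, knots) @ alpha_hat

# Simulated example: a smooth mean function observed with noise.
rng = np.random.default_rng(1)
y = rng.uniform(0.0, 1.0, size=200)
z_j = np.sin(2 * np.pi * y) + 0.1 * rng.normal(size=200)
knots = np.linspace(0.0, 1.0, 6)
alpha_hat = fit_mean(y, z_j, knots)
mu_hat = predict_mean(np.array([0.25, 0.5]), alpha_hat, knots)
print(mu_hat)
```

When $B(y)B(y)^T$ is invertible, the `lstsq` solution coincides with the explicit formula for $\widehat{\boldsymbol{\alpha}}_j$; solving the least-squares problem directly is usually the more numerically stable choice.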

2.4. Implementation

We can apply the Newton–Raphson algorithm to obtain the minimizer of the objective function $Q(\boldsymbol{\eta}^{j})$ given in Equation (2.1). At each step, the concave penalty function can be locally approximated by a quadratic function (Fan and Li, 2001).
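For reference, the local quadratic approximation of Fan and Li (2001), written here for the group norm appearing in Equation (2.1), commonly takes the following form. This is a sketch of the standard approximation rather than a reproduction of the authors' display; the current iterate $\boldsymbol{\eta}_{j'}^{j(0)}$ and the derivative $p'_\lambda$ are notation introduced here for illustration.
\begin{align*}
p_\lambda ( \|\boldsymbol{\eta}_{j'}^{j}\| ) \approx p_\lambda ( \|\boldsymbol{\eta}_{j'}^{j(0)}\| ) + \frac{1}{2} \, \frac{p'_\lambda ( \|\boldsymbol{\eta}_{j'}^{j(0)}\| )}{\|\boldsymbol{\eta}_{j'}^{j(0)}\|} \, \big( \|\boldsymbol{\eta}_{j'}^{j}\|^2 - \|\boldsymbol{\eta}_{j'}^{j(0)}\|^2 \big), \qquad \|\boldsymbol{\eta}_{j'}^{j(0)}\| > 0.
\end{align*}
With this substitution the objective becomes quadratic in $\boldsymbol{\eta}^{j}$ at each step, so each update reduces to solving a ridge-type linear system.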

This quadratic approximation is inefficient for large-scale models. Thus, for high-dimensional data, we apply the group descent algorithm of Breheny and Huang (2015), which is computationally more efficient. The algorithm is presented as follows.

Group descent algorithm

1. For $j' \ne j$, let

2. Repeat Step 1 until $\mathbf{r}$ converges.

In this algorithm, for the SCAD penalty,
\begin{align*}
G(x, \lambda, \gamma) = \begin{cases}
ST(x, \lambda) & \text{if } |x| \le 2\lambda, \\
\dfrac{ST(x, \gamma\lambda / (\gamma - 1))}{1 - 1/(\gamma - 1)} & \text{if } 2\lambda < |x| \le \gamma\lambda, \\
x & \text{if } |x| > \gamma\lambda,
\end{cases}
\end{align*}

and for the MCP,
\begin{align*}
G(x, \lambda, \gamma) = \begin{cases}
\dfrac{ST(x, \lambda)}{1 - 1/\gamma} & \text{if } |x| \le \gamma\lambda, \\
x & \text{if } |x| > \gamma\lambda,
\end{cases}
\end{align*}

where $ST(\cdot, \cdot)$ is the soft-thresholding operator (Donoho and Johnstone, 1994).
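The following Python sketch shows one way a group descent iteration of this kind can be organized. It is written under several assumptions that are not stated in the text: the soft-thresholding operator is taken to be $ST(x, \lambda) = \mathrm{sign}(x)(|x| - \lambda)_+$, each group of columns is orthonormalized first so that the group update has a closed form (as in Breheny and Huang, 2015), the rule $G$ is applied to the norm of the group-wise quantity, and all function names are hypothetical. Note that after orthonormalization the penalty acts on the transformed coefficients of each group.

```python
import numpy as np

def soft_threshold(x, lam):
    """ST(x, lam) = sign(x) * max(|x| - lam, 0) for a scalar x (assumed form)."""
    return float(np.sign(x)) * max(abs(x) - lam, 0.0)

def G(x, lam, gamma, penalty="mcp"):
    """Scalar thresholding rule G(x, lambda, gamma) for MCP or SCAD."""
    if penalty == "mcp":
        if abs(x) <= gamma * lam:
            return soft_threshold(x, lam) / (1.0 - 1.0 / gamma)
        return x
    if abs(x) <= 2.0 * lam:  # SCAD branches
        return soft_threshold(x, lam)
    if abs(x) <= gamma * lam:
        return soft_threshold(x, gamma * lam / (gamma - 1.0)) / (1.0 - 1.0 / (gamma - 1.0))
    return x

def orthonormalize_groups(W, group_size):
    """Transform each group so that W_g^T W_g / n = I (assumed preprocessing)."""
    n = W.shape[0]
    W_orth, transforms = np.empty(W.shape), []
    for start in range(0, W.shape[1], group_size):
        Q, R = np.linalg.qr(W[:, start:start + group_size])
        W_orth[:, start:start + group_size] = np.sqrt(n) * Q
        transforms.append(R / np.sqrt(n))  # original block = orthonormal block @ transform
    return W_orth, transforms

def group_descent(W, z, group_size, lam, gamma=3.0, penalty="mcp", max_iter=500, tol=1e-8):
    n, n_groups = W.shape[0], W.shape[1] // group_size
    W_orth, transforms = orthonormalize_groups(W, group_size)
    theta = np.zeros(W.shape[1])
    r = np.asarray(z, dtype=float).copy()  # residual vector r
    for _ in range(max_iter):
        max_change = 0.0
        for g in range(n_groups):
            idx = slice(g * group_size, (g + 1) * group_size)
            u = W_orth[:, idx].T @ r / n + theta[idx]
            norm_u = np.linalg.norm(u)
            new = G(norm_u, lam, gamma, penalty) * u / norm_u if norm_u > 0 else np.zeros(group_size)
            r -= W_orth[:, idx] @ (new - theta[idx])  # keep the residual in sync
            max_change = max(max_change, float(np.max(np.abs(new - theta[idx]))))
            theta[idx] = new
        if max_change < tol:  # coefficients, and hence r, have stopped changing
            break
    # Map the coefficients back to the scale of the original columns of W.
    return np.concatenate([
        np.linalg.solve(transforms[g], theta[g * group_size:(g + 1) * group_size])
        for g in range(n_groups)
    ])

# Simulated example: 5 groups of size 3, of which groups 0 and 2 are active.
rng = np.random.default_rng(0)
n, group_size, n_groups = 400, 3, 5
W = rng.normal(size=(n, group_size * n_groups))
eta_true = np.zeros(group_size * n_groups)
eta_true[0:3] = [2.0, -1.5, 1.0]
eta_true[6:9] = [-1.0, 2.0, 1.5]
z = W @ eta_true + 0.1 * rng.normal(size=n)
eta_hat = group_descent(W, z, group_size, lam=0.6, gamma=3.0, penalty="mcp")
print(np.round(eta_hat, 2))
```

In the varying-coefficient setting each group would hold the $J_n$ spline coefficients of one function $\theta_{j'}^{j}(\cdot)$, so a group set to zero corresponds to an absent edge between taxa $j$ and $j'$.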

2.5. Simultaneous confidence band

In addition to obtaining the estimators, constructing simultaneous confidence bands for the coefficient functions is also very important, because it allows us to make inference about the shape of each function. In this work, we propose a bootstrap simultaneous confidence band to test whether coefficient functions are zero or not. Our approach is based on resampling from residual terms, where the confidence bands are constructed by using wild bootstrapped samples. In the following, we define $\hat{\epsilon}_{ij} = z_{ij} - \hat{z}_{ij}$ for a fixed $j$ and $1 \le i \le n$, where $\hat{z}_{ij}$ is the fitted value obtained from Equation (2.2) or from the procedure in Section 2.4. The regular residual bootstrap method requires that the error terms be independent and identically distributed, which rarely holds in practice. We thus apply the wild bootstrap method (Hardle and Mammen, 1993). The $b$th bootstrapped sample is $z_{ij,b} = \hat{z}_{ij} + v_{i,b} \hat{\epsilon}_{ij}$ for $1 \le i \le n$, where $\{v_{i,b}\}_{1 \le i \le n}$ are i.i.d. samples from the following two-point distribution:
\begin{align*}
v_{i,b} = \begin{cases}
(1 - \sqrt{5})/2 & \text{with probability } (5 + \sqrt{5})/10, \\
(1 + \sqrt{5})/2 & \text{with probability } (5 - \sqrt{5})/10.
\end{cases}
\end{align*}
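The two-point weights have mean zero and unit second and third moments, which is what lets the bootstrapped residuals mimic the first three moments of the original residuals. A minimal Python sketch of the resampling step is given below; the function names and the simulated fitted values are assumptions for illustration.

```python
import numpy as np

SQRT5 = np.sqrt(5.0)
V_VALUES = np.array([(1 - SQRT5) / 2, (1 + SQRT5) / 2])
V_PROBS = np.array([(5 + SQRT5) / 10, (5 - SQRT5) / 10])

def wild_weights(n, rng):
    """Draw n i.i.d. weights v_{i,b} from the two-point distribution."""
    return rng.choice(V_VALUES, size=n, p=V_PROBS)

def wild_bootstrap_sample(z_hat, resid, rng):
    """One bootstrapped response vector: z_hat + v * resid, elementwise."""
    return z_hat + wild_weights(len(resid), rng) * resid

# Moment check: E[v] = 0, E[v^2] = 1, E[v^3] = 1.
print(V_VALUES @ V_PROBS, (V_VALUES ** 2) @ V_PROBS, (V_VALUES ** 3) @ V_PROBS)

# Simulated fitted values and residuals for one taxon (illustrative only).
rng = np.random.default_rng(2)
z_hat = np.linspace(0.0, 1.0, 100)
resid = 0.2 * rng.normal(size=100)
z_boot = wild_bootstrap_sample(z_hat, resid, rng)
```

Because each residual is multiplied by its own weight rather than being reshuffled across observations, the scheme tolerates error variances that differ across samples.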

Repeat the estimation procedure $N_b$ times. Note that $N_b$ is usually set to be a large integer. In simulations, we let $N_b = 1000$. We obtain the confidence band for $\theta_{j'}^{j}(y)$ as follows: we first find the lower $100(\alpha/2)\%$ and upper $100(1 - \alpha/2)\%$ quantiles of the bootstrapped $\hat{\theta}_{j'}^{j}(y)$, say $L_j^{j'}(y)$ and $U_j^{j'}(y)$, then denote the wild bootstrap $(1 - \alpha)100\%$ confidence band for $\theta_{j'}^{j}(y)$ as $[L^*(y), U^*(y)]$, and define
\begin{align*}
L^*(y) &= \hat{\theta}_{j'}^{j}(y) + \big( L_j^{j'}(y) - \hat{\theta}_{j'}^{j}(y) \big) M(\alpha), \\
U^*(y) &= \hat{\theta}_{j'}^{j}(y) + \big( U_j^{j'}(y) - \hat{\theta}_{j'}^{j}(y) \big) M(\alpha),
\end{align*}

where $M(\alpha) = z_{1 - \alpha/2}^{-1} \sqrt{\chi_{2, 1 - \alpha/J_n}^2}$ is the inflation factor from pointwise confidence intervals to uniform confidence bands given by Yang (2008), in which $z_{1 - \alpha/2}$ is the $(1 - \alpha/2)$-quantile of the standard normal distribution and $\chi_{t, \tau}^2$ is the $\tau$-quantile of the chi-square distribution with $t$ degrees of freedom. We construct the confidence band for $\mu_j(y)$ in a similar way.
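The band construction can be sketched in Python as follows. The sketch relies on the fact that the $\tau$-quantile of the chi-square distribution with 2 degrees of freedom is $-2\log(1 - \tau)$, so no special-function library is needed. The function names, the grid of $y$ values, and the simulated bootstrap estimates are assumptions for illustration.

```python
import math
from statistics import NormalDist

import numpy as np

def inflation_factor(alpha, J_n):
    """M(alpha) = sqrt(chi2_{2, 1 - alpha / J_n}) / z_{1 - alpha / 2}."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    chi2_quantile = -2.0 * math.log(alpha / J_n)  # closed form for 2 d.f.
    return math.sqrt(chi2_quantile) / z

def simultaneous_band(theta_hat, theta_boot, alpha, J_n):
    """Inflate pointwise bootstrap quantiles into a simultaneous band.

    theta_hat:  estimate on a grid of y values, shape (m,)
    theta_boot: bootstrap estimates on the same grid, shape (N_b, m)
    """
    lower = np.quantile(theta_boot, alpha / 2.0, axis=0)
    upper = np.quantile(theta_boot, 1.0 - alpha / 2.0, axis=0)
    M = inflation_factor(alpha, J_n)
    return theta_hat + (lower - theta_hat) * M, theta_hat + (upper - theta_hat) * M

# Simulated example with N_b = 1000 bootstrap curves on a grid of 20 points.
rng = np.random.default_rng(4)
grid = np.linspace(0.0, 1.0, 20)
theta_hat = np.sin(2 * np.pi * grid)
theta_boot = theta_hat + 0.1 * rng.normal(size=(1000, 20))
L_star, U_star = simultaneous_band(theta_hat, theta_boot, alpha=0.05, J_n=7)
print(round(inflation_factor(0.05, 7), 3))
```

A coefficient function would be judged nonzero when the band $[L^*(y), U^*(y)]$ excludes zero for some $y$ on the grid.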
In addition, we rank taxa by the $p$ value to see whether the abundance of each taxon is unchanged under different conditions. Let $\hat\sigma_{j,b}(y)$ be the estimated standard deviation from the bootstrap. Define the test statistic as $T_n = \sup_y \left| \{\hat\mu_j(y) - \bar z_j\} / \hat\sigma_{j,b}(y) \right|$ to test $H_0: \mu_j(y) = 0$ versus $H_a: \mu_j(y) \ne 0$, where $\bar z_j = \sum_i z_{ji}/n$. Then the $p$ value $p_v$ is obtained by solving the equation $T_n = \sqrt{\chi^2_{2,\,1 - p_v/J_n}}$. Hence, $p_v = J_n (1 - F_{\chi_2^2}(T_n^2))$, where $F_{\chi_2^2}(\cdot)$ is the cumulative distribution function (cdf) of the $\chi_2^2$ distribution. A small $p$ value implies that the $j$th taxon is significantly differentiated under different conditions (or different levels of phenotype).
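Because the $\chi_2^2$ cdf has the closed form $F(x) = 1 - e^{-x/2}$ for $x \ge 0$, the $p$ value formula above reduces to $p_v = J_n \exp(-T_n^2/2)$. The following Python sketch evaluates this expression and ranks taxa by it. The function names are hypothetical, and $T_n$ and $J_n$ are assumed to have been computed beforehand.

```python
import math


def band_p_value(t_n, j_n):
    """Return p_v = J_n * (1 - F(T_n^2)), with F the chi-square(2) cdf."""
    if t_n < 0:
        raise ValueError("T_n is a supremum of absolute values, so it must be >= 0")
    # Chi-square with 2 degrees of freedom: F(x) = 1 - exp(-x / 2).
    return j_n * math.exp(-(t_n ** 2) / 2.0)


def rank_taxa(t_stats, j_n):
    """Return taxon indices ordered from smallest to largest p value.

    Assumption: t_stats[j] holds the statistic T_n of taxon j.
    """
    p_values = [band_p_value(t, j_n) for t in t_stats]
    return sorted(range(len(t_stats)), key=lambda j: p_values[j])
```

The formula can return values above 1 when $T_n$ is small. Whether to cap such values at 1 is not specified in the text, so the sketch leaves them uncapped.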

3. Results

We compare the finite-sample performance of the constant coefficient model and the varying coefficient model (vcm) using simulated data, and apply our method to the analysis of a metagenomic count data set. We also compare the proposed vcm method with the glasso method and the space method (space) under both the constant coefficient model and the vcm. In general, glasso estimates the inverse covariance matrix, whereas space performs only neighborhood selection. Both are designed for estimating constant coefficient models. We compare our vcm with these two methods because they provide good estimates and are widely used for finding graphical structure. Let $\Theta$ be the inverse covariance matrix in the Gaussian graphical model. The glasso (Yuan and Lin, 2007) is defined as
\begin{align*} \hat\Theta_{glasso} = \arg\min_{\Theta \succ 0} \{ -\log|\Theta| + \mathrm{tr}(S_n \Theta) + \lambda \|\Theta\|_1 \}, \end{align*}

3.1. Simulated data

To study the performance of our method, we simulated data from four different constant coefficient models and one vcm. We obtain the estimators for each model with $p = 50$ and $n = 200, 300, 400$. For simulated data, we focus on the estimation of $\Sigma(y)$, since a good estimate of the mean function is easy to obtain through the spline method [Eq. (2.2)]. Thus, we set the mean function $\mu(y)$ to be 0 for the constant coefficient models and a sine function for the vcm.
For each case, we generate random samples $y_1, \ldots, y_n$. Then, we sample $X_1, \ldots, X_n$ from $N(\mu(y_1), \Sigma(y_1)), \ldots, N(\mu(y_n), \Sigma(y_n))$ and obtain the normalized data $Z_1, \ldots, Z_n$.
We use the four measures of performance shown in Table 1 to compare the estimates of $\Sigma(y_i)$ with the true values: the average false positive (FP) number, average false negative (FN) number, average false positive rate (FPR), and average false negative rate (FNR), where FPR = FP/N and FNR = FN/P. We call them average error rates since they are computed by averaging the error rates across different levels of $y$. We run each model 100 times and report the average of these metrics. Note that in our setting “Non-null” means two taxa are correlated under a given condition.
For example, a nonzero $\hat\Sigma(y_t)_{ij}$ implies that taxon $i$ and taxon $j$ are correlated when the phenotype is at level $y_t$.

Table 1.

Measures of Performance for Neighborhood Selection

Truth	Predicted null	Predicted non-null	Total
Null	TN	FP	N
Non-null	FN	TP	P

FN, false negative; FP, false positive; TN, true negative; TP, true positive.
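The counts in Table 1 can be computed from a true and an estimated adjacency matrix as in the following Python sketch. It is a minimal illustration, not the authors' code: the function name is hypothetical, and counting each unordered pair $(i, j)$ with $i < j$ once is an assumption.

```python
def selection_errors(true_adj, est_adj):
    """Return FP, FN, FPR = FP/N and FNR = FN/P for one estimated graph.

    Assumption: both inputs are symmetric p x p matrices (lists of lists);
    only the upper triangle (i < j) is counted, and a nonzero entry is an edge.
    """
    p = len(true_adj)
    tp = fp = tn = fn = 0
    for i in range(p):
        for j in range(i + 1, p):
            truth = true_adj[i][j] != 0
            pred = est_adj[i][j] != 0
            if truth and pred:
                tp += 1
            elif truth:
                fn += 1
            elif pred:
                fp += 1
            else:
                tn += 1
    n_null = tn + fp      # N in Table 1
    n_nonnull = fn + tp   # P in Table 1
    fpr = fp / n_null if n_null else 0.0
    fnr = fn / n_nonnull if n_nonnull else 0.0
    return {"FP": fp, "FN": fn, "FPR": fpr, "FNR": fnr}
```

Averaging these values over the levels of $y$ and over the simulation replicates would give the average error rates described above.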

To complete the simulations, the covariance matrix or network structure for each model is computed from the desired structure of the adjacency matrix. Denote by $A$ the adjacency matrix of a graph. We consider three different structures of $A$, described as follows:

(a) An AR (autoregressive) model: $A$ is a band matrix with bandwidth $h$, that is, $A_{ij} = 0$ for $|i - j| > h$.

(b) The scale-free network: $A$ is generated by the Barabási-Albert model. The graph begins with an initial connected network of two nodes. New nodes are added to the network one at a time. Each new node is connected to one existing node, chosen with probability proportional to the degree of that existing node.

(c) The random matrix model: Each pair of off-diagonal elements of $A$ is randomly set to 1 with probability $\pi$. That is, $A_{ij} = A_{ji} = 1$ with probability $\pi$ and 0 otherwise.
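The three structures above can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the generator used in the simulations: the function names are hypothetical, and the band matrix fills every off-diagonal entry inside the band with 1, whereas the description only specifies the zeros outside the band.

```python
import random


def band_adjacency(p, h):
    """(a) Band structure: A_ij = 0 whenever |i - j| > h (assumed 1 inside the band)."""
    return [[1 if 0 < abs(i - j) <= h else 0 for j in range(p)] for i in range(p)]


def scale_free_adjacency(p, rng):
    """(b) Barabasi-Albert growth: start from two connected nodes (requires p >= 2);
    each new node attaches to one existing node drawn proportionally to its degree."""
    adj = [[0] * p for _ in range(p)]
    adj[0][1] = adj[1][0] = 1
    degree = [0] * p
    degree[0] = degree[1] = 1
    for new in range(2, p):
        target = rng.choices(range(new), weights=degree[:new], k=1)[0]
        adj[new][target] = adj[target][new] = 1
        degree[new] += 1
        degree[target] += 1
    return adj


def random_adjacency(p, prob, rng):
    """(c) Random structure: A_ij = A_ji = 1 with probability prob, else 0."""
    adj = [[0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1, p):
            if rng.random() < prob:
                adj[i][j] = adj[j][i] = 1
    return adj
```

Passing an explicit `random.Random(seed)` instance keeps the generated graphs reproducible across runs.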

We use $AR(1)$, $AR(2)$, the scale-free network, and the random matrix model for the constant coefficient model, and the $AR(1)$ model for the vcm. The true graph models are depicted in Figure 1. The adjacency matrices of the scale-free network and the random matrix model can be generated randomly with the huge R package.

FIG. 1.

True graphical structures used in the simulations. (a) AR(1) model with bandwidth 1; (b) AR(2) model with bandwidth 2; (c) scale-free network; (d) random matrix model.

We choose SCAD as our penalty function and let $\gamma = 3.7$, as suggested in Fan and Li (2001). For the three estimation methods, we also have to select the optimal tuning parameters, namely $\lambda$, $\lambda_{glasso}$, and $\lambda_{space}$. We select the optimal value of the parameter for our method and for glasso by the Bayesian information criterion (BIC).
Define \begin{align*} BIC(\lambda) = n \log(SSE_\lambda / n) + \widehat{df}_\lambda \cdot \log(n) \cdot C_n, \end{align*}

Define \begin{align*} BIC(\lambda_{glasso}) = -n \log|\hat\Theta| + n \cdot \mathrm{trace}(\hat\Theta S_n) + k \log(n), \end{align*}
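Once their ingredients are available, the two criteria above can be evaluated as in the following Python sketch. The quantities $SSE_\lambda$, $\widehat{df}_\lambda$, $C_n$, $\log|\hat\Theta|$, $\mathrm{trace}(\hat\Theta S_n)$, and $k$ are treated as precomputed inputs, and the function names, including the helper `select_by_bic`, are hypothetical.

```python
import math


def bic_vcm(n, sse, df, c_n):
    """BIC(lambda) = n * log(SSE / n) + df * log(n) * C_n.

    Assumption: sse, df and c_n are supplied by the caller for a given lambda.
    """
    return n * math.log(sse / n) + df * math.log(n) * c_n


def bic_glasso(n, log_det_theta, trace_theta_s, k):
    """BIC(lambda_glasso) = -n * log|Theta| + n * trace(Theta S_n) + k * log(n)."""
    return -n * log_det_theta + n * trace_theta_s + k * math.log(n)


def select_by_bic(candidates):
    """Return the tuning value with the smallest BIC.

    candidates: iterable of (tuning_value, bic_value) pairs.
    """
    return min(candidates, key=lambda pair: pair[1])[0]
```

A typical use would evaluate the criterion on a grid of tuning values and keep the minimizer, for example `select_by_bic([(lam, bic_vcm(n, sse[lam], df[lam], c_n)) for lam in grid])`, where `grid`, `sse`, and `df` are placeholders.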

3.1.1. Constant coefficient model

In constant coefficient models, the covariance matrix does not depend on $y$, that is, $\Sigma(y) = \Sigma$, and we let $\mu(y_i) = 0$, where the $y_i$ are uniformly distributed in $[0, 1]$ for $i = 1, \ldots, n$.
We use the $AR(1)$ model, the $AR(2)$ model, the scale-free network, and the random matrix model to examine our method.

Example 1. The AR Model.

We consider two different inverse covariance matrices. The first one is an $AR(1)$ model, and the corresponding inverse covariance matrix is defined as $(\Sigma^{-1})_{ij} = -0.4$ for $|i - j| = 1$; 1 for $i = j$; and 0 otherwise. Note that this inverse covariance matrix is a band matrix with bandwidth 1.
The second model is an $AR(2)$ model with a band matrix that has bandwidth 2, that is, $(\Sigma^{-1})_{ij} = -0.4$ for $|i - j| = 2$; 1 for $i = j$; and 0 otherwise. Table 2 shows the performance of the three methods. For the two $AR$ models, the number of false negatives is zero in all cases for the three methods.
This indicates that all three methods can successfully find the true nonzero components in this simple setting. The space method has fewer false positives than the other two methods, possibly because it enforces more sparsity in this case. Our method produces fewer false positives than glasso for these two models.
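The two inverse covariance matrices of this example can be written down directly, as in the Python sketch below. The function names are hypothetical. The diagonal-dominance check is an added sanity test, not part of the original procedure: a symmetric matrix with positive diagonal that is strictly diagonally dominant is positive definite.

```python
def ar_precision(p, lag, off_value=-0.4):
    """Inverse covariance with 1 on the diagonal and off_value where |i - j| == lag.

    lag=1 gives the AR(1) matrix and lag=2 the AR(2) matrix as stated in the text.
    """
    return [
        [1.0 if i == j else (off_value if abs(i - j) == lag else 0.0) for j in range(p)]
        for i in range(p)
    ]


def is_strictly_diagonally_dominant(matrix):
    """True if every |diagonal entry| exceeds the sum of |off-diagonal entries| in its row."""
    return all(
        abs(row[i]) > sum(abs(v) for k, v in enumerate(row) if k != i)
        for i, row in enumerate(matrix)
    )
```

With $p = 50$, each row has at most two off-diagonal entries of magnitude 0.4, so the off-diagonal row sums are at most 0.8 and both matrices pass the check.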

Table 2.

The Comparison Between the Proposed Method and Two Other Methods for Four Constant Coefficient Models and a Varying Coefficient Model

		AR(1) model				AR(2) model				Scale-free network				Random model				Varying coefficient model
Sample size	Method	FP	FN	FPR	FNR	FP	FN	FPR	FNR	FP	FN	FPR	FNR	FP	FN	FPR	FNR	FP	FN	FPR	FNR
n = 200	vcm	16	0	0.0134	0.0000	16	0	0.0137	0.0000	7	14	0.0055	0.1892	11	0	0.0090	0.0000	13	4	0.0111	0.0528
	space	0	0	0.0000	0.0000	2	0	0.0017	0.0000	12	6	0.0102	0.0810	13	0	0.0111	0.0000	27	49	0.0229	0.6622
	glasso	28	0	0.0238	0.0000	17	0	0.0144	0.0000	13	6	0.0116	0.0540	5	0	0.0048	0.0000	6	49	0.0051	0.6622
n = 300	vcm	7	0	0.0058	0.0000	7	0	0.0062	0.0000	11	7	0.0094	0.0946	5	0	0.0045	0.0000	14	4	0.0117	0.0539
	space	3	0	0.0025	0.0000	2	0	0.0017	0.0000	7	3	0.0059	0.0405	4	0	0.0033	0.0000	19	49	0.0161	0.6622
	glasso	12	0	0.0102	0.0000	15	0	0.0127	0.0000	2	7	0.0017	0.0945	0	0	0.0000	0.0000	7	49	0.0059	0.6622
n = 400	vcm	4	0	0.0033	0.0000	4	0	0.0033	0.0000	12	3	0.0098	0.0405	3	0	0.0026	0.0000	13	4	0.0111	0.0578
	space	0	0	0.0000	0.0000	0	0	0.0000	0.0000	20	3	0.0170	0.0405	14	0	0.0115	0.0000	33	48	0.0281	0.6486
	glasso	15	0	0.0127	0.0000	8	0	0.0068	0.0000	3	14	0.0025	0.1892	2	0	0.0016	0.0000	12	49	0.0102	0.6622

The simulations are based on 100 iterations. Four measures are reported, where FP and FN are rounded to whole numbers.

AR, autoregressive; FNR, false negative rate; FPR, false positive rate.

Example 2. The Scale-Free Network.

In this simulation setting, we consider one version of the inverse covariance matrix, randomly generated by the R function huge.generator in the huge library. The results are shown in Table 2. In general, our method has larger FP numbers than space but smaller than glasso. There is no large difference among the three methods in FN numbers.

Example 3. The Random Matrix Model.

For this model we still have one matrix. For the sparse inverse covariance matrix, we set the probability $\pi$ to $3 / p = 0.03$. That is, each off-diagonal entry of $\Sigma^{-1}$ is 0.3 with probability $\pi = 0.03$ and 0 with probability 0.97. In Table 2, we again observe that the false negatives are well controlled by the three methods. Our method performs similarly to space. In this example, glasso is the best method since it has the smallest FP number. These three examples illustrate that when the coefficients do not vary with the covariate, our proposed vcm method performs well.

3.1.2. Varying coefficient model

There are many possible structures for the vcm, here we just use a varying coefficient \documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document} $$AR ( 1 )$$ \end{document} model for illustration purposes. We construct \documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document} $${ \sum \nolimits^{ - { \bf{1}}}} ( y )$$ \end{document} by the following scheme. First, generate a band matrix \documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document} $$A$$ \end{document} with bandwidth 1. The diagonal elements are 0 and nonzero coefficients are \documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document} $$20 \sin ( y )$$ \end{document} . 
Second, let \documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document} $${ \sum \nolimits^{ - { \bf{1}}}} ( y ) = A / p + {I_p}.$$ \end{document} When \documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document} $$p$$ \end{document} is large, \documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document} $${ \sum \nolimits^{ - { \bf{1}}}} ( y )$$ \end{document} is a positive definite band matrix with bandwidth 1, where the diagonal elements are 1. 
Note that in this model, the nonzero elements of \documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document} $${ \sum \nolimits^{ - { \bf{1}}}} ( y )$$ \end{document} is \documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document} $$c \cdot \sin ( y )$$ \end{document} , where \documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document} $$c$$ \end{document} is a fixed constant. In our setting, \documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document} $$c = 0.4$$ \end{document} . So, the estimated coefficients should form a sine curve with respect to \documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document} $$y$$ \end{document} . 
We use the wild bootstrap to obtain the confidence bands, where we set $N_b = 1000$. We let $\mu(y) = [\sin(y), \sin(y), \sin(y), \sin(y), 0, \ldots, 0]^{\prime}$, which means that the first four features are differentiated. In this example, $y_1, \ldots, y_n$ are uniformly distributed on $[0, 2\pi]$.
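
The simulation design above can be sketched in a few lines of NumPy. This is only an illustrative reading of the setup, not the authors' code: it assumes the band structure means a tridiagonal precision matrix with unit diagonal and $c \cdot \sin(y)$ on the first off-diagonals, and the sample size `n`, dimension `p`, seed, and all function names are hypothetical choices.

```python
import numpy as np

def precision_matrix(y, p, c=0.4):
    """Tridiagonal precision matrix: 1 on the diagonal, c*sin(y) next to it."""
    omega = np.eye(p)
    idx = np.arange(p - 1)
    omega[idx, idx + 1] = c * np.sin(y)
    omega[idx + 1, idx] = c * np.sin(y)
    return omega

def mean_vector(y, p, n_active=4):
    """Mean vector whose first n_active entries equal sin(y); the rest are 0."""
    mu = np.zeros(p)
    mu[:n_active] = np.sin(y)
    return mu

def simulate(n, p, c=0.4, seed=0):
    """Draw y_i ~ Uniform[0, 2*pi] and z_i ~ N(mu(y_i), Sigma(y_i))."""
    rng = np.random.default_rng(seed)
    y = rng.uniform(0.0, 2.0 * np.pi, size=n)
    z = np.empty((n, p))
    for i in range(n):
        omega = precision_matrix(y[i], p, c)
        # omega = L L^T, so L^{-T} eps has covariance omega^{-1} = Sigma(y_i).
        # The factorization fails if omega is not positive definite.
        chol = np.linalg.cholesky(omega)
        eps = rng.standard_normal(p)
        z[i] = mean_vector(y[i], p) + np.linalg.solve(chol.T, eps)
    return y, z

# n and p are illustrative values, not taken from the paper.
y, z = simulate(n=100, p=20)
```

Because $|c \cdot \sin(y)| \le 0.4$, each row of the tridiagonal matrix is strictly diagonally dominant, which is why the Cholesky factorization succeeds for every $y$ in this sketch.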

The results are given in Table 2. We observe that the space and glasso methods cannot correctly identify the true nonzero elements when the coefficients are functions of the covariate. The number of false negatives is around 49, which is close to the number of true positives. The proposed vcm method has false negatives close or equal to zero, which indicates that it can identify the nonzero elements correctly. The number of false positives for vcm is larger than for glasso but smaller than for space. We further conduct inference for the nonzero coefficient functions. In Supplementary Figure S1(a) of the Supplementary Material, we plot the estimates and the corresponding confidence bands for 25 nonzero coefficient functions. In most cases, the estimates fit the true curves well and the true coefficient curve is covered by the confidence bands. Supplementary Figure S1(b) depicts the estimates and the confidence bands for the means $\mu(y)$. In the true model, the first four components are nonzero functions and the others are zero. The estimates and the confidence bands capture these features well.
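
The false positive and false negative counts reported for each method can be computed from an estimated and a true coefficient matrix. The following is a minimal sketch under stated assumptions: each edge is summarized by a single number (for a varying coefficient this could be, for example, the largest absolute value of the estimated function), the threshold `tol` is a hypothetical choice, and the matrices below are toy values rather than results from the paper.

```python
import numpy as np

def edge_errors(est, truth, tol=1e-8):
    """Count false positive and false negative edges above the diagonal."""
    est_edges = np.abs(np.asarray(est, dtype=float)) > tol
    true_edges = np.abs(np.asarray(truth, dtype=float)) > tol
    # Each undirected edge is counted once, using the strict upper triangle.
    upper = np.triu(np.ones(true_edges.shape, dtype=bool), k=1)
    false_pos = int(np.sum(est_edges & ~true_edges & upper))
    false_neg = int(np.sum(~est_edges & true_edges & upper))
    return false_pos, false_neg

# Toy truth: a band matrix with bandwidth 1 on five nodes (four true edges).
truth = np.eye(5)
for i in range(4):
    truth[i, i + 1] = truth[i + 1, i] = 0.4

# Toy estimate: two true edges recovered, plus one spurious edge (0, 4).
est = np.eye(5)
est[0, 1] = est[1, 0] = 0.3
est[1, 2] = est[2, 1] = 0.2
est[0, 4] = est[4, 0] = 0.1

fp, fn = edge_errors(est, truth)
print(fp, fn)  # 1 false positive, 2 false negatives
```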

3.1.3. Summary

Overall, the performance of our method depends on the model setting. For constant coefficient models, the space or glasso method performs slightly better than our method, because these two methods fit the true model when the coefficients are constant. Under the varying coefficient model (vcm), the advantage of our method becomes significant, in particular for identifying the edges in the graph, which is important in network construction when the network differs across phenotype levels.

3.2. Real data

In this section, we apply our method to the metagenomic count data. This data set includes 377 features and 310 samples. We use body mass index (BMI) as our phenotype variable, $y$. We discard genera with an average read count of less than 5 and remove three outliers based on their BMI values. The remaining data have 46 highly abundant genera and 307 samples. The data were normalized with the proportion and arcsin transformation, so the input matrix $Z$ is a normalized taxa matrix following a multivariate Gaussian distribution. We use the SCAD penalty as our penalty function with $\gamma = 3.7$, and we select the optimal tuning parameter $\lambda$ and the interior knots by using the BIC.
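
The filtering and normalization step can be illustrated with a short sketch. Several details are assumptions rather than statements from the text: the transformation is taken to be the common arcsine square-root form $\arcsin(\sqrt{\text{proportion}})$, the proportions are computed over the retained genera only, and the function name and the small count matrix are hypothetical.

```python
import numpy as np

def preprocess_counts(counts, min_mean_reads=5.0):
    """Filter low-abundance genera, then apply proportion and arcsine
    square-root transforms to a samples-by-genera count matrix."""
    counts = np.asarray(counts, dtype=float)
    # Keep genera whose average read count is at least the threshold.
    keep = counts.mean(axis=0) >= min_mean_reads
    kept = counts[:, keep]
    totals = kept.sum(axis=1, keepdims=True)
    if np.any(totals <= 0):
        raise ValueError("every sample needs reads in the retained genera")
    proportions = kept / totals
    # Assumed form of the arcsin transformation (variance stabilizing).
    z = np.arcsin(np.sqrt(proportions))
    return z, keep

# Toy counts: two samples (rows) and three genera (columns).
toy_counts = [[10, 0, 30],
              [20, 2, 40]]
z, keep = preprocess_counts(toy_counts)
print(keep)     # the middle genus has mean 1 < 5 and is dropped
print(z.shape)  # (2, 2)
```

Outlier removal based on BMI is not shown, because the rule used to flag the three outliers is not given in this section.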
The common network is shown in Figure 4. We not only want to determine how the correlations change across clinical conditions and to construct the network structures, but are also interested in identifying which features change with respect to BMI. For each $j$, we use the wild bootstrap procedure to construct a simultaneous confidence band for $\mu_j(y)$. Some estimates of $\mu_j(y)$ are shown in Figure 2. We rank the genera by their $p$ values.
The $p$ value is calculated according to the method described in Section 2.5. A small $p$ value means that $\mu_j(y)$ is actually far away from its center $\bar z_j$, which implies a strong association with the phenotype variable $y$ (BMI). The results are shown in Table 3.
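
To make the idea of a wild bootstrap simultaneous band concrete, here is a generic, simplified sketch. It is not the procedure of Section 2.5: it assumes an unpenalized least-squares fit on a cubic truncated power basis, Rademacher multipliers, and a constant-width band obtained from the bootstrap quantile of the largest absolute deviation over the observed $y$ values. The function names, the knots, and the simulated data are all hypothetical.

```python
import numpy as np

def spline_basis(y, knots):
    """Cubic truncated power basis: 1, y, y^2, y^3, and (y - k)_+^3 per knot."""
    y = np.asarray(y, dtype=float)
    cols = [np.ones_like(y), y, y ** 2, y ** 3]
    cols += [np.clip(y - k, 0.0, None) ** 3 for k in knots]
    return np.column_stack(cols)

def wild_bootstrap_band(y, z, knots, n_boot=1000, level=0.95, seed=0):
    """Return the fitted curve and a simultaneous band at the observed y."""
    rng = np.random.default_rng(seed)
    z = np.asarray(z, dtype=float)
    basis = spline_basis(y, knots)
    coef, *_ = np.linalg.lstsq(basis, z, rcond=None)
    fit = basis @ coef
    resid = z - fit
    sup_dev = np.empty(n_boot)
    for b in range(n_boot):
        # Wild bootstrap: flip the sign of each residual independently.
        signs = rng.choice([-1.0, 1.0], size=z.shape[0])
        z_star = fit + resid * signs
        coef_star, *_ = np.linalg.lstsq(basis, z_star, rcond=None)
        sup_dev[b] = np.max(np.abs(basis @ coef_star - fit))
    half_width = np.quantile(sup_dev, level)
    return fit, fit - half_width, fit + half_width

# Simulated example: a sine-shaped mean observed with Gaussian noise.
rng = np.random.default_rng(1)
y_obs = np.linspace(0.0, 2.0 * np.pi, 200)
z_obs = np.sin(y_obs) + 0.1 * rng.standard_normal(200)
fit, lower, upper = wild_bootstrap_band(y_obs, z_obs, knots=[2.0, 4.0], n_boot=200)
```

Using the largest deviation over all $y$ values, rather than a pointwise quantile, is what makes the band simultaneous: the whole bootstrap curve must stay inside it in the chosen fraction of replicates.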

FIG. 2.

Networks constructed across clinical conditions for the real data.

Table 3.

$p$ Value for 46 Taxa. Small $p$ Value Implies $\mu_j(y)$ is Changed Under Different Conditions

p Value for 46 taxa
Collinsella	Faecalibacterium	Anaerotruncus	Prevotella	Sutterella	Acetivibrio	Acidaminococcus
0.001557951	0.007524841	0.016318952	0.024868580	0.043738394	0.050680600	0.057123471
Candidatus Phytoplasma	Parabacteroides	Lachnobacterium	Lachnospira	Roseburia	Oscillospira	Coprococcus
0.063555498	0.069381527	0.070691111	0.073462994	0.076860826	0.077038225	0.081355583
Turicibacter	Anaerofustis	Clostridium	Lactobacillus	Aphanothece	Subdoligranulum	Incertae sedis
0.089880754	0.109993814	0.127659304	0.152920700	0.159558940	0.166045373	0.167860372
Eubacterium	Odoribacter	Mitsuokella	Bacteroides	Butyrivibrio	Alistipes	J2.29
0.181689982	0.262140071	0.265265557	0.278605901	0.286860849	0.329254537	0.330960027
Dialister	Streptococcus	Ruminococcus	Dorea	Treponema	Adlercreutzia	Catenibacterium
0.335849803	0.341022784	0.341038181	0.353564086	0.367057596	0.376976322	0.390724080
Fusobacterium	Blautia	Succinivibrio	Phascolarctobacterium	Megamonas	Peptococcus	Escherichia
0.397224083	0.475954833	0.516351591	0.581397375	0.601874573	0.607808991	0.610656122
Pectinatus	Tenacibaculum	Paracoccus	Desulfovibrio
0.629212706	0.657244479	0.692471937	0.778485521

As shown in Table 3, five genera have a $p$ value less than 0.05: Collinsella, Faecalibacterium, Anaerotruncus, Prevotella, and Sutterella. This indicates statistically significant associations between BMI (obesity) and the relative abundances of these five genera. Among the five genera, Faecalibacterium and Prevotella are known to be associated with obesity and other diseases (Underwood, 2014; Brahe et al., 2015b). Even though no linear association between BMI and taxa at the genus level was identified in the original study, BMI was shown to be associated with both Faecalibacterium prausnitzii and Anaerotruncus colihominis at the species level (Zupancic et al., 2012), suggesting that there are nonlinear associations between obesity and the five identified genera. The plots of the estimates show how the abundance of each taxon changes with respect to BMI. For instance, the first plot in the second row of Figure 2 shows the nonlinear relationship between BMI and the abundance of Anaerotruncus. The abundance changes rapidly when BMI is outside the normal range, implying that Anaerotruncus might be associated with obesity.

In addition, the coabundance network in Figure 4 indicates the interactions among taxa. Moreover, we are able to see the correlations between taxa as functions of BMI. Figure 3 shows that most of the partial correlations on the network vary nonlinearly with BMI. Acidaminococcus and Dialister, Catenibacterium and Dialister, Blautia and Dorea, Collinsella and Dorea, and Blautia and Coprococcus are correlated when BMI is too high or too low, indicating that BMI is associated with the gain or loss of the correlations between these taxa. However, Aphanothece and Desulfovibrio are correlated but may not be associated with BMI (in this network, the connected taxa are correlated regardless of BMI). Of the 46 taxa on the coabundance network, 18 have a node degree of 6 or more. Those taxa include Desulfovibrio (6), Dialister (6), Dorea (6), Faecalibacterium (6), Lachnospira (6), Lactobacillus (6), Paracoccus (6), Ruminococcus (6), Blautia (7), Collinsella (7), Odoribacter (7), Oscillospira (7), Aphanothece (8), Eubacterium (8), Incertae sedis (8), Subdoligranulum (8), Sutterella (8), and Clostridium (9).
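
Node degrees such as those listed above are obtained by counting the edges attached to each taxon in the estimated network. The sketch below shows the computation on a toy adjacency matrix; the node labels, the edges, and the function name are hypothetical and do not come from the real data.

```python
import numpy as np

def node_degrees(adjacency, names):
    """Degree of each node in an undirected network given as a 0/1 matrix."""
    adj = np.asarray(adjacency, dtype=bool)
    # Treat the network as undirected and ignore self-loops.
    adj = adj | adj.T
    np.fill_diagonal(adj, False)
    degrees = adj.sum(axis=1)
    return {name: int(d) for name, d in zip(names, degrees)}

# Toy network on four nodes with edges A-B, A-C, A-D, and B-C.
names = ["A", "B", "C", "D"]
adjacency = np.zeros((4, 4), dtype=int)
for i, j in [(0, 1), (0, 2), (0, 3), (1, 2)]:
    adjacency[i, j] = adjacency[j, i] = 1

degrees = node_degrees(adjacency, names)
hubs = [name for name, d in degrees.items() if d >= 2]
print(degrees)  # {'A': 3, 'B': 2, 'C': 2, 'D': 1}
print(hubs)     # ['A', 'B', 'C']
```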

FIG. 3.

Plot of $\hat \mu_1(y), \cdots, \hat \mu_{10}(y)$ (solid line) and the corresponding confidence band (dashed line). The $x$ and $y$ axes represent BMI and transformed relative abundance, respectively. Complete plots are shown in Supplementary Material.

FIG. 4.

Plot of 10 estimates of $\hat \theta(y)$ [solid line, $x$ axis: BMI, $y$ axis: estimate of $\hat \theta(y)$] and the corresponding confidence bands (dashed line). More plots are shown in Supplementary Figure S2 of the Supplementary Material.

The taxon with the highest degree (9) is Clostridium. Clostridium, together with Ruminococcus, Blautia, Eubacterium, and six other taxa directly connected to them, may function together as a subnetwork. These taxa have been reported to be associated with obesity in different studies (Kasai et al., 2015; Brahe et al., 2015a). The results from our network analysis provide additional information about how these taxa function together. Moreover, one of the BMI-associated taxa, Faecalibacterium, has a node degree of 6. The direct connections between Faecalibacterium and six other taxa, including Dorea, Subdoligranulum, and Lactobacillus, may also have biological implications. Further studies are required to evaluate the biological significance of these taxa.

We also applied the glasso and space methods to this real data set. The degrees of the top 10 genera are shown in Table 4. The network estimated by vcm has 106 edges in total, including 24 significant edges and 84 nonsignificant edges. Although most of the edges are not significant, all selected edges are kept in the network. Those “non-significant” correlations could be caused by weak signal, but they can still provide some biological information. Compared with the other two methods, our method provides more information on how the network structure changes conditionally on BMI. Although the glasso and space methods select similar high-degree taxa, most variables have degree 0 in the network estimated by the glasso method, and many taxa have degree 0 or 1 in the network estimated by the space method (the estimated networks are shown in Supplementary Figures S3 and S4 in the Supplementary Material).

Table 4.

Comparison of Three Methods on the Real data

space		glasso		vcm
Incertae sedis	9	Eubacterium	10	Clostridium	9
Eubacterium	8	Clostridium	8	Sutterella	8
Blautia	8	Incertae sedis	7	Subdoligranulum	8
Subdoligranulum	7	Blautia	7	Incertae sedis	8
Coprococcus	7	Aphanothece	7	Eubacterium	8
Clostridium	7	Acetivibrio	7	Aphanothece	8
Ruminococcus	6	Coprococcus	5	Oscillospira	7
Aphanothece	6	Anaerotruncus	5	Odoribacter	7
Lachnospira	5	Ruminococcus	3	Collinsella	7
Dorea	5	Oscillospira	3	Blautia	7
Total edges	64	Total edges	43	Total edges	106

Degree of each genus and total number of edges in the networks estimated by the graphical lasso (glasso), the space method (space), and the varying coefficient model (vcm) method.

4. Conclusions and Discussions

We proposed a nonparametric regularized regression method for network construction and genera selection. We also developed a wild bootstrap method for constructing simultaneous confidence bands for inference on the coefficient functions. We compared the performance of the proposed method with that of two other popular methods, glasso and space, under different graphical models, including AR(1), AR(2), a random matrix model, a scale-free network, and a vcm, using data simulated from the designed models. Compared with the other two methods, the proposed method performs better when the network structure changes under different clinical conditions.

Acknowledgments

The research of Liu is partially supported by NSF grant DMS-222381. The research of Ma is supported, in part, by the U.S. NSF grant DMS-13-06972 and Hellman Fellowship.

Author Disclosure Statement

No competing financial interests exist.

References

Alekseyenko, A.V., Perez-Perez, G.I., De Souza, et al. 2013. Community differentiation of the cutaneous microbiota in psoriasis. Microbiome, 1, 31.

Allen, Liu, et al. 2013. A local Poisson graphical model for inferring networks from sequencing data. IEEE Trans. Nanobioscience, 12, 189–198.

Brahe, L.K., Le Chatelier, Prifti, et al. 2015a. Dietary modulation of the gut microbiota–a randomised controlled trial in obese postmenopausal women. Br. J. Nutr., 114, 406–417.

Brahe, L.K., Le Chatelier, Prifti, et al. 2015b. Specific gut microbiota features and metabolic markers in postmenopausal women with obesity. Nutr. Diabetes, 5, e159.

Breheny and Huang. 2009. Penalized methods for bi-level variable selection. Stat. Interface, 2, 369–380.

Breheny and Huang. 2015. Group descent algorithms for nonconvex penalized linear and logistic regression models with grouped predictors. Stat. Comput., 25, 173–187.

de Boor. 2001. A Practical Guide to Splines. Applied Mathematical Sciences. Springer-Verlag, New York.

Donoho, D.L., and Johnstone, I.M. 1994. Ideal spatial adaptation by wavelet shrinkage. Biometrika, 81, 425–455.

Fan and Li. 2001. Variable selection via nonconcave penalized likelihood and its oracle properties. J. Am. Stat. Assoc., 96, 1348–1360.

Friedman, Hastie, and Tibshirani. 2008. Sparse inverse covariance estimation with the graphical lasso. Biostatistics, 9, 432–441.

Hardle and Mammen. 1993. Comparing nonparametric versus parametric regression fits. Ann. Stat., 21, 1926–1947.

Huang, Breheny, and Ma. 2012. A selective review of group selection in high-dimensional models. Stat. Sci., 27, 481–499.

Kasai, Sugimoto, Moritani, et al. 2015. Comparison of the gut microbiota composition between obese and non-obese individuals in a Japanese population, as analyzed by terminal restriction fragment length polymorphism and next-generation sequencing. BMC Gastroenterol., 15, 100.

Krämer, Schäfer, and Boulesteix, A.-L. 2009. Regularized estimation of large-scale gene association networks using graphical Gaussian models. BMC Bioinformatics, 10, 384.

Lauritzen, S.L. 1996. Graphical Models. Oxford Statistical Science Series. The Clarendon Press, Oxford University Press, Oxford Science Publications, New York.

Liu, Hsiao, Cantarel, B.L., et al. 2011. Sparse distance-based learning for simultaneous multiclass classification and feature selection of metagenomic data. Bioinformatics, 27, 3242–3249.

Liu, Sun, Braun, et al. 2015. Multilevel regularized regression for simultaneous taxa selection and network construction with metagenomic count data. Bioinformatics, 31, 1067–1074.

Meinshausen and Bühlmann. 2006. High-dimensional graphs and variable selection with the lasso. Ann. Stat., 34, 1436–1462.

Peng, Zhou, and Zhu. 2009. Partial correlation estimation by joint sparse regression models. J. Am. Stat. Assoc., 104, 735–746.

Schloss, P.D., and Handelsman. 2005. Introducing DOTUR, a computer program for defining operational taxonomic units and estimating species richness. Appl. Environ. Microbiol., 71, 1501–1506.

Singleton, D.R., Furlong, M.A., Rathbun, S.L., et al. 2001. Quantitative comparisons of 16S rRNA gene sequence libraries from environmental samples. Appl. Environ. Microbiol., 67, 4374–4376.

Tanaseichuk, Borneman, and Jiang. 2013. Phylogeny-based classification of microbial communities. Bioinformatics, 30, 449–456.

Underwood, M.A. 2014. Intestinal dysbiosis: Novel mechanisms by which gut microbes trigger and prevent disease. Prev. Med., 65, 133–137.

White, J.R., Nagarajan, and Pop. 2009. Statistical methods for detecting differentially abundant features in clinical metagenomic samples. PLoS Comput. Biol., 5, e1000352.

Yang. 2008. Confidence band for additive regression model. J. Data Sci., 6, 207–217.

Yuan and Lin. 2007. Model selection and estimation in the Gaussian graphical model. Biometrika, 94, 19–35.

Zhang, C.-H. 2010. Nearly unbiased variable selection under minimax concave penalty. Ann. Stat., 38, 894–942.

Zhang and Mallick, B.K. 2013. Inferring gene networks from discrete expression data. Biostatistics, 14, 708–722.

Zupancic, M.L., Cantarel, B.L., Liu, et al. 2012. Analysis of the gut microbiota in the Old Order Amish and its relation to the metabolic syndrome. PLoS One, 7, e43052.
