Estimates a directed network using a lasso (L1) penalty.
netEst.dir(x, zero = NULL, one = NULL, lambda, verbose = FALSE, eps = 1e-08)
A list with components
The weighted adjacency matrix of dimension \(p \times p\). This is the matrix that will be used in NetGSA.
The influence matrix of dimension \(p \times p\).
The values of tuning parameters used.
The \(p \times n\) data matrix.
(Optional) indices of entries of the weighted adjacency matrix to be constrained to be zero. The input should be a \(p \times p\) matrix, with 1 at entries to be constrained to be zero and 0 elsewhere.
(Optional) indices of entries of the weighted adjacency matrix to be kept regardless of the regularization parameter for lasso. The input has the same format as zero: a \(p \times p\) matrix with 1 at entries to be kept and 0 elsewhere.
(Non-negative) numeric scalar or a vector of length \(p-1\) representing the regularization parameters for nodewise lasso. If lambda is a scalar, the same penalty will be used for all \(p-1\) lasso regressions. By default (lambda=NULL), the vector of lambda is defined as
$$\lambda_j(\alpha) = 2 n^{-1/2} Z^*_{\frac{\alpha}{2p(j-1)}}, \quad j=2,\ldots,p.$$
Here \(Z^*_q\) represents the \((1-q)\)-th quantile of the standard normal distribution and \(\alpha\) is a positive constant between 0 and 1. See Shojaie and Michailidis (2010a) for details on the choice of tuning parameters.
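As an illustration only, the default penalties can be evaluated in base R. The helper name default_lambda and the value alpha = 0.1 are assumptions made for this sketch and are not taken from the package; qnorm(1 - q) returns the \((1-q)\)-th standard normal quantile \(Z^*_q\).

```r
# Hypothetical helper (not part of the package): evaluates
# lambda_j(alpha) = 2 * n^(-1/2) * Z*_{alpha / (2 p (j - 1))} for j = 2, ..., p.
default_lambda <- function(n, p, alpha = 0.1) {
  j <- 2:p
  q <- alpha / (2 * p * (j - 1))   # tail probability for node j
  2 * n^(-1/2) * qnorm(1 - q)      # (1 - q)-th standard normal quantile
}

lam <- default_lambda(n = 50, p = 5)
length(lam)   # one penalty per nodewise regression, i.e. p - 1 values
```

Because the tail probability shrinks as \(j\) grows, the penalties in this sketch increase along the causal order, so later nodes (which have more candidate parents) are penalized more heavily.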
Whether to print out information as estimation proceeds. Default = FALSE.
(Non-negative) numeric scalar indicating the tolerance level for differentiating zero and non-zero edges: entries with magnitude \(<\) eps will be set to 0.
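The effect of eps can be pictured with a small base R sketch. The matrix A and the helper threshold_edges are hypothetical and serve only to show the rule stated above; they are not the package's internal code.

```r
# Hypothetical helper: zero out entries whose magnitude is below eps.
threshold_edges <- function(A, eps = 1e-08) {
  A[abs(A) < eps] <- 0
  A
}

# A made-up lower triangular weight matrix with one numerically negligible entry.
A <- matrix(c(0,     0,    0,
              0.7,   0,    0,
              3e-10, -0.4, 0), nrow = 3, byrow = TRUE)

A_clean <- threshold_edges(A)
```

Here the entry 3e-10 is treated as a non-edge, while the weights 0.7 and -0.4 are left unchanged.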
Ali Shojaie
The function netEst.dir performs constrained estimation of a directed network using a lasso (L1) penalty, as described in Shojaie and Michailidis (2010a). Two sets of constraints determine subsets of entries of the weighted adjacency matrix that should be exactly zero (the optional zero argument), or should take non-zero values (the optional one argument). The remaining entries will be estimated from data.
The arguments one and/or zero can come from external knowledge on the 0-1 structure of the underlying network, such as a list of edges and/or non-edges learned from available databases.
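A minimal sketch of turning such edge lists into the 0/1 matrices is given here. The node names, the two small data frames, and the helper to_indicator are all hypothetical. The sketch also assumes that rows index the child and columns index the parent, so that edges consistent with the causal order fall in the lower triangle; check this orientation against your own data before relying on it.

```r
nodes <- c("g1", "g2", "g3", "g4")   # assumed to be listed in causal order

# Made-up external knowledge: two known edges and one known non-edge.
known_edges <- data.frame(parent = c("g1", "g2"), child = c("g2", "g4"),
                          stringsAsFactors = FALSE)
known_non_edges <- data.frame(parent = "g1", child = "g3",
                              stringsAsFactors = FALSE)

# Hypothetical helper: p x p matrix with 1 at [child, parent] and 0 elsewhere.
to_indicator <- function(pairs, nodes) {
  m <- matrix(0, length(nodes), length(nodes), dimnames = list(nodes, nodes))
  m[cbind(match(pairs$child, nodes), match(pairs$parent, nodes))] <- 1
  m
}

one  <- to_indicator(known_edges, nodes)
zero <- to_indicator(known_non_edges, nodes)
```

Under these assumptions, both matrices have all of their non-zero entries below the diagonal, and they could then be passed as the one and zero arguments.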
In this function, it is assumed that the columns of \(x\) are ordered according to a correct (Wald) causal order, such that no \(x_j\) is a parent of \(x_k\) (\(k \le j\)). Given the causal ordering of nodes, the resulting adjacency matrix is lower triangular (see Shojaie & Michailidis, 2010b). Thus, only the lower triangular parts of zero and one are used in this function. For this reason, it is important that both of these matrices are also ordered according to the causal order of the nodes in \(x\). To estimate the network, each node is first regressed on the nodes linked to it by known edges (one). The residual obtained from this regression is then used to find the additional edges among the nodes that could potentially interact with the given node (those not in zero).
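The two-step idea can be sketched for a single node in base R. Everything here is an assumption made for illustration: the simulated data, the choice of known and excluded parents, the penalty value 0.15, and the use of a plain coordinate-descent lasso in place of glmnet to keep the example dependency-free. It is not the package's implementation.

```r
set.seed(1)
n <- 500

# Simulated data for one child node (x5) and its four predecessors in the
# causal order. In this simulation the true parents of x5 are x1 and x3.
x1 <- rnorm(n); x2 <- rnorm(n); x3 <- rnorm(n); x4 <- rnorm(n)
x5 <- 0.8 * x1 + 0.6 * x3 + rnorm(n, sd = 0.5)
preds <- cbind(x1, x2, x3, x4)

known      <- 1                                  # flagged in `one`
excluded   <- 2                                  # flagged in `zero`
candidates <- setdiff(1:4, c(known, excluded))   # left to the data

# Step 1: unpenalized least squares of the child on its known parents.
step1 <- lm.fit(preds[, known, drop = FALSE], x5)
res   <- step1$residuals

# Step 2: lasso of the residual on the candidate parents, solved here by
# coordinate descent for (1 / (2n)) * RSS + lambda * sum(abs(b)).
soft <- function(z, g) sign(z) * pmax(abs(z) - g, 0)
lasso_cd <- function(X, y, lambda, iters = 200) {
  b <- numeric(ncol(X))
  for (it in seq_len(iters)) {
    for (k in seq_along(b)) {
      partial <- y - drop(X[, -k, drop = FALSE] %*% b[-k])
      b[k] <- soft(mean(X[, k] * partial), lambda) / mean(X[, k]^2)
    }
  }
  b
}
beta <- lasso_cd(preds[, candidates, drop = FALSE], res, lambda = 0.15)

# Assemble the row of the weighted adjacency matrix for x5.
row5 <- numeric(4)
row5[known]      <- step1$coefficients
row5[candidates] <- beta
```

In this sketch the known edge keeps an unpenalized weight, the excluded node stays at zero by construction, and the lasso step decides which of the remaining candidates enter the row.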
This function is closely related to NetGSA, which requires the weighted adjacency matrix as input. When the user does not have complete information on the weighted adjacency matrix, but has data (not necessarily the same as the x in NetGSA) and external information (one and/or zero) on the adjacency matrix, then netEst.dir can be used to estimate the remaining interactions in the adjacency matrix using the data.
Further, when it is anticipated that the adjacency matrices under different conditions are different, and data from different conditions are available, the user needs to run netEst.dir separately to obtain estimates of the adjacency matrices under each condition.
The algorithm used in netEst.dir is based on glmnet. Please refer to glmnet for computational details.
Shojaie, A., & Michailidis, G. (2010a). Penalized likelihood methods for estimation of sparse high-dimensional directed acyclic graphs. Biometrika, 97(3), 519-538. https://academic.oup.com/biomet/article-abstract/97/3/519/243918
Shojaie, A., & Michailidis, G. (2010b). Network enrichment analysis in complex experiments. Statistical Applications in Genetics and Molecular Biology, 9(1), Article 22. https://pubmed.ncbi.nlm.nih.gov/20597848/
Shojaie, A., & Michailidis, G. (2009). Analysis of gene sets based on the underlying regulatory network. Journal of Computational Biology, 16(3), 407-426. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3131840/
prepareAdjMat, glmnet