Title: | High-Dimensional Ising Model Selection |
---|---|
Description: | Fits an Ising model to a binary dataset using L1 regularized logistic regression and extended BIC. Also includes a fast lasso logistic regression function for high-dimensional problems. Uses the 'libLBFGS' optimization library by Naoaki Okazaki. |
Authors: | Pratik Ramprasad [aut, cre], Jorge Nocedal [ctb, cph], Naoaki Okazaki [ctb, cph] |
Maintainer: | Pratik Ramprasad <[email protected]> |
License: | GPL (>= 2) |
Version: | 0.1.0 |
Built: | 2024-11-05 03:26:27 UTC |
Source: | https://github.com/pratik-r/rising |
Ising Model selection using L1-regularized logistic regression and extended BIC.
ising(X, gamma = 0.5, min_sd = 0, nlambda = 50, lambda.min.ratio = 0.001, symmetrize = "mean")
Argument | Description |
---|---|
X | The design matrix. |
gamma | (non-negative double) Parameter for the extended BIC (default 0.5). Higher gamma encourages sparsity. See the references for more details. |
min_sd | (non-negative double) Columns of X with standard deviation below min_sd are excluded from the design matrix (default 0). |
nlambda | (positive integer) The number of regularization parameters (lambda values) in the regularization path (default 50). A longer regularization path will likely yield more accurate results, but will take more time to run. |
lambda.min.ratio | (non-negative double) The ratio of the smallest to the largest lambda in the regularization path (default 0.001). |
symmetrize | The method used to symmetrize the output adjacency matrix. Must be one of "min", "max", "mean" (default), or FALSE. "min" and "max" correspond to the Wainwright min/max, respectively (see reference 1). "mean" corresponds to the coefficient-wise mean of the output adjacency matrix and its transpose. If FALSE, the output matrix is not symmetrized. A sketch of these rules follows this table. |
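The three symmetrization rules can be pictured with a short helper. The function below is not part of the package; it is a minimal illustration, assuming Theta_hat is the (possibly asymmetric) matrix of nodewise estimates and reading the Wainwright "min"/"max" rules as keeping the smaller- or larger-magnitude member of each pair of estimates.

```r
# Minimal illustration (not package code) of the three symmetrization rules,
# assuming Theta_hat is an asymmetric p x p matrix of nodewise estimates.
symmetrize_theta <- function(Theta_hat, method = c("mean", "min", "max")) {
  method <- match.arg(method)
  Tt <- t(Theta_hat)
  switch(method,
    mean = (Theta_hat + Tt) / 2,                              # coefficient-wise mean
    min  = ifelse(abs(Theta_hat) <= abs(Tt), Theta_hat, Tt),  # keep smaller magnitude
    max  = ifelse(abs(Theta_hat) >= abs(Tt), Theta_hat, Tt)   # keep larger magnitude
  )
}
```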
A list containing the estimated adjacency matrix (Theta) and the optimal regularization parameter for each node (lambda), as selected by extended BIC.
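For reference, one common form of the extended BIC (see reference 2) is sketched below. This is an illustration of the criterion rather than the package's exact implementation; df denotes the number of nonzero coefficients at a given lambda.

```r
# Extended BIC for one nodewise logistic regression (illustrative form):
#   EBIC_gamma = -2 * loglik + df * log(n) + 2 * gamma * df * log(p)
# Larger gamma adds a stronger penalty on model size, encouraging sparsity.
ebic <- function(loglik, df, n, p, gamma = 0.5) {
  -2 * loglik + df * log(n) + 2 * gamma * df * log(p)
}
```

The lambda minimizing this quantity along the regularization path would be the one reported for each node.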
1. Ravikumar, P., Wainwright, M. J., and Lafferty, J. D. (2010). High-dimensional Ising model selection using L1-regularized logistic regression. https://arxiv.org/pdf/1010.0311v1
2. Barber, R. F. and Drton, M. (2015). High-dimensional Ising model selection with Bayesian information criteria. https://arxiv.org/pdf/1403.3374v2
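Conceptually, the estimator of reference 1 fits one L1-penalized logistic regression per node and collects the coefficients into the rows of the adjacency matrix. The sketch below outlines that nodewise scheme, substituting glmnet with cross-validation for the package's EBIC-based selection; it is a conceptual outline, not the package internals.

```r
library(glmnet)

# Nodewise neighborhood selection sketch: regress each binary column of X on
# the remaining columns with L1-penalized logistic regression and store the
# selected coefficients as row s of the (asymmetric) estimate.
nodewise_sketch <- function(X) {
  p <- ncol(X)
  Theta_hat <- matrix(0, p, p)
  for (s in seq_len(p)) {
    cv <- cv.glmnet(X[, -s, drop = FALSE], X[, s], family = "binomial")
    coefs <- as.matrix(coef(cv, s = "lambda.min"))  # intercept in row 1
    Theta_hat[s, -s] <- coefs[-1, 1]
  }
  Theta_hat  # would still need to be symmetrized, as described above
}
```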
```r
## Not run:
# simulate a dataset using IsingSampler
library(IsingSampler)
n <- 1e3
p <- 10
Theta <- matrix(sample(c(-0.5, 0, 0.5), replace = TRUE, size = p * p), nrow = p, ncol = p)
Theta <- Theta + t(Theta)  # adjacency matrix must be symmetric
diag(Theta) <- 0
X <- unname(as.matrix(IsingSampler(n, graph = Theta, thresholds = 0, method = "direct")))
m1 <- ising(X, symmetrize = "mean", gamma = 0.5, nlambda = 50)

# Visualize output using igraph
library(igraph)
ig <- graph_from_adjacency_matrix(m1$Theta, "undirected", weighted = TRUE, diag = FALSE)
plot.igraph(ig, vertex.color = "skyblue")
## End(Not run)
```
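Continuing the example above, a quick way to check how well the estimated graph recovers the simulated one (assuming Theta and m1 from the example are still in scope) is to compare edge supports:

```r
# Compare estimated vs. true edge sets on the upper triangle
est_edges  <- (m1$Theta != 0) & upper.tri(m1$Theta)
true_edges <- (Theta != 0)    & upper.tri(Theta)
c(true_positives  = sum(est_edges & true_edges),
  false_positives = sum(est_edges & !true_edges),
  false_negatives = sum(!est_edges & true_edges))
```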
L1-regularized logistic regression using OWL-QN L-BFGS-B optimization.
logreg(X, y, nlambda = 50, lambda.min.ratio = 0.001, lambda = NULL, scale = TRUE, type = 2)
Argument | Description |
---|---|
X | The design matrix. |
y | Vector of binary observations with length equal to the number of rows of X. |
nlambda | (positive integer) The number of regularization parameters (lambda values) in the regularization path (default 50). |
lambda.min.ratio | (non-negative double) The ratio of the smallest to the largest lambda in the regularization path (default 0.001); see the path sketch after this table. |
lambda | A user-supplied vector of regularization parameters. Under the default option (NULL), the function computes its own regularization path from nlambda and lambda.min.ratio. |
scale | (boolean) Whether to scale the columns of X before fitting (default TRUE). |
type | (integer 1 or 2) Type 1 aggregates the input data based on repeated rows in X before fitting; type 2 (the default) uses the data as supplied. Aggregation can be faster when X contains many repeated rows. |
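The interplay of nlambda and lambda.min.ratio can be pictured with a glmnet-style log-spaced path. The helper below is an assumption about how such a path is typically built (from the smallest penalty that zeroes all coefficients down to lambda_max * lambda.min.ratio), not necessarily the package's exact construction.

```r
# Hypothetical glmnet-style path: nlambda values, log-spaced from lambda_max
# down to lambda_max * lambda.min.ratio.
make_lambda_path <- function(X, y, nlambda = 50, lambda.min.ratio = 1e-3) {
  lambda_max <- max(abs(crossprod(X, y - mean(y)))) / nrow(X)
  exp(seq(log(lambda_max), log(lambda_max * lambda.min.ratio), length.out = nlambda))
}
```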
A list containing the matrix of fitted weights (wmat), the vector of regularization parameters sorted in decreasing order (lambda), and the vector of log-likelihoods corresponding to lambda (logliks).
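As a usage sketch, the fitted path could be post-processed to pick a single model, e.g. by an ordinary BIC. This assumes wmat stores one column of weights per lambda and that logliks holds the corresponding in-sample log-likelihoods; that layout is an assumption for illustration, not documented behavior.

```r
# Hypothetical helper: select the column of wmat with the smallest BIC,
# assuming one column per lambda and one row per coefficient.
select_by_bic <- function(fit, n) {
  df  <- colSums(fit$wmat != 0)         # nonzero coefficients per lambda
  bic <- -2 * fit$logliks + df * log(n)
  fit$wmat[, which.min(bic)]
}
```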
```r
# simulate binary (logistic regression) data
n <- 1e3
p <- 100
X <- matrix(rnorm(n * p), n, p)
wt <- sample(seq(0, 9), p + 1, replace = TRUE) / 10
z <- cbind(1, X) %*% wt + rnorm(n)
probs <- 1 / (1 + exp(-z))
y <- sapply(probs, function(p) rbinom(1, 1, p))
m1 <- logreg(X, y)
m2 <- logreg(X, y, nlambda = 100, lambda.min.ratio = 1e-4, type = 1)

## Not run:
# Performance comparison
library(glmnet)
library(microbenchmark)
nlambda <- 50
lambda.min.ratio <- 1e-3
microbenchmark(
  logreg_type1 = logreg(X, y, nlambda = nlambda, lambda.min.ratio = lambda.min.ratio, type = 1),
  logreg_type2 = logreg(X, y, nlambda = nlambda, lambda.min.ratio = lambda.min.ratio, type = 2),
  glmnet = glmnet(X, y, family = "binomial", nlambda = nlambda, lambda.min.ratio = lambda.min.ratio),
  times = 20L
)
## End(Not run)
```
Fits an Ising model to a binary dataset using L1-regularized logistic regression and extended BIC. Also includes a fast lasso logistic regression function for high-dimensional problems. Uses the 'libLBFGS' optimization library by Naoaki Okazaki.
logreg: L1-regularized logistic regression using OWL-QN L-BFGS-B optimization.
ising: Ising model selection using L1-regularized logistic regression and extended BIC.