Fits regularization paths for sparse group-lasso penalized learning problems at a
sequence of regularization parameters `lambda`.

Note that the objective function for least squares is
$$\mathrm{RSS}/(2n) + \lambda \cdot \mathrm{penalty}.$$
Users can also adjust the penalty by supplying different penalty factors (`pf_group` and `pf_sparse`).
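
Written out, and assuming the standard sparse group lasso form described in the reference below, the least squares objective would read as follows. The symbols \(w_g\) (the entries of `pf_group`), \(v_j\) (the entries of `pf_sparse`), and \(\alpha\) (`asparse`) are notation chosen here for illustration and do not appear elsewhere on this page.

$$\min_{\beta_0, \beta} \; \frac{1}{2n} \lVert y - \beta_0 \mathbf{1} - X\beta \rVert_2^2 + \lambda \left[ (1 - \alpha) \sum_{g} w_g \lVert \beta_g \rVert_2 + \alpha \sum_{j=1}^{p} v_j \lvert \beta_j \rvert \right]$$

With the defaults shown in the usage, \(w_g\) is the square root of the size of group \(g\), \(v_j = 1\) for every predictor, and \(\alpha = 0.05\).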

## Usage

```
sparsegl(
  x,
  y,
  group = NULL,
  family = c("gaussian", "binomial"),
  nlambda = 100,
  lambda.factor = ifelse(nobs < nvars, 0.01, 1e-04),
  lambda = NULL,
  pf_group = sqrt(bs),
  pf_sparse = rep(1, nvars),
  intercept = TRUE,
  asparse = 0.05,
  standardize = TRUE,
  lower_bnd = -Inf,
  upper_bnd = Inf,
  weights = NULL,
  offset = NULL,
  warm = NULL,
  trace_it = 0,
  dfmax = as.integer(max(group)) + 1L,
  pmax = min(dfmax * 1.2, as.integer(max(group))),
  eps = 1e-08,
  maxit = 3e+06
)
```

## Arguments

- x
Double. A matrix of predictors, of dimension \(n \times p\); each row is a vector of measurements and each column is a feature. Objects of class `Matrix::sparseMatrix` are supported.

- y
Double/Integer/Factor. The response variable. Quantitative for `family = "gaussian"` and for other exponential families. If `family = "binomial"`, `y` should be either a factor with two levels or a vector of integers taking 2 unique values. For a factor, the last level in alphabetical order is the target class.

- group
Integer. A vector of consecutive integers describing the grouping of the coefficients (see example below).

- family
Character or function. Specifies the generalized linear model to use. Valid options are:

  - `"gaussian"`: least squares loss (regression, the default),
  - `"binomial"`: logistic loss (classification).

For any other type, a valid `stats::family()` object may be passed. Note that these will generally be much slower to estimate than the built-in options passed as strings. So for example, `family = "gaussian"` and `family = gaussian()` will produce the same results, but the first will be much faster.

- nlambda
The number of `lambda` values. Default is 100.

- lambda.factor
A multiplicative factor for the minimal lambda in the `lambda` sequence, where `min(lambda) = lambda.factor * max(lambda)`. `max(lambda)` is the smallest value of `lambda` for which all coefficients are zero. The default depends on the relationship between \(n\) (the number of rows in the matrix of predictors) and \(p\) (the number of predictors). If \(n \geq p\), the default is `0.0001`. If \(n < p\), the default is `0.01`. A very small value of `lambda.factor` will lead to a saturated fit. This argument has no effect if there is a user-defined `lambda` sequence.

- lambda
A user-supplied `lambda` sequence. The default, `NULL`, results in an automatic computation based on `nlambda`, the smallest value of `lambda` that would give the null model (all coefficient estimates equal to zero), and `lambda.factor`. Supplying a value of `lambda` overrides this behaviour. It is likely better to supply a decreasing sequence of `lambda` values than a single (small) value. If supplied, the user-defined `lambda` sequence is automatically sorted in decreasing order.

- pf_group
Penalty factor on the groups, a vector of the same length as the total number of groups. Separate penalty weights can be applied to each group of \(\beta\)s to allow differential shrinkage. Can be 0 for some groups, which implies no shrinkage, and results in that group always being included in the model (depending on `pf_sparse`). The default value for each entry is the square root of the size of the corresponding group. Because this default is typical, these penalties are not rescaled.

- pf_sparse
Penalty factor on the \(\ell_1\)-norm, a vector of the same length as the total number of columns in `x`. Each value corresponds to one predictor. Can be 0 for some predictors, which implies that the predictor will receive only the group penalty. Note that these are internally rescaled so that the sum is the same as the number of predictors.

- intercept
Whether to include an intercept in the model. Default is TRUE.

- asparse
The relative weight to put on the \(\ell_1\)-norm in sparse group lasso. Default is `0.05` (resulting in `0.95` on the \(\ell_2\)-norm).

- standardize
Logical flag for variable standardization (scaling) prior to fitting the model. Default is TRUE.

- lower_bnd
Lower bound for coefficient values, a vector of length 1 or of length equal to the number of groups. Must be non-positive numbers only. Default value for each entry is `-Inf`.

- upper_bnd
Upper bound for coefficient values, a vector of length 1 or of length equal to the number of groups. Must be non-negative numbers only. Default value for each entry is `Inf`.

- weights
Double vector. Optional observation weights. These can only be used with a `stats::family()` object.

- offset
Double vector. Optional offset (constant predictor without a corresponding coefficient). These can only be used with a `stats::family()` object.

- warm
List created with `make_irls_warmup()`. This can only be used with a `stats::family()` object, and is not typically necessary even then.

- trace_it
Scalar integer. Larger values print more output during the IRLS loop. Typical values are `0` (no printing), `1` (some printing and a progress bar), and `2` (more detailed printing). This can only be used with a `stats::family()` object.

- dfmax
Limit the maximum number of groups in the model. Default is no limit.

- pmax
Limit the maximum number of groups ever to be nonzero. For example, once a group enters the model, no matter how many times it exits or re-enters the model along the path, it will be counted only once.

- eps
Convergence termination tolerance. Default value is `1e-8`.

- maxit
Maximum number of outer-loop iterations allowed at a fixed lambda value. Default is `3e6`. If models do not converge, consider increasing `maxit`.
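
The default penalty factors and the rescaling of `pf_sparse` described above can be reproduced in base R. This is a minimal sketch, not the package's internal code: `bs` is assumed to be the vector of group sizes (as the documented default for `pf_group` suggests), and `groups`, `pf_sparse_raw`, and the example dimensions are illustrative names and values.

```r
# Illustrative setup: 20 predictors in 4 consecutive groups of 5
nobs <- 100
nvars <- 20
groups <- rep(1:4, each = 5)

# Group sizes (assumed to be what `bs` denotes in the usage)
bs <- as.integer(table(groups))

# Default group penalty factor: square root of each group's size
pf_group <- sqrt(bs)

# A user-supplied l1 penalty factor; the 0 leaves the first predictor
# with only the group penalty
pf_sparse_raw <- c(0, rep(2, nvars - 1))

# Rescale so the factors sum to the number of predictors, as the
# documentation says happens internally
pf_sparse <- pf_sparse_raw / sum(pf_sparse_raw) * nvars

# Default lambda.factor, copied from the usage
lambda.factor <- ifelse(nobs < nvars, 0.01, 1e-04)
```

Because only the ratios between entries of `pf_sparse` survive the rescaling, multiplying every entry by the same constant leaves the fit unchanged.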

## Value

An object with S3 class `"sparsegl"`. Among the list components:

- `call`
The call that produced this object.

- `b0`
Intercept sequence of length `length(lambda)`.

- `beta`
A `p` x `length(lambda)` sparse matrix of coefficients.

- `df`
The number of features with nonzero coefficients for each value of `lambda`.

- `dim`
Dimension of coefficient matrix.

- `lambda`
The actual sequence of `lambda` values used.

- `npasses`
Total number of iterations summed over all `lambda` values.

- `jerr`
Error flag, for warnings and errors, 0 if no error.

- `group`
A vector of consecutive integers describing the grouping of the coefficients.

- `nobs`
The number of observations used to estimate the model.

If `sparsegl()` was called with a `stats::family()` method, this may also
contain information about the deviance and the family used in fitting.
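
As a rough illustration of the components listed above, the sketch below fits a small simulated model and inspects a few of them. The simulated data mirror the Examples section, the seed is arbitrary, and only component names documented on this page are accessed.

```r
library(sparsegl)

# Simulated data: 20 predictors in 4 consecutive groups of 5
set.seed(1)
n <- 100
p <- 20
X <- matrix(rnorm(n * p), nrow = n)
beta_star <- c(rep(5, 5), c(5, -5, 2, 0, 0), rep(-5, 5), rep(0, p - 15))
y <- X %*% beta_star + rnorm(n)
groups <- rep(1:(p / 5), each = 5)

fit <- sparsegl(X, y, group = groups)

# Number of lambda values actually used along the path
n_lambda <- length(fit$lambda)

# Coefficient matrix: one row per predictor, one column per lambda
dim(fit$beta)

# Intercepts, one per lambda value
length(fit$b0)

# Number of nonzero coefficients at each lambda, and the error flag
fit$df
fit$jerr
```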

## References

Liang, X., Cohen, A., Sólon Heinsfeld, A., Pestilli, F., and
McDonald, D.J. 2024.
*sparsegl: An R Package for Estimating Sparse Group Lasso.*
Journal of Statistical Software, Vol. 110(6): 1–23.
doi:10.18637/jss.v110.i06.

## See also

`cv.sparsegl()` and the `plot()`, `predict()`, and `coef()` methods for `"sparsegl"` objects.

## Examples

```
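library(sparsegl)

# Simulate n observations of p predictors in consecutive groups of 5
n <- 100
p <- 20
X <- matrix(rnorm(n * p), nrow = n)
eps <- rnorm(n)
beta_star <- c(rep(5, 5), c(5, -5, 2, 0, 0), rep(-5, 5), rep(0, (p - 15)))
y <- X %*% beta_star + eps
groups <- rep(1:(p / 5), each = 5)

# Least squares fit with the built-in "gaussian" family (the default)
fit <- sparsegl(X, y, group = groups)

# Poisson fit, passing a stats::family() object
yp <- rpois(n, abs(X %*% beta_star))
fit_pois <- sparsegl(X, yp, group = groups, family = poisson())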
```