This document shows how to extend the holiglm package with new likelihood / objective functions or constraints. To extend the holiglm package users are required to be familiar with the ROI package (Theußl, Schwendinger, and Hornik 2020). Additional examples and information about ROI can be found on the ROI-homepage.
Sys.setenv(ROI_LOAD_PLUGINS = FALSE)
library("slam")
library("ROI")
#> ROI: R Optimization Infrastructure
#> Registered solver plugins: nlminb.
#> Default solver: auto.
library("holiglm")
#> Loading required package: ROI.plugin.ecosFor illustration purposes we will make use of the Default data set in package ISLR (James et al. 2021), to which we add two more columns rand_1 and rand_2 containing variables unrelated to the response.
data("Default", package = "ISLR")
Default[["rand_1"]] <- rnorm(nrow(Default))
Default[["rand_2"]] <- rnorm(nrow(Default))Currently holiglm supports a set of specific constraints from the holistic regression literature and general linear constraints. However, it is possible to specify custom constraints, as long as they are supported by a linear, quadratic or conic optimization solver available from ROI.
When implementing new constraints it is important that the scaling of the model matrix is considered. There are two options to account for the scaling,
In the following we show, how to implement a linear constraint and custom conic constraint, with and without scaling.
When adding custom constraints to the model, the scaling of the model matrix has to be considered. The main purpose of the scaling is to simplify the choice of the big-M constraints. Therefore, if no sparsity constraint is imposed on the model the easiest way to add additional constraints is to turn off the scaling and add the constraints to the model object. In the following example we show how this can be accomplished for linear constraints.
The holiglm package allows to impose linear constraints on the coefficients and the indicator variables of the coefficients via the linear() function. However, to start with a simple example we will add linear constraints to the model without using the linear() function.
Let us assume we want to impose the constraint that the coefficients of variables balance and income are equal. This can be easily accomplished by,
To obtain the model object, function hglm() with parameter dry_run set to TRUE can be used.
model <- hglm(default ~ ., binomial(), scaler = "off", data = Default, dry_run = TRUE)
model[["constraints"]]
#> list()Since the scaling is turned off, the constraint can be added by adding the corresponding ROI constraint. Note, the variables "balance" and "income" are on the 3rd and 4th position in the model matrix.
match(c("balance", "income"), colnames(model[["x"]]))
#> [1] 3 4Therefore, we impose the constraints on the 3rd and 4th column of the constraint matrix.
L <- matrix(c(0, 0, 1, -1), 1)
model[["constraints"]][[1]] <- L_constraint(L, "==", 0)To solve the altered model, function hglm_fit() can be used.
hglm_fit(model)
#> Error in as.OP.hglm_model(x = model): dimension missmatch in as.OPLooking at the output above shows that balance and income have now the same coefficient as required by the constraint. However, it seems important to point out that for linear constraints it is typically easier to use the linear() function directly.
fit <- hglm(default ~ ., binomial(), data = Default,
            constraints = linear(c(balance = 1, income = -1), "==", 0))
#> Warning: In hglm_fit: Binding linear constraints detected. The standard errors are corrected as described in the vignettes.
fit
#> 
#> Call:  hglm(formula = default ~ ., family = binomial(), data = Default, 
#>     constraints = linear(c(balance = 1, income = -1), "==", 0))
#> 
#> Coefficients:
#> (Intercept)   studentYes      balance       income       rand_1       rand_2  
#>  -4.357e+00    8.669e-01    2.039e-05    2.039e-05   -4.784e-02    3.914e-02  
#> 
#> Degrees of Freedom: 9999 Total (i.e. Null);  9994 Residual
#> Null Deviance:       2921 
#> Residual Deviance: 2898  AIC: 2910Looking more closely at the constraints,
model_1 <- hglm(default ~ ., binomial(), data = Default, dry_run = TRUE,
                constraints = linear(c(balance = 1, income = -1), "==", 0))
c(ncol(L_constraint(L, "==", 0)[["L"]]), ncol(model_1[["constraints"]][[1]][["L"]]))
#> [1]     4 20011it is easy to recognize that the dimension of 4 of the linear constraint specified above is not equal to the dimension of the constraint obtained by using linear() directly in hglm(). This happens due to the auxiliary variables which enter the problem when specifing the likelihood function. However, this is no problem since the dimensions of the constraints are internally matched when as.OP is called.
This example shows how to add the non-linear constraint \(\sqrt{\beta_{\text{balance}}^2 + \beta_{\text{income}}^2} \leq 1\) to the model. To demonstrate the effect of the constraint we scale the credit default data, such that the coefficients of balance and income are both bigger than one.
credit_default <- Default
credit_default[, 3] <- credit_default[, 3] / 1000
credit_default[, 4] <- credit_default[, 4] / 1000000
credit_default[["random"]] <- rnorm(nrow(credit_default))
glm(default ~ ., family = binomial(), data = credit_default)
#> 
#> Call:  glm(formula = default ~ ., family = binomial(), data = credit_default)
#> 
#> Coefficients:
#> (Intercept)   studentYes      balance       income       rand_1       rand_2       random  
#>   -10.88340     -0.63742      5.73652      3.23051     -0.06304      0.03268      0.06083  
#> 
#> Degrees of Freedom: 9999 Total (i.e. Null);  9993 Residual
#> Null Deviance:       2921 
#> Residual Deviance: 1570  AIC: 1584The constraint \(\sqrt{\beta_{\text{balance}}^2 + \beta_{\text{income}}^2} \leq 1\) can be modeled by making use of the second order cone,
C <- rbind(c(0, 0, 0, 0), c(0, 0, -1, 0), c(0, 0, 0, -1))
socp <- C_constraint(C, K_soc(3), c(1, 0, 0))Similar to adding linear constraints, first the model has to be generated,
model_2 <- hglm(default ~ ., binomial(), scaler = "off",
                data = credit_default, dry_run = TRUE)afterwards the constraint can be added
model_2[["constraints"]][[1]] <- socpand the model can be fit via the hglm_fit() function.
fit <- hglm_fit(model_2, solver = "ecos")
#> Error in as.OP.hglm_model(x = model): dimension missmatch in as.OP
coef(fit)
#>   (Intercept)    studentYes       balance        income        rand_1        rand_2 
#> -4.357400e+00  8.668843e-01  2.039058e-05  2.039058e-05 -4.784277e-02  3.913994e-02Calculating \(\sqrt{\beta_{\text{balance}}^2 + \beta_{\text{income}}^2} \leq 1\)
sqrt(sum(coef(fit)[c("balance", "income")]^2))
#> [1] 2.883664e-05verifies that the constraint is fulfilled.
In this example we combine the sparsity constraint that at most \(3\) features are selected with the linear constraint \(\beta_\text{balance} = \beta_\text{income}\). Using hglm(), this model can be estimated by:
constraints <- c(k_max(3), linear(c(balance = 1, income = -1), "==", 0))
model_3 <- hglm(default ~ ., binomial(), constraints = constraints,
                data = Default, dry_run = TRUE)
model_3[["constraints"]]
#> $bigM
#> An object containing 10 linear constraints.
#> 
#> $global_sparsity_1
#> An object containing 1 linear constraint.
#> 
#> $linear_constraint_1
#> An object containing 1 linear constraint.Inspecting the linear constraint,
str(model_3[["constraints"]][["linear_constraint_1"]])
#> List of 4
#>  $ L    :List of 6
#>   ..$ i       : int [1:2] 1 1
#>   ..$ j       : int [1:2] 3 4
#>   ..$ v       : num [1:2] 3.77e-04 -1.37e-05
#>   ..$ nrow    : int 1
#>   ..$ ncol    : int 20011
#>   ..$ dimnames: NULL
#>   ..- attr(*, "class")= chr "simple_triplet_matrix"
#>  $ dir  : chr "=="
#>  $ rhs  : num 0
#>  $ names: NULL
#>  - attr(*, "n_L_constraints")= int 1
#>  - attr(*, "class")= chr [1:3] "L_constraint" "Q_constraint" "constraint"shows that hglm() internally scales the coefficient matrix of the linear constraint based on the scaling of the model matrix. To scale the linear constraint matrix outside of the hglm function, the scale_constraint_matrix function can be used.
model_4 <- hglm(default ~ ., binomial(), constraints = k_max(3),
                data = Default, dry_run = TRUE)
L <- matrix(c(0, 0, 1, -1, 0, 0), 1)
L <- scale_constraint_matrix(L, attr(model_4[["x"]], "xs"), attr(model_4[["y"]], "ys"))
model_4[["constraints"]][["linear_constraint_1"]] <- L_constraint(L, "==", 0)
hglm_fit(model_4)
#> Warning: In hglm_fit: Binding linear constraints detected. The standard errors are corrected as described in the vignettes.
#> 
#> Call:  NULL
#> 
#> Coefficients:
#> (Intercept)   studentYes      balance       income       rand_1       rand_2  
#>  -4.358e+00    8.672e-01    2.044e-05    2.044e-05    0.000e+00    0.000e+00  
#> 
#> Degrees of Freedom: 9999 Total (i.e. Null);  9996 Residual
#> Null Deviance:       2921 
#> Residual Deviance: 2899  AIC: 2907Again, it can be easily verified, that calling hglm() directly gives the same solution.
constraints <- c(k_max(3), linear(c(balance = 1, income = -1), "==", 0))
hglm(default ~ ., binomial(), constraints = constraints, data = Default)
#> Warning: In hglm_fit: Binding linear constraints detected. The standard errors are corrected as described in the vignettes.
#> 
#> Call:  hglm(formula = default ~ ., family = binomial(), data = Default, 
#>     constraints = constraints)
#> 
#> Coefficients:
#> (Intercept)   studentYes      balance       income       rand_1       rand_2  
#>  -4.358e+00    8.672e-01    2.044e-05    2.044e-05    0.000e+00    0.000e+00  
#> 
#> Degrees of Freedom: 9999 Total (i.e. Null);  9996 Residual
#> Null Deviance:       2921 
#> Residual Deviance: 2899  AIC: 2907This example shows how to add the non-linear constraint \(\sqrt{\beta_{\text{balance}}^2 + \beta_{\text{income}}^2} \leq 1\). Similar to the linear constraint we first generate the model,
constraints <- c(k_max(3), linear(c(balance = 1, income = -1), "==", 0))
model_5 <- hglm(default ~ ., binomial(), constraints = constraints,
                data = credit_default, dry_run = TRUE)formulate the constraint
nvars <- ncol(model_5[["x"]])
C <- simple_triplet_matrix(c(2, 3), c(3, 4), c(-1, -1), ncol = nvars)
C <- scale_constraint_matrix(C, attr(model_5[["x"]], "xs"), attr(model_5[["y"]], "ys"))
socp <- C_constraint(C, K_soc(3), c(1, 0, 0))and add it to the model.
model_5[["constraints"]][["socp_constraint_1"]] <- socp
fit <- hglm_fit(model_5)
#> Warning: In hglm_fit: Binding linear constraints detected. The standard errors are corrected as described in the vignettes.
coef(fit)
#> (Intercept)  studentYes     balance      income      rand_1      rand_2      random 
#>  -4.1311607   0.2670113   0.7071068   0.7071068   0.0000000   0.0000000   0.0000000
sum(coef(fit)[3:4]^2)
#> [1] 1In this section we show how to implement additional currently unsupported objectives. This can be structured into 5 steps,
OP() which can be solved by a mixed integer solver. Here the first \(p+1\) variables of the optimization problem should correspond to \(\beta_0, ..., \beta_p\).hglm() with parameter dry_run set to TRUE."loglikelihood" with the new objective.hglm_fit to fit the model.The least absolute deviation regression problem is defined as, \[ \underset{\boldsymbol\beta}{\text{minimize }} ~~ || \boldsymbol y - X\boldsymbol \beta ||_1 \]
which can be solved via linear programming.
see, e.g., (Charnes, Cooper, and Ferguson 1955).
lad_a <- function(x, y) {
    D <- diag(nrow(x))
    L <- rbind(cbind( x, -D),
               cbind(-x, -D))
    obj <- c(double(ncol(x)), rep.int(1, nrow(x)))
    OP(objective = L_objective(obj),
       constraints = L_constraint(L, leq(2 * nrow(x)), c(y, -y)),
       bounds = V_bound(ld = -Inf, nobj = length(obj)))
}lad_b <- function(x, y) {
    obj <- L_objective(c(double(ncol(x)), rep.int(1, 2 * nrow(x))))
    D <- diag(nrow(x))
    con <- L_constraint(L = cbind(x, D, -D), dir = eq(nrow(x)), rhs = y)
    bou <- V_bound(li = seq_len(ncol(x)), lb = rep.int(-Inf, ncol(x)), nobj = length(obj))
    OP(objective = obj, constraints = con, bounds = bou)
}Third we create the model object,
model_a <- model_b <- hglm(mpg ~ cyl + disp + hp + drat + wt, data = mtcars, dry_run = TRUE)model_a <- update_objective(model_a, lad_a(model_a$x, model_a$y))model_b <- update_objective(model_b, lad_b(model_b$x, model_b$y))Fitting the model gives
hglm_fit(model_a)
#> Warning: In hglm_fit: Binding linear constraints detected. The standard errors are corrected as described in the vignettes.
#> 
#> Call:  NULL
#> 
#> Coefficients:
#> (Intercept)          cyl         disp           hp         drat           wt  
#>    40.19297     -1.13698      0.01174     -0.02853     -0.08884     -3.63177  
#> 
#> Degrees of Freedom: 31 Total (i.e. Null);  26 Residual
#> Null Deviance:       1126 
#> Residual Deviance: 180.2     AIC: 160.1
hglm_fit(model_b)
#> Warning: In hglm_fit: Binding linear constraints detected. The standard errors are corrected as described in the vignettes.
#> 
#> Call:  NULL
#> 
#> Coefficients:
#> (Intercept)          cyl         disp           hp         drat           wt  
#>    40.19297     -1.13698      0.01174     -0.02853     -0.08884     -3.63177  
#> 
#> Degrees of Freedom: 31 Total (i.e. Null);  26 Residual
#> Null Deviance:       1126 
#> Residual Deviance: 180.2     AIC: 160.1the same result as package quantreg.
if (require("quantreg", quietly = TRUE)) {
    fm <- quantreg::rq(mpg ~ cyl + disp + hp + drat + wt, data = mtcars)
    coef(fm)
}
#> 
#> Attaching package: 'SparseM'
#> The following object is masked from 'package:base':
#> 
#>     backsolve
#> (Intercept)         cyl        disp          hp        drat          wt 
#> 40.19297107 -1.13697880  0.01174166 -0.02852600 -0.08883949 -3.63176521Note constraints can be added during the fit using the constraints argument,
constraints <- c(k_max(3))
(fit_a <- hglm_fit(model_a, constraints = constraints))
#> Warning: In hglm_fit: Binding linear constraints detected. The standard errors are corrected as described in the vignettes.
#> 
#> Call:  NULL
#> 
#> Coefficients:
#> (Intercept)          cyl         disp           hp         drat           wt  
#>    38.33674     -1.13592      0.00000     -0.01475      0.00000     -3.01449  
#> 
#> Degrees of Freedom: 31 Total (i.e. Null);  28 Residual
#> Null Deviance:       1126 
#> Residual Deviance: 190.7     AIC: 157.9
(fit_b <- hglm_fit(model_b, constraints = constraints))
#> Warning: In hglm_fit: Binding linear constraints detected. The standard errors are corrected as described in the vignettes.
#> 
#> Call:  NULL
#> 
#> Coefficients:
#> (Intercept)          cyl         disp           hp         drat           wt  
#>    38.33674     -1.13592      0.00000     -0.01475      0.00000     -3.01449  
#> 
#> Degrees of Freedom: 31 Total (i.e. Null);  28 Residual
#> Null Deviance:       1126 
#> Residual Deviance: 190.7     AIC: 157.9model <- model_a
constraints <- c(k_max(3), lower(c(cyl = 3)))
hglm_fit(model_a, constraints = constraints)
#> Warning: In hglm_fit: Binding linear constraints detected. The standard errors are corrected as described in the vignettes.
#> 
#> Call:  NULL
#> 
#> Coefficients:
#> (Intercept)          cyl         disp           hp         drat           wt  
#>     32.8727       3.0000       0.0000      -0.0763       0.0000      -6.4791  
#> 
#> Degrees of Freedom: 31 Total (i.e. Null);  28 Residual
#> Null Deviance:       1126 
#> Residual Deviance: 519   AIC: 190
hglm_fit(model_b, constraints = constraints)
#> Warning: In hglm_fit: Binding linear constraints detected. The standard errors are corrected as described in the vignettes.
#> 
#> Call:  NULL
#> 
#> Coefficients:
#> (Intercept)          cyl         disp           hp         drat           wt  
#>     32.8727       3.0000       0.0000      -0.0763       0.0000      -6.4791  
#> 
#> Degrees of Freedom: 31 Total (i.e. Null);  28 Residual
#> Null Deviance:       1126 
#> Residual Deviance: 519   AIC: 190