| Type: | Package |
| Title: | Multivariate Covariance Generalized Linear Models |
| Version: | 0.9.0 |
| Date: | 2026-01-07 |
| Author: | Wagner Hugo Bonat [aut, cre] |
| Maintainer: | Wagner Hugo Bonat <wbonat@ufpr.br> |
| Description: | Fitting multivariate covariance generalized linear models (McGLMs) to data. McGLM is a general framework for non-normal multivariate data analysis, designed to handle multivariate response variables, along with a wide range of temporal and spatial correlation structures defined in terms of a covariance link function combined with a matrix linear predictor involving known matrices. The models take non-normality into account in the conventional way by means of a variance function, and the mean structure is modelled by means of a link function and a linear predictor. The models are fitted using an efficient Newton scoring algorithm based on quasi-likelihood and Pearson estimating functions, using only second-moment assumptions. This provides a unified approach to a wide variety of different types of response variables and covariance structures, including multivariate extensions of repeated measures, time series, longitudinal, spatial and spatio-temporal structures. The package offers a user-friendly interface for fitting McGLMs similar to the glm() R function. See Bonat (2018) <doi:10.18637/jss.v084.i04>, for more information and examples. |
| Depends: | R (≥ 4.5.0) |
| Suggests: | testthat, knitr, rmarkdown, MASS, mvtnorm, tweedie, devtools |
| Imports: | stats, Matrix, assertthat, graphics, Rcpp (≥ 0.12.16) |
| License: | GPL-3 |
| LazyData: | TRUE |
| URL: | https://github.com/bonatwagner/mcglm |
| BugReports: | https://github.com/bonatwagner/mcglm/issues |
| Encoding: | UTF-8 |
| VignetteBuilder: | knitr |
| RoxygenNote: | 7.3.3 |
| LinkingTo: | Rcpp, RcppArmadillo |
| NeedsCompilation: | yes |
| Packaged: | 2026-01-07 11:59:24 UTC; wagner |
| Repository: | CRAN |
| Date/Publication: | 2026-01-08 19:20:07 UTC |
Generalized Error Sum of Squares
Description
Extract the generalized error sum of squares (ESS) for
objects of mcglm class.
Usage
ESS(object, verbose = TRUE)
Arguments
object |
an object or a list of objects representing a model
of |
verbose |
logical. Print or not the ESS value. |
Value
An invisible list with a single numeric component:
- ESS
The generalized error sum of squares, scaled by the residual degrees of freedom.
Author(s)
Wagner Hugo Bonat, wbonat@ufpr.br
Source
Bonat, W. H. (2018). Multiple Response Variables Regression Models in R: The mcglm Package. Journal of Statistical Software, 84(4):1–30.
Wang, M. (2014). Generalized Estimating Equations in Longitudinal Data Analysis: A Review and Recent Developments. Advances in Statistics, 1(1)1–13.
See Also
gof, plogLik, pAIC, pKLIC,
GOSHO and RJC.
Gosho Information Criterion
Description
Extract the Gosho Information Criterion (GOSHO)
for an object of mcglm class.
WARNING: This function is limited to models with ONE response variable.
This function is general useful only for longitudinal data analysis.
Usage
GOSHO(object, id, verbose = TRUE)
Arguments
object |
an object of |
id |
a vector which identifies the clusters or groups. The length and order of id should be the same as the number of observations. Data are assumed to be sorted so that observations on a cluster are contiguous rows for all entities in the formula. |
verbose |
logical. Print or not the GOSHO value. |
Value
An invisible list with a single numeric component:
- Gosho
The Gosho Information Criterion, computed for longitudinal data with a single response variable and scaled by the number of clusters.
Author(s)
Wagner Hugo Bonat, wbonat@ufpr.br
Source
Wang, M. (2014). Generalized Estimating Equations in Longitudinal Data Analysis: A Review and Recent Developments. Advances in Statistics, 1(1)1–13.
See Also
gof, plogLik, pAIC, pKLIC,
ESS and RJC.
Hunting Data from Pico Basile, Bioko Island, Equatorial Guinea
Description
This dataset contains a case study analyzed in Bonat et al. (2017) regarding animals hunted in the village of Basile Fang, Bioko Norte Province, Bioko Island, Equatorial Guinea. Monthly counts of blue duikers and other small animals shot or snared were collected from a random sample of 52 commercial hunters between August 2010 and September 2013. For each animal caught, the species, sex, capture method, and altitude were recorded. The dataset contains 1216 observations.
Usage
data(Hunting)
Format
A data.frame with 1216 observations and 11 variables:
ALTFactor with five levels indicating the altitude where the animal was caught.
SEXFactor with levels
FemaleandMale.METHODFactor with levels
EscopetaandTrampaindicating the method of capture.OTMonthly number of other small animals hunted.
BDMonthly number of blue duikers hunted.
OFFSETMonthly number of hunter days.
HUNTERHunter index.
MONTHMonth index.
MONTHCALENDARMonth as calendar number (1 = January, ..., 12 = December).
YEARCalendar year (2010–2013).
HUNTER.MONTHIndex indicating observations taken for the same hunter and month.
Source
Bonat, W. H., et al. (2017). "Modelling the covariance structure in marginal multivariate count models: Hunting in Bioko Island." Journal of Agricultural, Biological and Environmental Statistics, 22(4):446–464.
Bonat, W. H. (2018). "Multiple Response Variables Regression Models in R: The mcglm Package." Journal of Statistical Software, 84(4):1–30.
Examples
library(mcglm)
library(Matrix)
data(Hunting, package = "mcglm")
formu <- OT ~ METHOD*ALT + SEX + ALT*poly(MONTH, 4)
Z0 <- mc_id(Hunting)
Z1 <- mc_mixed(~0 + HUNTER.MONTH, data = Hunting)
fit <- mcglm(linear_pred = c(formu),
matrix_pred = list(c(Z0, Z1)),
link = c("log"),
variance = c("poisson_tweedie"),
power_fixed = c(FALSE),
control_algorithm = list(max_iter = 100),
offset = list(log(Hunting$OFFSET)),
data = Hunting)
summary(fit)
anova(fit)
Respiratory Physiotherapy on Premature Newborns
Description
The NewBorn dataset is from a prospective study assessing the effect of respiratory physiotherapy on cardiopulmonary function in ventilated preterm newborn infants with birth weight less than 1500 g. The dataset was collected by the nursing team of Waldemar Monastier Hospital, Campo Largo, PR, Brazil, and analyzed in Bonat and Jorgensen (2016) as an example of mixed outcomes regression models.
Usage
data(NewBorn)
Format
A data.frame with 270 observations and 21 variables:
SexFactor with levels
FemaleandMale.GAGestational age in weeks.
BWBirth weight in grams.
APGAR1MAPGAR index at the first minute of life.
APGAR5MAPGAR index at the fifth minute of life.
PREFactor indicating prematurity (YES/NO).
HDFactor indicating Hansen's disease (YES/NO).
SURFactor indicating surfactant administration (YES/NO).
JAUFactor indicating jaundice (YES/NO).
PNEFactor indicating pneumonia (YES/NO).
PDAFactor indicating persistence of ductus arteriosus (YES/NO).
PPIFactor indicating primary pulmonary infection (YES/NO).
OTHERSFactor indicating other diseases (YES/NO).
DAYSAge in days.
AUXFactor indicating type of respiratory auxiliary (HOOD/OTHERS).
RRRespiratory rate (continuous).
HRHeart rate (continuous).
SPO2Oxygen saturation (bounded).
TREATFactor with three levels: Respiratory physiotherapy, Evaluation 1, Evaluation 2, Evaluation 3.
NBINewborn index.
TIMEDays of treatment.
Source
Bonat, W. H. and Jorgensen, B. (2016). "Multivariate covariance generalized linear models." Journal of Royal Statistical Society, Series C, 65:649–675.
Examples
library(mcglm)
library(Matrix)
data(NewBorn, package = "mcglm")
# Linear predictor example
formu <- SPO2 ~ Sex + APGAR1M + APGAR5M + PRE + HD + SUR
Z0 <- mc_id(NewBorn)
fit <- mcglm(
linear_pred = c(formu),
matrix_pred = list(Z0),
link = "logit",
variance = "binomialP",
power_fixed = TRUE,
data = NewBorn,
control_algorithm = list(verbose = FALSE, tuning = 0.5)
)
summary(fit)
Rotnitzky–Jewell Information Criterion
Description
Computes the Rotnitzky–Jewell information criterion (RJC) for objects of
class mcglm. This criterion is based on quasi-likelihood theory
and is intended for model assessment in marginal models.
Usage
RJC(object, id, verbose = TRUE)
Arguments
object |
An object of class |
id |
An integer or factor vector identifying the clusters. Its length and ordering must match the number and ordering of the observations used to fit the model. |
verbose |
Logical. If |
Details
The RJC is defined using the sensitivity and variability structures of the estimating equations and measures the discrepancy between them. The implementation assumes that the data are correctly ordered such that observations belonging to the same cluster are stored in contiguous rows.
Warning: This function is restricted to models with a single response variable.
Value
An invisible list with a single component:
- RJC
A numeric scalar giving the value of the Rotnitzky–Jewell information criterion.
Author(s)
Wagner Hugo Bonat, wbonat@ufpr.br
Source
Wang, M. (2014). Generalized estimating equations in longitudinal data analysis: A review and recent developments. Advances in Statistics, 1(1), 1–13.
See Also
gof, plogLik, pAIC, pKLIC,
ESS, GOSHO
Australian Health Survey Data
Description
The Australian Health Survey (AHS) was used by Bonat and Jorgensen (2016) as an example of multivariate count regression modeling. The dataset contains five count response variables related to health system usage and nine covariates related to social conditions in Australia for the years 1987-88.
Usage
data(ahs)
Format
A data.frame with 5190 observations and 15 variables:
sexFactor with levels
maleandfemale.ageRespondent's age in years divided by 100.
incomeRespondent's annual income in Australian dollars divided by 1000.
levyplusFactor indicating coverage by private health insurance for private patients in public hospital with doctor of choice (1) or otherwise (0).
freepoorFactor indicating government coverage due to low income, recent immigration, or unemployment (1) or otherwise (0).
freerepaFactor indicating government coverage due to old-age/disability pension, veteran status, or family of deceased veteran (1) or otherwise (0).
illnesNumber of illnesses in the past two weeks, capped at 5.
actdaysNumber of days of reduced activity in the past two weeks due to illness or injury.
hscoreGeneral health questionnaire score (Goldberg's method); higher scores indicate poorer health.
chcondFactor with levels:
limited(chronic condition with activity limitation),nonlimited(chronic condition without limitation),otherwise(reference level).NdocNumber of consultations with a doctor or specialist (response variable).
NndocNumber of consultations with health professionals (response variable).
NadmNumber of admissions to hospital, psychiatric hospital, nursing, or convalescence home in the past 12 months (response variable).
NhospNumber of nights in a hospital during the most recent admission.
NmedTotal number of prescribed and non-prescribed medications used in the past two days.
Source
Deb, P. and Trivedi, P. K. (1997) "Demand for medical care by the elderly: A finite mixture approach." Journal of Applied Econometrics, 12(3):313–336.
Bonat, W. H. and Jorgensen, B. (2016) "Multivariate covariance generalized linear models." Journal of the Royal Statistical Society: Series C, 65:649–675.
Examples
library(mcglm)
data(ahs, package = "mcglm")
form1 <- Ndoc ~ income + age
form2 <- Nndoc ~ income + age
Z0 <- mc_id(ahs)
fit.ahs <- mcglm(linear_pred = c(form1, form2),
matrix_pred = list(Z0, Z0),
link = c("log", "log"),
variance = c("poisson_tweedie", "poisson_tweedie"),
data = ahs)
summary(fit.ahs)
Wald Tests for Fixed Effects in mcglm Models
Description
Performs Wald chi-square tests for assessing the significance of
fixed-effect terms in the linear predictors of an mcglm model.
The tests are conducted separately for each response variable and
are particularly useful for joint hypothesis testing of regression
coefficients associated with categorical covariates with more than
two levels. This function is not intended for model comparison.
Usage
## S3 method for class 'mcglm'
anova(object, ..., verbose = TRUE)
Arguments
object |
An object of class |
... |
Additional arguments. Currently ignored. |
verbose |
Logical indicating whether the Wald test results should be printed
to the console. If |
Details
The Wald tests are computed using the observed covariance matrix of the regression parameter estimates. For each response variable, joint tests are performed for sets of parameters corresponding to the same model term, as defined by the design matrix.
Value
A list of data frames, one for each response variable. Each data frame contains the results of Wald chi-square tests for the fixed-effect terms in the corresponding linear predictor, with the following columns:
- Covariate
Name of the covariate or model term tested.
- Chi.Square
Value of the Wald chi-square statistic.
- Df
Degrees of freedom associated with the test.
- p.value
P-value of the Wald test.
The returned object is invisible and is primarily intended for programmatic use.
Author(s)
Wagner Hugo Bonat, wbonat@ufpr.br
See Also
summary.mcglm, coef.mcglm,
vcov.mcglm
Examples
x1 <- seq(-1, 1, length.out = 100)
x2 <- gl(5, 20)
beta <- c(5, 0, -2, -1, 1, 2)
X <- model.matrix(~ x1 + x2)
set.seed(123)
y <- rnorm(100, mean = X %*% beta, sd = 1)
data <- data.frame(y = y, x1 = x1, x2 = x2)
fit <- mcglm(c(y ~ x1 + x2), list(mc_id(data)), data = data)
anova(fit)
Model Coefficients
Description
Extract regression, dispersion and correlation parameter estimates
from objects of class mcglm.
Usage
## S3 method for class 'mcglm'
coef(
object,
std.error = FALSE,
response = c(NA, 1:length(object$beta_names)),
type = c("beta", "tau", "power", "correlation"),
...
)
Arguments
object |
An object of class |
std.error |
Logical indicating whether standard errors should be returned
alongside the parameter estimates. Default is |
response |
Integer vector indicating for which response variables the
coefficients should be returned. If |
type |
Character vector specifying which type of coefficients should be
returned. Possible values are |
... |
Additional arguments. Currently ignored and included for
compatibility with the generic |
Value
A data.frame with one row per parameter, containing:
-
Estimates: parameter estimates; -
Std.error: standard errors (if requested); -
Parameters: parameter names; -
Type: parameter type; -
Response: response variable index.
Author(s)
Wagner Hugo Bonat, wbonat@ufpr.br
Confidence Intervals for Model Parameters
Description
Computes Wald-type confidence intervals for parameter estimates
from a fitted mcglm model, based on asymptotic normality.
Usage
## S3 method for class 'mcglm'
confint(object, parm, level = 0.95, ...)
Arguments
object |
A fitted object of class |
parm |
Optional specification of parameters for which confidence intervals are required. Can be a numeric vector of indices or a character vector of parameter names. If omitted, confidence intervals for all parameters are returned. |
level |
Numeric value giving the confidence level. Must be between 0 and 1.
Default is |
... |
Additional arguments. Currently ignored and included for
compatibility with the generic
|
Value
A numeric matrix with two columns corresponding to the lower and upper confidence limits. Rows correspond to model parameters.
Author(s)
Wagner Hugo Bonat, wbonat@ufpr.br
Cross variability matrix
Description
Compute the cross-covariance matrix between covariance and regression parameters. Equation (11) of Bonat and Jorgensen (2016).
Usage
covprod(A, res, W)
Arguments
A |
A matrix. |
res |
A vector of residuals. |
W |
A matrix of weights. The returned object is intended for internal use only. |
Author(s)
Wagner Hugo Bonat
Chaser and Reciprocal Likelihood Algorithms
Description
This function implements the two main algorithms used for fitting multivariate covariance generalized linear models. The chaser and the reciprocal likelihood algorithms.
Usage
fit_mcglm(list_initial, list_link, list_variance,
list_covariance, list_X, list_Z, list_offset,
list_Ntrial, list_power_fixed, list_sparse,
y_vec, correct, max_iter, tol, method,
tuning, verbose, weights)
Arguments
list_initial |
a list of initial values for regression and covariance parameters. |
list_link |
a list specifying the link function names. |
list_variance |
a list specifying the variance function names.
Options are: |
list_covariance |
a list of covariance function names. Options
are: |
list_X |
a list of design matrices.
See |
list_Z |
a list of knowm matrices to compose the matrix linear predictor. |
list_offset |
a list of offset values. Default NULL. |
list_Ntrial |
a list of number of trials, useful only when analysing binomial data. Default 1. |
list_power_fixed |
a list of logicals indicating if the power
parameters should be estimated or not. Default |
list_sparse |
a list of logicals indicating if the matrices should be set up as sparse matrices. This argument is useful only when using exponential-matrix covariance link function. In the other cases the algorithm detects automatically if the matrix should be sparse or not. |
y_vec |
a vector of the stacked response variables. |
correct |
a logical indicating if the algorithm will use the
correction term or not. Default |
max_iter |
maximum number of iterations. Default |
tol |
a numeric specyfing the tolerance. Default |
method |
a string specyfing the method used to fit the models
( |
tuning |
a numeric value in general close to zero for the rc
method and close to 1 for the chaser method. This argument control
the step-length. Default |
verbose |
a logical if TRUE print the values of the covariance
parameters used on each iteration. Default |
weights |
Vector of weights for model fitting. |
Value
A list with the results of the estimation procedure for multivariate covariance generalized linear models.
The object contains regression parameter estimates, covariance (dispersion) parameter estimates, fitted values, residuals and information about the iterative fitting process, such as the number of iterations, convergence status, sensitivity and variability matrices.
The returned object is intended for internal use. End users should
rely on the output provided by mcglm, which wraps this
function.
Author(s)
Wagner Hugo Bonat, wbonat@ufpr.br
Source
Bonat, W. H. and Jorgensen, B. (2016) Multivariate covariance generalized linear models. Journal of Royal Statistical Society - Series C 65:649–675.
Bonat, W. H. (2018). Multiple Response Variables Regression Models in R: The mcglm Package. Journal of Statistical Software, 84(4):1–30.
See Also
mcglm, mc_matrix_linear_predictor,
mc_link_function and
mc_variance_function.
Fitted Values
Description
Extract fitted (mean) values from a fitted mcglm model.
For multivariate responses, fitted values are returned in matrix
form, with one column per response variable.
Usage
## S3 method for class 'mcglm'
fitted(object, ...)
Arguments
object |
A fitted object of class |
... |
Additional arguments. Currently ignored and included for
compatibility with the generic
|
Value
A numeric matrix of fitted values. Rows correspond to observations and columns correspond to response variables.
Author(s)
Wagner Hugo Bonat, wbonat@ufpr.br
Measures of Goodness-of-Fit
Description
Extract the pseudo Gaussian log-likelihood (plogLik),
pseudo Akaike Information Criterion (pAIC), pseudo Kullback-Leibler
Information Criterion (pKLIC) and pseudo Bayesian Information Criterion (pBIC)
for objects of mcglm class.
Usage
gof(object)
Arguments
object |
an object or a list of objects representing a model
of |
Value
A data frame with the following columns:
- plogLik
Numeric value of the pseudo Gaussian log-likelihood.
- Df
Integer giving the number of estimated parameters.
- pAIC
Numeric value of the pseudo Akaike Information Criterion.
- pKLIC
Numeric value of the pseudo Kullback–Leibler Information Criterion.
- BIC
Numeric value of the pseudo Bayesian Information Criterion.
Author(s)
Wagner Hugo Bonat, wbonat@ufpr.br
Source
Bonat, W. H. (2018). Multiple Response Variables Regression Models in R: The mcglm Package. Journal of Statistical Software, 84(4):1–30.
Wang, M. (2014). Generalized Estimating Equations in Longitudinal Data Analysis: A Review and Recent Developments. Advances in Statistics, 1(1)1–13.
See Also
plogLik, pAIC, pKLIC and pBIC.
Wald Tests for Dispersion Components
Description
Performs Wald chi-square tests for dispersion (covariance) parameters
by response variable in multivariate covariance generalized linear
models fitted with mcglm. This function is intended for
joint hypothesis testing of dispersion coefficients associated with
categorical covariates with more than two levels. It is not designed
for model comparison.
Usage
mc_anova_disp(object, idx_list, names_list, ...)
Arguments
object |
An object of class |
idx_list |
A list of integer vectors indexing dispersion parameters to be jointly tested for each response. |
names_list |
A list of character vectors with covariate names to be displayed in the output tables. |
... |
Currently not used. |
Value
The object is a list of data frames, one per response variable. Each data frame contains the following columns:
- Covariate
Name of the covariate associated with the dispersion parameters being tested.
- Chi.Square
Wald chi-square test statistic.
- Df
Degrees of freedom of the test.
- p.value
P-value associated with the chi-square test.
See Also
Bias-corrected Standard Error for Regression Parameters
Description
Compute bias-corrected standard error for regression
parameters in the context of clustered observations for an
object of mcglm class. It is also robust and has improved
finite sample properties.
Usage
mc_bias_corrected_std(object, id)
Arguments
object |
an object of |
id |
a vector which identifies the clusters. The length and
order of |
Value
A list with two elements. A vector of standard error and a variance-covariance matrix computed based on a bias corrected approach as described in the reference.
Author(s)
Wagner Hugo Bonat, wbonat@ufpr.br
Source
Nuamah, I. F. and Qu, Y. and Aminu, S. B. (1996). A SAS macro for stepwise correlated binary regression. Computer Methods and Programs in Biomedicine 49, 199–210.
See Also
mc_robust_std.
Build the joint covariance matrix
Description
This function builds the joint variance-covariance matrix using the Generalized Kronecker product and its derivatives with respect to rho, power and tau parameters.
Usage
mc_build_C(
list_mu,
list_Ntrial,
rho,
list_tau,
list_power,
list_Z,
list_sparse,
list_variance,
list_covariance,
list_power_fixed,
compute_C = FALSE,
compute_derivative_beta = FALSE,
compute_derivative_cov = TRUE
)
Arguments
list_mu |
A list with values of the mean. |
list_Ntrial |
A list with the number of trials. Usefull only for binomial responses. |
rho |
Vector of correlation parameters. |
list_tau |
A list with values for the tau parameters. |
list_power |
A list with values for the power parameters. |
list_Z |
A list of matrix to be used in the matrix linear predictor. |
list_sparse |
A list with Logical. |
list_variance |
A list specifying the variance function to be used for each response variable. |
list_covariance |
A list specifying the covariance function to be used for each response variable. |
list_power_fixed |
A list of Logical specifying if the power parameters are fixed or not. |
compute_C |
Logical. Compute or not the C matrix. |
compute_derivative_beta |
Logical. Compute or not the derivative of C with respect to regression parameters. |
compute_derivative_cov |
Logical. Compute or not the derivative of C with respect the covariance parameters. |
Value
A list of matrices. This function return the variance-covariance matrix C or its inverse along with its respective matrices of derivatives with respect to rho, power and tau parameters. The returned object is intended for internal use only.
Author(s)
Wagner Hugo Bonat
Auxiliar function: Build F matrix for Wald multivariate test
Description
The function mc_build_F is just an auxiliar
function to construct the design matrix for multivariate Wald-type
test.
Usage
mc_build_F(vector)
Arguments
vector |
A vector indexing model regression coefficients. |
Value
A matrix. The returned object is intended for internal use only.
Author(s)
Wagner Hugo Bonat
Build a block-diagonal matrix of zeros.
Description
Build a block-diagonal matrix of zeros. Such functions is used when computing the derivatives of the Cholesky decomposition of C.
Usage
mc_build_bdiag(n_resp, n_obs)
Arguments
n_resp |
A numeric specifyng the number of response variables. |
n_obs |
A numeric specifying the number of observations in the data set. |
Details
It is an internal function.
Value
A list of matrices with elements equal to zero. The returned object is intended for internal use only.
Author(s)
Wagner Hugo Bonat
Build omega matrix
Description
This function build \Omega matrix according the
covariance link function.
Usage
mc_build_omega(tau, Z, covariance_link, sparse = FALSE)
Arguments
tau |
A vector |
Z |
A list of matrices. |
covariance_link |
String specifing the covariance link function: identity, inverse, expm. |
sparse |
Logical force to use sparse matrix representation 'dsCMatrix'. |
Value
A list of matrices. The function returns the \Omega matrix its
inverse and derivatives with respect to \tau.
The returned object is intended for internal use only.
Author(s)
Wagner Hugo Bonat
Build variance-covariance matrix
Description
This function builds a variance-covariance matrix, based on the variance function and omega matrix.
Usage
mc_build_sigma(
mu,
Ntrial = 1,
tau,
power,
Z,
sparse,
variance,
covariance,
power_fixed,
compute_derivative_beta = FALSE
)
Arguments
mu |
A numeric vector. In general the output from
|
Ntrial |
A numeric vector, or NULL or a numeric specifing the number of trials in the binomial experiment. It is usefull only when using variance = binomialP or binomialPQ. In the other cases it will be ignored. |
tau |
A numeric vector. |
power |
A numeric or numeric vector. It should be one number for all variance functions except binomialPQ, in that case the argument specifies both p and q. |
Z |
A list of matrices. |
sparse |
Logical. |
variance |
String specifing the variance function: constant, tweedie, poisson_tweedie, binomialP or binomialPQ. |
covariance |
String specifing the covariance function: identity, inverse or expm. |
power_fixed |
Logical if the power parameter is fixed at initial value (TRUE). In the case power_fixed = FALSE the power parameter will be estimated. |
compute_derivative_beta |
Logical. Compute or not the derivative with respect to regression parameters. |
Value
A list of matrices. The function returns a list of matrices with the
Cholesky decomposition of \Sigma, \Sigma^{-1} and the derivative
of \Sigma with respect to the power and tau parameters.
The returned object is intended for internal use only.
Author(s)
Wagner Hugo Bonat
See Also
mc_link_function,
mc_variance_function, mc_build_omega.
Build the correlation matrix between response variables
Description
This function builds the correlation matrix between response variable, its inverse and derivatives.
Usage
mc_build_sigma_between(rho, n_resp, inverse = FALSE)
mc_derivative_sigma_between(n_resp)
Arguments
rho |
A numeric vector. |
n_resp |
A numeric. |
inverse |
Logical. |
Value
A list of matrices. The function returns a list of matrices with sigmab
and its derivatives with respect to rho. The returned object is intended for internal use only.
Author(s)
Wagner Hugo Bonat
Conditional Autoregressive Model Structure
Description
The function mc_car helps to build the components
of the matrix linear predictor used for fitting conditional
autoregressive models. This function is used in general for fitting
spatial areal data using the well known conditional autoregressive
models (CAR). This function depends on a list of neighboors, such a
list can be constructed, for example using the
tri2nb function from the spdep package
based on spatial coordinates. This way to specify the matrix linear
predictor can also be applied for spatial continuous data,
as an approximation.
Usage
mc_car(list_neigh, intrinsic = FALSE)
Arguments
list_neigh |
list of neighboors. |
intrinsic |
logical. |
Value
A list of a matrix (intrinsic = TRUE) or two matrices
(intrinsic = FALSE). The main use of these matrices are as input in the
mcglm function as linear covariance models in the argument matrix_pred.
Author(s)
Wagner Hugo Bonat, wbonat@ufpr.br
Source
Bonat, W. H. (2018). Multiple Response Variables Regression Models in R: The mcglm Package. Journal of Statistical Software, 84(4):1–30.
See Also
mc_id, mc_compute_rho, mc_conditional_test,
mc_dist, mc_ma, mc_rw
and mc_mixed.
Complete Data Set with NA
Description
The function mc_complete_data completes a data
set with NA values for helping to construct the components of the
matrix linear predictor in models that require equal number of
observations by subjects (id).
Usage
mc_complete_data(data, id, index, id.exp)
Arguments
data |
a data.frame to be completed with NA. |
id |
name of the column (string) containing the subject id. |
index |
name of the column (string) containing the index to be completed. |
id.exp |
how the index is expected to be for all subjects. |
Value
A data.frame with the same number of observations by subject. It is intended as a helper function to build the linear matrix predictor for models that require the same number of observations by subjects.
Author(s)
Wagner Hugo Bonat, wbonat@ufpr.br
Source
Bonat, W. H. (2018). Multiple Response Variables Regression Models in R: The mcglm Package. Journal of Statistical Software, 84(4):1–30.
See Also
mc_dglm, mc_ns, mc_ma and mc_rw.
Autocorrelation Estimates
Description
Compute autocorrelation estimates based on a fitted model
using the mc_car structure. The mcglm approach fits
models using a linear covariance structure, but in general in this
parametrization for spatial models the parameters have no simple
interpretation in terms of spatial autocorrelation.
The function mc_compute_rho computes the autocorrelation
based on a fitted model.
Usage
mc_compute_rho(object, level = 0.975)
Arguments
object |
an object or a list of objects representing a model
of |
level |
the confidence level required. |
Value
Returns a data frame with parameter estimate, standard error and
confidential interval for the spatial autocorrelation parameter. It is used
in the case of spatial models using the mc_car specification.
Author(s)
Wagner Hugo Bonat, wbonat@ufpr.br
Source
Bonat, W. H. (2018). Multiple Response Variables Regression Models in R: The mcglm Package. Journal of Statistical Software, 84(4):1–30.
See Also
mc_car and mc_conditional_test.
Conditional Hypotheses Tests
Description
Compute conditional hypotheses tests for fitted
mcglm model class.
When fitting models with extra power parameters, the standard errors
associated with the dispersion parameters can be large. In that cases,
we suggest to conduct conditional hypotheses test instead of the
orthodox marginal test for the dispersion parameters.
The function mc_conditional_test offers an ease way to
conduct such conditional test. Furthermore, the function is quite
flexible and can be used for any other conditional hypotheses test.
Usage
mc_conditional_test(object, parameters, test, fixed)
Arguments
object |
an object representing a model of |
parameters |
which parameters will be included in the conditional test. |
test |
index indicating which parameters will be tested given the values of the other parameters. |
fixed |
index indicating which parameters should be fixed on the conditional test. |
Value
Returns a data frame with parameter estimates, conditional standard errors and Z-statistics. It is used to compute conditional hypothesis tests in mcglm fitted models.
Author(s)
Wagner Hugo Bonat, wbonat@ufpr.br
Source
Bonat, W. H. (2018). Multiple Response Variables Regression Models in R: The mcglm Package. Journal of Statistical Software, 84(4):1–30.
Core of the Pearson estimating function.
Description
Core of the Pearson estimating function.
Usage
mc_core_pearson(product, inv_C, res, W)
Arguments
product |
A matrix. |
inv_C |
A matrix. |
res |
A vector of residuals. |
W |
Matrix of weights. |
Details
It is an internal function.
Value
A vector. The returned object is intended for internal use only.
Author(s)
Wagner Hugo Bonat
Pearson correction term
Description
Compute the correction term associated with the Pearson estimating function.
Usage
mc_correction(D_C, inv_J_beta, D, inv_C)
Arguments
D_C |
A list of matrices. |
inv_J_beta |
A matrix. In general it is computed based on the
output of the |
D |
A matrix. In general it is the output of the mc_link_function. |
inv_C |
A matrix. In general the output of the mc_build_C. |
Details
It is an internal function useful inside the fitting algorithm.
Value
A vector with the correction terms to be used on the Pearson estimating function. The returned object is intended for internal use only.
Author(s)
Wagner Hugo Bonat
Cross-sensitivity
Description
Compute the cross-sensitivity matrix between regression and covariance parameters. Equation 10 of Bonat and Jorgensen (2015).
Usage
mc_cross_sensitivity(
Product_cov,
Product_beta,
n_beta_effective = length(Product_beta)
)
Arguments
Product_cov |
A list of matrices. |
Product_beta |
A list of matrices. |
n_beta_effective |
Numeric. Effective number of regression parameters. |
Value
The cross-sensitivity matrix. Equation (10) of Bonat and Jorgensen (2016). The returned object is intended for internal use only.
Author(s)
Wagner Hugo Bonat
Compute the cross-variability matrix
Description
Compute the cross-variability matrix between covariance and regression parameters.
Usage
mc_cross_variability(Product_cov, inv_C, res, D)
Arguments
Product_cov |
A list of matrices. |
inv_C |
A matrix. |
res |
A vector. |
D |
A matrix. |
Value
The cross-variability matrix between regression and covariance parameters. The returned object is intended for internal use only.
Author(s)
Wagner Hugo Bonat
Derivative of C with respect to rho.
Description
Compute the derivative of the C matrix with respect to the correlation parameters rho.
Usage
mc_derivative_C_rho(
D_Sigmab,
Bdiag_chol_Sigma_within,
t_Bdiag_chol_Sigma_within,
II
)
Arguments
D_Sigmab |
A matrix. |
Bdiag_chol_Sigma_within |
A block-diagonal matrix. |
t_Bdiag_chol_Sigma_within |
A block-diagonal matrix. |
II |
A diagonal matrix. |
Details
It is an internal function used to build the derivatives of the C matrix.
Value
A list of matrices. The function returns a list of matrices corresponding to the derivatives of the variance-covariance matrix with respect to the correlations parameters rho. The returned object is intended for internal use only.
Author(s)
Wagner Hugo Bonat
Derivatives of the Cholesky decomposition
Description
This function compute the derivative of the Cholesky decomposition.
Usage
mc_derivative_cholesky(derivada, inv_chol_Sigma, chol_Sigma)
Arguments
derivada |
A matrix. |
inv_chol_Sigma |
A matrix. |
chol_Sigma |
A matrix. |
Details
It is an internal function.
Value
A list of matrix. The matrices are the derivative of the Cholesky decomposition of Sigma or its inverse with respect to the dispersion parameters. The returned object is intended for internal use only.
Author(s)
Wagner Hugo Bonat
Derivative of exponential-matrix function
Description
Compute the derivative of the exponential-matrix covariance link function.
Usage
mc_derivative_expm(dU, UU, inv_UU, Q, n = dim(UU)[1], sparse = FALSE)
Arguments
dU |
A matrix. |
UU |
A matrix. |
inv_UU |
A matrix. |
Q |
A numeric vector. |
n |
A numeric. |
sparse |
Logical. |
Details
Many arguments required by this function are provide by the
link[mcglm]{mc_dexpm}. The argument dU is the derivative
of the U matrix with respect to the models parameters. It should
be computed by the user.
Value
A matrix with the derivative of the exponential matrix covariance link function. The returned object is intended for internal use only.
Author(s)
Wagner Hugo Bonat
See Also
expm, link[mcglm]{mc_dexp_gold}
and link[mcglm]{mc_dexpm}.
Derivatives of V ^{1/2} with respect to beta.
Description
Compute the derivatives of V^{1/2} matrix with
respect to the regression parameters beta.
Usage
mc_derivative_sigma_beta(D, D_V_sqrt_mu, Omega, V_sqrt, variance)
Arguments
D |
A matrix. |
D_V_sqrt_mu |
A matrix. |
Omega |
A matrix. |
V_sqrt |
A matrix. |
variance |
A string specifying the variance function name. |
Value
A list of matrices, with the derivatives of V^{1/2}
with respect to the regression parameters.
The returned object is intended for internal use only.
Author(s)
Wagner Hugo Bonat
Exponential-matrix and its derivatives
Description
Given a matrix M and its derivative dM the
function dexp_gold returns the exponential-matrix
expm(M) and its derivative. This function is based on the
expm function. It is not really used in the
package, but I keep this function to test my own implementation
based on eigen values decomposition.
Usage
mc_dexp_gold(M, dM)
Arguments
M |
A matrix. |
dM |
A matrix. |
Value
A list with two elements: expm(M) and its derivatives.
The returned object is intended for internal use only.
Author(s)
Wagner Hugo Bonat
See Also
Examples
M <- matrix(c(1,0.8,0.8,1), 2,2)
dM <- matrix(c(0,1,1,0),2,2)
mcglm::mc_dexp_gold(M = M, dM = dM)
Double Generalized Linear Models Structure
Description
The function mc_dglm builds the components
of the matrix linear predictor used for fitting double generalized
linear models.
Usage
mc_dglm(formula, id, data)
Arguments
formula |
a formula spefying the components of the covariance structure. |
id |
name of the column (string) containing the subject index. (If ts is not repeated measures, use id = 1 for all observations). |
data |
data set. |
Value
A list containing diagonal matrices with entries defined by the
covariates assumed to describe the matrix linear predictor.
Each matrix corresponds to one component of the covariance model and
is intended to be supplied to the matrix_pred argument of
mcglm.
Author(s)
Wagner Hugo Bonat, wbonat@ufpr.br
Source
Bonat, W. H. (2018). Multiple Response Variables Regression Models in R: The mcglm Package. Journal of Statistical Software, 84(4):1–30.
See Also
mc_id, mc_dist, mc_ma, mc_rw
and mc_mixed.
Distance Models Structure
Description
The function mc_dist helps to build the components
of the matrix linear predictor using matrices based on distances.
This function is generaly used for the analysis of longitudinal and
spatial data. The idea is to use the inverse of some measure of distance
as for example the Euclidean distance to model the covariance structure
within response variables. The model can also use the inverse of
distance squared or high order power.
Usage
mc_dist(id, time, data, method = "euclidean")
Arguments
id |
name of the column (string) containing the subject index.
For spatial data use the same |
time |
name of the column (string) containing the index indicating the time. For spatial data use the same index for all observations. |
data |
data set. |
method |
distance measure to be used. |
Details
The distance measure must be one of "euclidean",
"maximum", "manhattan", "canberra",
"binary" or "minkowski". This function is a customize
call of the dist function.
Value
A list containing a sparse matrix of class dgCMatrix.
This matrix represents the design matrix for the linear predictor and
is intended to be supplied to the matrix_pred argument of
mcglm.
Author(s)
Wagner Hugo Bonat, wbonat@ufpr.br
Source
Bonat, W. H. (2018). Multiple Response Variables Regression Models in R: The mcglm Package. Journal of Statistical Software, 84(4):1–30.
See Also
dist, mc_id,
mc_conditional_test, mc_car, mc_ma,
mc_rw and mc_mixed.
Examples
id <- rep(1:2, each = 4)
time <- rep(1:4, 2)
data <- data.frame("id" = id, "time" = time)
mc_dist(id = "id", time = "time", data = data)
mc_dist(id = "id", time = "time", data = data, method = "canberra")
Exponential-matrix covariance link function
Description
Given a matrix U the function mc_expm
returns the exponential-matrix expm(U) and some auxiliares
matrices to compute its derivatives. This function is based on
the eigen-value decomposition it means that it is very slow.
Usage
mc_expm(U, n = dim(U)[1], sparse = FALSE, inverse = FALSE)
Arguments
U |
A matrix. |
n |
A number specifing the dimension of the matrix U. Default
|
sparse |
Logical defining the class of the output matrix. If
|
inverse |
Logical defining if the inverse will be computed or not. |
Value
A list containing \Omega = expm(U), its inverse (if
inverse = TRUE), and auxiliary matrices used to compute the
derivatives. The returned object is intended for internal use only.
Author(s)
Wagner Hugo Bonat
See Also
expm, eigen,
link[mcglm]{mc_dexp_gold}.
Getting information about model parameters
Description
This computes all information required about the number of model parameters.
Usage
mc_getInformation(list_initial, list_power_fixed, n_resp)
Arguments
list_initial |
A list of initial values. |
list_power_fixed |
A list of logical specyfing if the power parameters should be estimated or not. |
n_resp |
A number specyfing the nmber of response variables. |
Value
A list containing information about the number of model parameters:
- n_betas
A list with the number of regression parameters (
\beta) for each response variable.- n_taus
A list with the number of dispersion parameters (
\tau) for each response variable.- n_power
A list with the number of power parameters for each response variable, accounting for fixed parameters.
- n_rho
An integer giving the number of correlation parameters.
- n_cov
An integer giving the total number of covariance-related parameters, including power, dispersion, and correlation parameters.
Author(s)
Wagner Hugo Bonat
Independent Model Structure
Description
Builds an identity matrix to be used as a component of the matrix linear predictor. It is in general the first component of the matrix linear predictor, a kind of intercept matrix.
Usage
mc_id(data)
Arguments
data |
the data set to be used. |
Value
A list with a single component:
- Z0
A sparse identity matrix of class
ddiMatrixwith dimension equal to the number of observations, representing the intercept component of the matrix linear predictor.
It is intended to be supplied to the matrix_pred argument of
mcglm.
Author(s)
Wagner Hugo Bonat, wbonat@ufpr.br
Source
Bonat, W. H. (2018). Multiple Response Variables Regression Models in R: The mcglm Package. Journal of Statistical Software, 84(4):1–30.
See Also
mc_dist, mc_ma, mc_rw and mc_mixed.
Automatic Initial Values
Description
This function provides initial values to be used when
fitting multivariate covariance generalized linear models by using
the function mcglm. In general the users do not need to use
this function, since it is already employed when setting the argument
control_initial = "automatic" in the mcglm function.
However, if the users want to change some of the initial values,
this function can be useful.
Usage
mc_initial_values(linear_pred, matrix_pred, link, variance,
covariance, offset, Ntrial, contrasts, data)
Arguments
linear_pred |
a list of formula see |
matrix_pred |
a list of known matrices to be used on the matrix
linear predictor. |
link |
a list of link functions names, see
|
variance |
a list of variance functions names, see
|
covariance |
a list of covariance link functions names, see
|
offset |
a list of offset values if any. |
Ntrial |
a list of the number of trials on Bernoulli
experiments. It is useful only for |
contrasts |
list of contrasts to be used in the
|
data |
data frame. |
Details
To obtain initial values for multivariate covariance
generalized linear models the function
mc_initial_values fits
a generalized linear model (GLM) using the function glm with
the specified linear predictor and link function for each response
variables considering independent observations. The family
argument is always specified as quasi. The link function depends
on the specification of the argument link.
The variance function is always specified as "mu" the only
excession appears when using variance = "constant" then the
family argument in the glm function is specified as
quasi(link = link, variance = "constant"). The estimated value
of the dispersion parameter from the glm function is used as
initial value for the first component of the matrix linear predictor,
for all other components the value zero is used.
For the
cases covariance = "inverse" and covariance = "expm"
the inverse and the logarithm of the estimated dispersion parameter
is used as initial value for the first component of the matrix linear
predictor. The value of the power parameter is always started at 1.
In the cases of multivariate models the correlation between response
variables is always started at 0.
Value
A list containing initial values for model parameters:
- regression
A list of numeric vectors with initial values for the regression parameters of each response variable.
- power
A list of numeric vectors with initial values for the power parameters associated with the variance functions.
- tau
A list of numeric vectors with initial values for the dispersion and covariance-related parameters in the matrix linear predictor.
- rho
A numeric vector with initial values for the correlation parameters between response variables.
Author(s)
Wagner Hugo Bonat, wbonat@ufpr.br
Link Functions
Description
The mc_link_function is a customized call of the
make.link function.
Given the name of a link function, it returns a list with two
elements. The first element is the inverse of the link function
applied on the linear predictor \mu = g^{-1}(X\beta). The
second element is the derivative of \mu with respect to the
regression parameters \beta.
It will be useful when computing the quasi-score function.
Usage
mc_link_function(beta, X, offset, link)
mc_logit(beta, X, offset)
mc_probit(beta, X, offset)
mc_cauchit(beta, X, offset)
mc_cloglog(beta, X, offset)
mc_loglog(beta, X, offset)
mc_identity(beta, X, offset)
mc_log(beta, X, offset)
mc_sqrt(beta, X, offset)
mc_invmu2(beta, X, offset)
mc_inverse(beta, X, offset)
Arguments
beta |
a numeric vector of regression parameters. |
X |
a design matrix, see |
offset |
a numeric vector of offset values. It will be sum up on
the linear predictor as a covariate with known regression
parameter equals one ( |
link |
a string specifying the name of the link function.
Options are: |
Details
The link function is an important component of the
multivariate covariance generalized linear models, since it links
the expectation of the response variable with the covariates.
Let \beta be a (p x 1) regression parameter vector and
X be an (n x p) design matrix. The expected value of
the response variable Y is given by
E(Y) =
g^{-1}(X\beta),
where g is the link function and \eta
= X\beta is the linear predictor. Let D be a (n x p)
matrix whose entries are given by the derivatives of \mu
with respect to \beta. Such a matrix will be required for the
fitting algorithm. The function mc_link_function returns a
list where the first element is \mu (n x 1) vector
and the second is the D (n x p) matrix.
A user defined function can also be used. It must be a function
with arguments beta, X and offset
(set to NULL if non needed). The function must return a
length 2 named list with mu and D elements as a
vector and a matrix of proper dimensions.
Value
A list with the following components:
- mu
A numeric vector of length
ncontaining the mean response values obtained by applying the inverse link function to the linear predictor.- D
A numeric matrix of dimension
n \times pcontaining the derivatives of\muwith respect to the regression parameters\beta.
Author(s)
Wagner Hugo Bonat, wbonat@ufpr.br
See Also
Examples
x1 <- seq(-1, 1, l = 5)
X <- model.matrix(~ x1)
mc_link_function(beta = c(1,0.5), X = X,
offset = NULL, link = 'log')
mc_link_function(beta = c(1,0.5), X = X,
offset = rep(10,5), link = 'identity')
Auxiliar function transforms list to a vector.
Description
This function takes a list of parameters and tranforms to a vector.
Usage
mc_list2vec(list_initial, list_power_fixed)
Arguments
list_initial |
A list specifying initial values. |
list_power_fixed |
A list of logical operators specyfing if the power parameter should be estimated or not. |
Details
It is an internal function, in general the users never will use this function. It will be useful, only if the user wants to implement a different variance-covariance matrix.
Value
A list with the following components:
- beta_ini
A numeric vector containing the initial values of the regression parameters stacked across all response variables.
- cov_ini
A numeric vector containing the initial values of the covariance-related parameters, including dispersion, power, and correlation parameters, after removing fixed power parameters.
Author(s)
Wagner Hugo Bonat
Moving Average Model Structure
Description
Builds components of the matrix linear predictor associated with moving average (MA) covariance structures. This function is mainly intended for longitudinal data analysis, but can also be used for time series data
Usage
mc_ma(id, time, data, order = 1)
Arguments
id |
name of the column (string) containing the subject index.
Note that this structure was designed to deal with longitudinal data.
For times series data use the same |
time |
name of the column (string) containing the index indicating the time. |
data |
data set. |
order |
An integer specifying the order of the moving average process. |
Details
This function was primarily designed for longitudinal data,
but it can also be used for time series analysis. In this case, the
id argument should contain a single identifier, representing
one observational unit. Internally, the function constructs block-diagonal
band matrices using bandSparse.
Value
A list with the following component:
- Z1
A sparse matrix of class
nsCMatrixrepresenting the moving average component of the matrix linear predictor. The matrix has dimension equal to the total number of observations and is constructed as a block-diagonal matrix, with one block per subject (or time series), each block encoding a moving average structure of the specified order.
Author(s)
Wagner Hugo Bonat, wbonat@ufpr.br
Source
Bonat, W. H. (2018). Multiple Response Variables Regression Models in R: The mcglm Package. Journal of Statistical Software, 84(4):1–30.
See Also
mc_id, mc_dist, mc_car,
mc_rw and mc_mixed.
Examples
id <- rep(1:2, each = 4)
time <- rep(1:4, 2)
data <- data.frame("id" = id, "time" = time)
mc_ma(id = "id", time = "time", data = data, order = 1)
mc_ma(id = "id", time = "time", data = data, order = 2)
MANOVA-Type Test for Multivariate Covariance GLMs
Description
Performs a MANOVA-type Wald test for multivariate covariance
generalized linear models fitted using mcglm.
The test is based on quadratic forms of the estimated regression
parameters and their covariance matrix, yielding statistics
analogous to the Hotelling–Lawley trace.
Usage
mc_manova(object, ...)
Arguments
object |
An object of class |
... |
Further arguments (currently not used). |
Value
A data frame containing the MANOVA-type test results with the following columns:
- Effects
Names of the tested model effects.
- Df
Degrees of freedom associated with each effect.
- Hotelling-Lawley
Hotelling–Lawley trace statistic.
- Chi-square
Chi-square test statistic.
- p-value
P-values from the chi-square approximation.
Author(s)
Wagner Hugo Bonat, wbonat@ufpr.br
See Also
MANOVA-Type Test for Dispersion Components of mcglm Models
Description
Performs a MANOVA-type Wald test for the dispersion parameters
of multivariate covariance generalized linear models fitted
using mcglm. The test is based on quadratic forms
of the estimated dispersion parameters and their covariance
matrix, yielding statistics analogous to the Hotelling–Lawley trace.
Usage
mc_manova_disp(object, idx, effect_names, ...)
Arguments
object |
An object of class |
idx |
An integer vector defining the grouping structure of dispersion parameters to be tested. |
effect_names |
A character vector with labels for the tested dispersion effects. |
... |
Further arguments (currently not used). |
Value
A data frame containing the MANOVA-type test results for the dispersion parameters with the following columns:
- Effects
Names of the tested dispersion effects.
- Df
Degrees of freedom associated with each effect.
- Hotelling-Lawley
Hotelling–Lawley trace statistic.
- Chi-square
Chi-square test statistic.
- p-value
P-values from the chi-square approximation.
Author(s)
Wagner Hugo Bonat, wbonat@ufpr.br
See Also
mcglm, mc_manova,
coef.mcglm, vcov.mcglm
Matrix Linear Predictor
Description
Computes the matrix linear predictor used in multivariate covariance generalized linear models. The matrix linear predictor is defined as a linear combination of known matrices weighted by dispersion parameters.
Usage
mc_matrix_linear_predictor(tau, Z)
Arguments
tau |
A numeric vector of dispersion parameters. |
Z |
A list of known matrices with compatible dimensions. |
Details
Given a list of known matrices (Z_1, \ldots, Z_D) and a vector
of dispersion parameters (\tau_1, \ldots, \tau_D), this function
computes their weighted sum. This object is typically used as a
component of the matrix linear predictor in covariance modeling.
Value
A matrix of class Matrix representing the
matrix linear predictor
U = \tau_1 Z_1 + \cdots + \tau_D Z_D.
The returned matrix has the same dimensions as the elements of Z.
The returned object is intended for internal use only.
Author(s)
Wagner Hugo Bonat, wbonat@ufpr.br
Source
Bonat, W. H. (2018). Multiple Response Variables Regression Models in R: The mcglm Package. Journal of Statistical Software, 84(4):1–30.
Bonat, W. H. and Jorgensen, B. (2016). Multivariate covariance generalized linear models. Journal of the Royal Statistical Society: Series C, 65:649–675.
See Also
mc_id, mc_dist, mc_ma, mc_rw,
mc_mixed, mc_car
Examples
Z0 <- Matrix::Diagonal(5, 1)
Z1 <- Matrix::Matrix(rep(1, 5) %*% t(rep(1, 5)))
Z <- list(Z0, Z1)
mc_matrix_linear_predictor(tau = c(1, 0.8), Z = Z)
Mixed Models Structure
Description
Constructs the components of the matrix linear predictor associated with mixed-effects covariance structures in multivariate covariance generalized linear models. The function builds symmetric matrices representing variance and covariance components as functions of known covariates, following a linear mixed model formulation.
The mc_mixed function is primarily intended for repeated measures
and longitudinal data, where observations are collected within a fixed
number of groups, subjects, or experimental units.
Usage
mc_mixed(formula, data)
Arguments
formula |
A model formula specifying the structure of the matrix
linear predictor for the dispersion component. The first term must
remove the intercept ( |
data |
A |
Details
The formula argument follows a syntax similar to that used for
linear mixed models. The grouping variable must be provided as the
second term in the formula and must be a factor; no internal
coercion is performed. Covariates specified after the slash
(/) may be continuous or categorical and define additional
variance and covariance components. When only the grouping variable
is specified (e.g., ~ 0 + SUBJECT), the resulting structure
corresponds to the compound symmetry covariance model.
By default, all pairwise interaction terms between components are included in the matrix linear predictor. Interaction terms may be excluded by removing the corresponding components from the resulting list.
Value
A list of symmetric sparse matrices of class "dsCMatrix", each
corresponding to a variance or covariance component of the matrix
linear predictor for the dispersion structure. The list includes
matrices associated with main effects and, by default, their pairwise
interaction terms as implied by the mixed-effects specification in the
formula. These matrices are used internally to construct the linear
predictor of the covariance model in mcglm. It is intended to be
supplied to the matrix_pred argument of mcglm.
Author(s)
Wagner Hugo Bonat, wbonat@ufpr.br
Source
Bonat, W. H. (2018). Multiple Response Variables Regression Models in R: The mcglm Package. Journal of Statistical Software, 84(4), 1–30.
Bonat, W. H., et al. (2016). Modelling the covariance structure in marginal multivariate count models: Hunting in Bioko Island. Journal of Agricultural, Biological, and Environmental Statistics, 22(4), 446–464.
See Also
mc_id, mc_conditional_test, mc_dist,
mc_ma, mc_rw, mc_car
Examples
SUBJECT <- gl(2, 6)
x1 <- rep(1:6, 2)
x2 <- rep(gl(2, 3), 2)
data <- data.frame(SUBJECT, x1, x2)
# Compound symmetry structure
mc_mixed(~ 0 + SUBJECT, data = data)
# Compound symmetry with random slope for x1
mc_mixed(~ 0 + SUBJECT/x1, data = data)
# Compound symmetry with random slopes for x1 and x2 and interactions
mc_mixed(~ 0 + SUBJECT/(x1 + x2), data = data)
Non-structured Covariance Model
Description
Constructs the components of the matrix linear predictor associated with a fully non-structured covariance model in multivariate covariance generalized linear models. This specification allows each pair of observations within a unit to have its own covariance parameter, resulting in a highly flexible but parameter-intensive model.
Due to the quadratic growth in the number of parameters, this structure is typically suitable only for datasets with a small number of repeated measurements per unit.
Usage
mc_ns(id, data, group = NULL, marca = NULL)
Arguments
id |
A character string giving the name of the column in
|
data |
A |
group |
An optional character string giving the name of a column in
|
marca |
An optional character string specifying the level of
|
Details
The function requires a balanced design, meaning that all units
identified by id must have the same number of observations.
An error is raised otherwise. When group and marca are
provided, covariance components are generated only for units not
belonging to the specified level marca; for those units, the
corresponding blocks are set to zero.
Value
A list of symmetric block-diagonal matrices, each representing one
covariance component of the non-structured matrix linear predictor.
The length of the list is equal to n(n - 1) / 2, where n is
the number of observations per unit. Each element of the list is a
sparse matrix of class "dgCMatrix" obtained by stacking unit-
specific covariance blocks along the diagonal. These matrices are used
internally to construct the dispersion linear predictor in
mcglm.
Author(s)
Wagner Hugo Bonat, wbonat@ufpr.br
Source
Bonat, W. H. (2018). Multiple Response Variables Regression Models in R: The mcglm Package. Journal of Statistical Software, 84(4), 1–30.
See Also
mc_id, mc_dglm, mc_dist, mc_ma,
mc_rw, mc_mixed
Pearson Estimating Function
Description
Computes the Pearson estimating function for the dispersion parameters
in multivariate covariance generalized linear models, together with its
associated sensitivity and variability matrices. This function is used
internally in the estimating function framework adopted by
mcglm.
Usage
mc_pearson(
y_vec,
mu_vec,
Cfeatures,
inv_J_beta = NULL,
D = NULL,
correct = FALSE,
compute_sensitivity = TRUE,
compute_variability = FALSE,
W
)
Arguments
y_vec |
Numeric vector of observed responses stacked across response variables. |
mu_vec |
Numeric vector of fitted means corresponding to
|
Cfeatures |
A list containing covariance-related components,
typically including the covariance matrix |
inv_J_beta |
Optional matrix giving the inverse of the sensitivity matrix associated with the regression parameters. Required only when bias correction is requested. |
D |
Optional matrix of derivatives of the mean vector with respect to the regression parameters. Required only when bias correction is requested. |
correct |
Logical indicating whether the bias-corrected Pearson
estimating function should be computed. Defaults to |
compute_sensitivity |
Logical indicating whether the sensitivity
matrix of the Pearson estimating function should be computed.
Defaults to |
compute_variability |
Logical indicating whether the variability
matrix of the Pearson estimating function should be computed.
Defaults to |
W |
Numeric vector or diagonal matrix of weights associated with the observations. |
Details
The Pearson estimating function is based on quadratic forms of the
residuals and the inverse covariance matrix. When
correct = TRUE, a bias-corrected version is computed using the
correction term described in Bonat and Jørgensen (2016). The sensitivity
and variability matrices correspond to Equations (6), (7) and (8) of
that reference.
This function is intended for internal use and is not designed to be called directly by end users.
Value
A list with the following components:
- Score
A numeric vector containing the values of the Pearson estimating function for the dispersion parameters.
- Sensitivity
A matrix giving the sensitivity (expected Jacobian) of the Pearson estimating function. Returned only if
compute_sensitivity = TRUE.- Variability
A matrix giving the variability of the Pearson estimating function. Returned only if
compute_variability = TRUE.- Extra
A list of intermediate quantities used in the computation, mainly products involving derivatives of the covariance matrix.
Author(s)
Wagner Hugo Bonat
Source
Bonat, W. H. and Jørgensen, B. (2016). Multivariate covariance generalized linear models. Journal of the Royal Statistical Society: Series C, 65, 649–675.
Quasi-Score Function
Description
Computes the quasi-score function for the regression parameters in
multivariate covariance generalized linear models, together with its
associated sensitivity and variability matrices. These quantities are
key components of the estimating function approach used to fit
mcglm models.
Usage
mc_quasi_score(D, inv_C, y_vec, mu_vec, W)
Arguments
D |
A numeric matrix corresponding to the derivative of the mean
vector with respect to the regression parameters. This matrix is
typically obtained from the output of
|
inv_C |
A numeric matrix giving the inverse of the covariance
matrix of the response vector, usually obtained from
|
y_vec |
A numeric vector containing the stacked observed responses. |
mu_vec |
A numeric vector containing the stacked fitted mean values. |
W |
A numeric matrix of weights, typically diagonal, accounting for missing observations or differential weighting of the responses. |
Details
Let y denote the response vector, \mu its mean, D the
derivative of \mu with respect to the regression parameters, and
C the covariance matrix of y. The quasi-score is defined as
U_\beta = D^\top C^{-1} W (y - \mu),
where W is a weight matrix. The sensitivity and variability matrices
are computed according to standard estimating function theory and are
used in the iterative fitting algorithm and for inference.
This function is internal and not intended to be called directly by end users.
Value
A list with the following components:
- Score
A numeric vector containing the quasi-score values for the regression parameters.
- Sensitivity
A numeric matrix giving the sensitivity matrix of the quasi-score function.
- Variability
A numeric matrix giving the variability matrix of the quasi-score function.
Author(s)
Wagner Hugo Bonat
Remove Missing Observations from Matrix Linear Predictor
Description
Removes rows and columns corresponding to missing observations from
each component of a matrix linear predictor. This function is typically
applied after completing the data structure (e.g., via
mc_complete_data) and before fitting the model, in
order to ensure compatibility between the response vector and the
covariance-related design matrices.
Usage
mc_remove_na(matrix_pred, cod)
Arguments
matrix_pred |
A list of square matrices representing the components of the matrix linear predictor. |
cod |
An integer vector giving the indices of rows and columns to be removed. These indices usually correspond to missing observations in the stacked response vector. |
Details
For each matrix Z_d in the matrix linear predictor, the function
returns the submatrix obtained by deleting the rows and columns
specified in cod. This operation preserves the symmetry and
relative structure of the covariance components while aligning them
with the reduced response vector.
This is an internal utility function and is not intended to be called directly by end users.
Value
A list of square matrices of the same length as matrix_pred,
where, for each matrix, the rows and columns indexed by cod
have been removed.
Author(s)
Wagner Hugo Bonat, wbonat@ufpr.br
See Also
mc_dglm, mc_ns, mc_ma, mc_rw
Robust Standard Errors for Regression Parameters
Description
Computes cluster-robust (sandwich-type) standard errors for the
regression parameters of an object of class mcglm, accounting
for within-cluster correlation.
Usage
mc_robust_std(object, id)
Arguments
object |
An object of class |
id |
An integer or factor vector identifying clusters or subjects. Its length and ordering must match the number and ordering of the observations used to fit the model. |
Details
The robust variance–covariance matrix is obtained using an empirical estimator based on clustered residuals and the sensitivity matrix of the estimating equations. The implementation assumes that the data are correctly ordered such that observations belonging to the same cluster are stored in contiguous rows.
Value
A list with two components:
- Std.Error
A numeric vector containing the robust standard errors of the regression parameter estimates.
- vcov
A numeric matrix giving the robust variance–covariance matrix of the regression parameter estimates.
The returned objects are computed under the assumption that the data are in the correct cluster order.
Author(s)
Wagner Hugo Bonat, wbonat@ufpr.br
Source
Nuamah, I. F., Qu, Y., and Aminu, S. B. (1996). A SAS macro for stepwise correlated binary regression. Computer Methods and Programs in Biomedicine, 49, 199–210.
See Also
mc_bias_correct_std
Random Walk Model Structure
Description
Constructs the components of the matrix linear predictor associated with random walk (RW) models for longitudinal or time series data. The user may specify the order of the random walk process.
Usage
mc_rw(id, time, data, order = 1, proper = FALSE)
Arguments
id |
A character string giving the name of the column in
|
time |
A character string giving the name of the column in
|
data |
A data frame containing the variables specified in
|
order |
A positive integer specifying the order of the random walk model. |
proper |
Logical indicating whether a proper random walk specification should be used. |
Details
This function builds sparse precision matrix components corresponding
to random walk structures of a given order. It is primarily intended
for longitudinal data indexed by a subject identifier and a time
variable. For pure time series data, the same id value should
be used for all observations. When proper = TRUE, the precision
structure is decomposed into diagonal and off-diagonal components.
Value
If proper = FALSE, a list with a single component:
- Z1
A sparse matrix of class
dgCMatrixrepresenting the random walk precision structure.
If proper = TRUE, a list with two components:
- Z1
A sparse diagonal matrix of class
dgCMatrix.- Z2
A sparse off-diagonal matrix of class
dgCMatrix.
The matrices are ordered consistently with the original data.
Author(s)
Wagner Hugo Bonat, wbonat@ufpr.br
Source
Bonat, W. H. (2018). Multiple Response Variables Regression Models in R: The mcglm Package. Journal of Statistical Software, 84(4), 1–30.
See Also
mc_id, mc_dist, mc_car,
mc_ma, mc_mixed, mc_compute_rho
Examples
id <- rep(1:2, each = 4)
time <- rep(1:4, 2)
data <- data.frame(id = id, time = time)
mc_rw(id = "id", time = "time", data = data, order = 1, proper = FALSE)
mc_rw(id = "id", time = "time", data = data, order = 1, proper = TRUE)
mc_rw(id = "id", time = "time", data = data, order = 2, proper = TRUE)
Matrix product in sandwich form
Description
The function mc_sandwich is just an auxiliar
function to compute product matrix in the sandwich form
bord1 * middle * bord2. An special case appears when
computing the derivative of the covariance matrix with respect to
the power parameter. Always the bord1 and bord2 should be
diagonal matrix. If it is not true, this product is too slow.
Usage
mc_sandwich(middle, bord1, bord2)
mc_sandwich_negative(middle, bord1, bord2)
mc_sandwich_power(middle, bord1, bord2)
mc_sandwich_cholesky(bord1, middle, bord2)
mc_multiply(bord1, bord2)
mc_multiply2(bord1, bord2)
Arguments
middle |
A matrix. |
bord1 |
A matrix. |
bord2 |
A matrix. |
Value
The matrix product bord1 * middle * bord2.
The returned object is intended for internal use only.
Author(s)
Wagner Hugo Bonat
Sensitivity matrix
Description
Compute the sensitivity matrix associated with the Pearson estimating function.
Usage
mc_sensitivity(product, W)
Arguments
product |
A list of matrix. |
W |
Matrix of weights. |
Details
This function implements the equation 7 of Bonat and Jorgensen (2016).
Value
A matrix. The sensitivity matrix associated with the Pearson estimating function. The returned object is intended for internal use only.
Author(s)
Wagner Hugo Bonat and Eduardo Elias Ribeiro Jr
Score Information Criterion for Regression Components
Description
Computes the Score Information Criterion (SIC) for regression
components of a fitted mcglm object. The SIC can be used for
selecting covariates in the linear predictor and supports stepwise
selection procedures.
Usage
mc_sic(object, scope, data, response, penalty = 2, weights)
Arguments
object |
An object of class |
scope |
A character vector with the names of covariates to be tested for inclusion in the linear predictor. |
data |
A data frame containing all variables involved in the model. |
response |
An integer indicating the response variable for which the SIC is computed. |
penalty |
A numeric penalty term applied to the SIC (default is 2). |
weights |
An optional numeric vector of weights used in model fitting. If not provided, unit weights are assumed. |
Details
The SIC is computed using the quasi-score function associated with the
regression parameters. For each candidate covariate in scope,
the method evaluates its contribution via a score-based test statistic
and applies a penalty for model complexity.
Value
A data frame with the following columns:
- SIC
Score Information Criterion value.
- Covariates
Name of the candidate covariate.
- df
Degrees of freedom associated with the test.
- df_total
Total number of regression parameters in the extended model.
- Tu
Score-based test statistic.
- Chisq
Reference chi-squared quantile with 95% confidence level.
References
Bonat, W. H. (2018). Multiple Response Variables Regression Models in R: The mcglm Package. Journal of Statistical Software, 84(4), 1–30.
Bonat, W. H., et al. (2016). Modelling the covariance structure in marginal multivariate count models: Hunting in Bioko Island. Journal of Agricultural, Biological and Environmental Statistics, 22(4), 446–464.
See Also
Examples
set.seed(123)
x1 <- runif(100, -1, 1)
x2 <- gl(2, 50)
beta <- c(5, 0, 3)
X <- model.matrix(~ x1 + x2)
y <- rnorm(100, mean = X %*% beta, sd = 1)
data <- data.frame(y, x1, x2)
Z0 <- mc_id(data)
fit0 <- mcglm(
linear_pred = c(y ~ 1),
matrix_pred = list(Z0),
data = data
)
mc_sic(fit0, scope = c("x1", "x2"), data = data, response = 1)
Score Information Criterion for Covariance Components
Description
Computes the Score Information Criterion (SIC) for covariance
components of a fitted mcglm object. The SIC-covariance is used
to select components of the matrix linear predictor and can be
employed in stepwise selection procedures.
Usage
mc_sic_covariance(object, scope, idx, data, penalty = 2, response, weights)
Arguments
object |
An object of class |
scope |
A list of matrices to be tested for inclusion in the matrix linear predictor. |
idx |
An integer vector indicating which matrices in |
data |
A data frame containing all variables involved in the model. |
penalty |
A numeric penalty term applied to the SIC (default is 2). |
response |
An integer indicating the response variable for which the SIC-covariance is computed. |
weights |
An optional numeric vector of weights used in model fitting. If not provided, unit weights are assumed. |
Details
The SIC-covariance is computed using the Pearson estimating function.
For each group of matrices defined by idx, a score-based test
statistic is calculated to assess the contribution of the associated
covariance components, penalized by model complexity.
Value
A data frame with the following columns:
- SIC
Score Information Criterion value.
- df
Degrees of freedom associated with the test.
- df_total
Total number of covariance parameters in the extended model.
- Tu
Score-based test statistic.
- Chisq
Reference chi-squared quantile with 95% confidence level.
References
Bonat, W. H., et al. (2016). Modelling the covariance structure in marginal multivariate count models: Hunting in Bioko Island. Journal of Agricultural, Biological and Environmental Statistics, 22(4), 446–464.
Bonat, W. H. (2018). Multiple Response Variables Regression Models in R: The mcglm Package. Journal of Statistical Software, 84(4), 1–30.
See Also
Examples
set.seed(123)
SUBJECT <- gl(10, 10)
y <- rnorm(100)
data <- data.frame(y, SUBJECT)
Z0 <- mc_id(data)
Z1 <- mc_mixed(~ 0 + SUBJECT, data = data)
fit0 <- mcglm(
linear_pred = c(y ~ 1),
matrix_pred = list(Z0),
data = data
)
mc_sic_covariance(
fit0,
scope = Z1,
idx = 1,
data = data,
response = 1
)
Auxiliary Function for Block-Diagonal Matrix Construction
Description
Constructs block-diagonal matrices used in the computation of
derivatives of the covariance matrix C. Each input matrix is
placed as the non-zero block corresponding to a given response
variable, while all remaining blocks are set to zero.
Usage
mc_transform_list_bdiag(list_mat, mat_zero, response_number)
Arguments
list_mat |
A list of matrices to be inserted as non-zero blocks. |
mat_zero |
A list of zero matrices defining the block-diagonal structure. This
object is typically obtained from |
response_number |
An integer indicating the response variable position where each
matrix in |
Details
For each matrix in list_mat, a block-diagonal matrix is created
by replacing the block corresponding to response_number in
mat_zero with the given matrix. The output is a list of
block-diagonal matrices with identical structure.
Value
A list of block-diagonal matrices, each corresponding to one element
of list_mat. The returned object is intended for internal use only.
Twin Model Covariance Structures
Description
Constructs the components of the matrix linear predictor for twin data analysis under ACDE-type models. The function generates covariance structures suitable for monozygotic (MZ) and dizygotic (DZ) twins and supports several biologically motivated and flexible model parameterizations.
Usage
mc_twin(id, twin.id, type, replicate = NULL, structure, data)
mc_twin_bio(id, twin.id, type, replicate = NULL, structure, data)
mc_twin_full(id, twin.id, type, replicate, formula, data)
Arguments
id |
A string indicating the name of the column in |
twin.id |
A string indicating the name of the column in |
type |
A string indicating the name of the column in |
replicate |
An optional string indicating the name of the column in |
structure |
A string specifying the covariance structure to be constructed.
Available options are |
data |
A data frame containing all variables referenced by the model. |
formula |
Internal argument used to define flexible and unstructured covariance models. Not intended for direct user specification. |
Details
For biologically motivated structures ("ACE", "ADE",
"AE", "CE", "E"), the function builds covariance
matrices based on classical twin modeling assumptions. For flexible
and unstructured options ("full", "flex", "uns"),
the covariance structure is constructed using matrix linear predictors.
Value
A list of sparse matrices of class dgCMatrix, representing the
components of the matrix linear predictor to be used in the
matrix_pred argument of mcglm.
Author(s)
Wagner Hugo Bonat, wbonat@ufpr.br
Source
Bonat, W. H. (2018). Multiple Response Variables Regression Models in R: The mcglm Package. Journal of Statistical Software, 84(4), 1–30.
See Also
mc_id, mc_dist, mc_car,
mc_rw, mc_ns, mc_dglm,
mc_mixed.
Update Regression Parameters
Description
Updates the regression parameter vectors stored in a list used during the model fitting algorithm. This function is intended for internal use and is called repeatedly to redistribute the regression parameters across response-specific components.
Usage
mc_updateBeta(list_initial, betas, information, n_resp)
Arguments
list_initial |
A list containing the current values of the model parameters. The
element |
betas |
A numeric vector containing the current values of the regression parameters for all response variables. |
information |
A list containing model dimension information, including the number
of regression parameters per response variable. Typically the output
of |
n_resp |
An integer specifying the number of response variables in the model. |
Value
A list with the same structure as list_initial, where the
element regression contains the updated regression parameter
vectors for each response variable. The returned object is intended for internal use only.
Author(s)
Wagner Hugo Bonat
Update Covariance Parameters
Description
Updates the covariance and power parameter vectors stored in a list used during the model fitting algorithm. This function is intended for internal use and distributes covariance parameters across response-specific components and the correlation parameter.
Usage
mc_updateCov(list_initial, covariance, list_power_fixed, information, n_resp)
Arguments
list_initial |
A list containing the current values of the model parameters. The elements
|
covariance |
A numeric vector containing the current values of the covariance parameters for all response variables and, if applicable, the correlation parameter. |
list_power_fixed |
A list of logical values indicating for each response whether the power
parameter should remain fixed ( |
information |
A list containing model dimension information, including the number of
|
n_resp |
An integer specifying the number of response variables in the model. |
Value
A list with the same structure as list_initial, where the elements
tau, power, and rho contain the updated covariance
parameter values for each response variable and the correlation parameter.
The returned object is intended for internal use only.
Author(s)
Wagner Hugo Bonat
Variability Matrix
Description
Computes the variability matrix associated with the Pearson estimating function. This function is intended for internal use in the fitting algorithm and implements Equation 8 from Bonat and Jorgensen (2016).
Usage
mc_variability(sensitivity, product, inv_C, C, res, W)
Arguments
sensitivity |
A numeric matrix representing the sensitivity matrix.
Typically obtained from |
product |
A list of numeric matrices, usually used to compute contributions to the variability matrix. |
inv_C |
A numeric matrix representing the inverse of the covariance
matrix, usually obtained from |
C |
A numeric matrix representing the covariance matrix, usually
obtained from |
res |
A numeric vector of residuals, defined as |
W |
A numeric matrix of weights. |
Details
This function implements Equation 8 from Bonat and Jorgensen (2016), which defines the variability matrix used in generalized estimating equations for multiple response variables.
Value
A symmetric numeric matrix representing the variability matrix associated with the Pearson estimating function. The returned object is intended for internal use only.
Author(s)
Wagner Hugo Bonat
Variance Functions for Generalized Linear Models
Description
Computes the variance function and its derivatives with respect to regression, dispersion, and power parameters. This function supports standard power variance functions as well as binomial responses. Intended primarily for internal use in model fitting.
Usage
mc_variance_function(
mu,
power,
Ntrial,
variance,
inverse,
derivative_power,
derivative_mu
)
mc_power(mu, power, inverse, derivative_power, derivative_mu)
mc_binomialP(mu, power, inverse, Ntrial,
derivative_power, derivative_mu)
mc_binomialPQ(mu, power, inverse, Ntrial,
derivative_power, derivative_mu)
Arguments
mu |
Numeric vector of expected values. Typically obtained from
|
power |
Numeric value (for |
Ntrial |
Positive integer or numeric. Number of trials for binomial response variables. |
variance |
Character string specifying the variance function type:
|
inverse |
Logical. If |
derivative_power |
Logical. If |
derivative_mu |
Logical. If |
Details
The function computes the variance function and its derivatives used in
the estimation of generalized linear models for multiple response variables.
For binomial responses, it accounts for the number of trials and supports
both single (binomialP) and double (binomialPQ) power specifications.
Value
A named list containing one or more of the following elements, depending on the combination of logical arguments:
- V_sqrt
Square root of the variance function.
- V_inv_sqrt
Inverse square root of the variance function.
- D_V_sqrt_power
Derivative of V_sqrt with respect to the power parameter.
- D_V_inv_sqrt_power
Derivative of V_inv_sqrt with respect to the power parameter.
- D_V_sqrt_mu
Derivative of V_sqrt with respect to
mu.- D_V_inv_sqrt_mu
Derivative of V_inv_sqrt with respect to
mu.
Author(s)
Wagner Hugo Bonat
Source
Bonat, W. H. and Jorgensen, B. (2016) Multivariate covariance generalized linear models. Journal of the Royal Statistical Society: Series C (Applied Statistics), 65:649–675.
See Also
Examples
x1 <- seq(-1, 1, length.out = 5)
X <- model.matrix(~x1)
mu <- mc_link_function(beta = c(1, 0.5), X = X, offset = NULL, link = "logit")
mc_variance_function(mu = mu$mu, power = c(2, 1), Ntrial = 1,
variance = "binomialPQ", inverse = FALSE,
derivative_power = TRUE, derivative_mu = TRUE)
Fitting Multivariate Covariance Generalized Linear Models
Description
Fits multivariate covariance generalized linear models. Models are specified through lists defining the linear predictors and matrix linear predictors. The user can choose among different link, variance, and covariance functions. Model fitting is based on an estimating function approach, combining quasi-score functions for regression parameters and Pearson estimating functions for covariance parameters.
Usage
mcglm(linear_pred, matrix_pred, link, variance, covariance,
offset, Ntrial, power_fixed, data, control_initial,
contrasts, weights, control_algorithm)
Arguments
linear_pred |
A list of model formulas, one for each response.
See |
matrix_pred |
A list of known matrices defining the matrix linear
predictor for the covariance structure. See
|
link |
A list of link function names, one for each response.
Possible values are |
variance |
A list of variance function names.
Possible values are |
covariance |
A list of covariance link function names.
Possible values are |
offset |
A list of numeric vectors specifying offsets for each
response. Use |
Ntrial |
A list of numeric vectors specifying the number of trials
for binomial responses. Only used for |
power_fixed |
A list of logical values indicating whether the power
parameter should be fixed ( |
data |
a data frame. |
control_initial |
A list of initial values for the fitting algorithm.
If set to |
contrasts |
An optional list of contrasts passed to
|
weights |
A list of numeric vectors of observation weights.
Each element must have length equal to the number of observations.
Missing observations should be coded as |
control_algorithm |
A list of control parameters passed to the
fitting algorithm. See |
Value
An object of class "mcglm" representing a fitted multivariate
covariance generalized linear model.
The returned object is a list produced by the fitting routine
fit_mcglm, augmented with additional components used by
post–estimation methods. The main components include:
- beta_names
A list of character vectors giving the names of the regression coefficients for each response variable.
- power_fixed
A list of logical values indicating whether the power parameters were fixed or estimated.
- list_initial
A list of initial values used in the fitting algorithm.
- n_obs
An integer giving the number of observations.
- link
A list of link functions used in the model.
- variance
A list of variance functions used in the model.
- covariance
A list of covariance link functions used in the model.
- linear_pred
A list of formulas defining the linear predictors.
- matrix_pred
A list of matrices defining the matrix linear predictors.
- list_X
A list of design matrices corresponding to the linear predictors.
- observed
A matrix of observed response values, with rows corresponding to observations and columns to response variables.
- Ntrial
A list containing the number of trials for each response variable, when applicable.
- offset
A list of offset vectors used in the model.
- sparse
A list of logical values indicating whether each matrix linear predictor is treated as sparse.
- weights
A numeric vector of weights used in the fitting process.
- data
The data frame used to fit the model.
- con
A list of control parameters used by the fitting algorithm.
Additional components may be present for internal use by methods such as
print, summary and predict.
Author(s)
Wagner Hugo Bonat, wbonat@ufpr.br
Source
Bonat, W. H. and Jorgensen, B. (2016) Multivariate covariance generalized linear models. Journal of Royal Statistical Society - Series C 65:649–675.
Bonat, W. H. (2018). Multiple Response Variables Regression Models in R: The mcglm Package. Journal of Statistical Software, 84(4):1–30.
See Also
fit_mcglm, mc_link_function and
mc_variance_function.
Pseudo Akaike Information Criterion
Description
Computes the pseudo Akaike information criterion (pAIC) for fitted multivariate covariance generalized linear models. The pAIC is defined as
pAIC = -2 \, \ell_p + 2 \, \text{df},
where \ell_p is the pseudo log-likelihood and \text{df}
denotes the effective number of parameters in the model.
This criterion is intended for model comparison within the class of
mcglm models fitted to the same data.
Usage
pAIC(object, verbose = TRUE)
Arguments
object |
An object of class |
verbose |
Logical indicating whether the pAIC value should be
printed to the console. Defaults to |
Details
The pAIC is based on the pseudo log-likelihood returned by
plogLik and should be used with caution, as it does not
correspond to a true likelihood-based information criterion.
Comparisons are meaningful only for models fitted to the same response
data.
Value
An (invisible) named list with a single element:
- pAIC
A numeric value giving the pseudo Akaike information criterion associated with the fitted model(s).
Author(s)
Wagner Hugo Bonat, wbonat@ufpr.br
Source
Bonat, W. H. (2018). Multiple Response Variables Regression Models in R: The mcglm Package. Journal of Statistical Software, 84(4), 1–30.
See Also
gof, plogLik, ESS, pKLIC,
GOSHO, RJC
Pseudo Bayesian Information Criterion
Description
Computes the pseudo Bayesian information criterion (pBIC) for fitted multivariate covariance generalized linear models. The pBIC is defined as
pBIC = -2 \, \ell_p + \text{df} \log(n),
where \ell_p is the pseudo log-likelihood, \text{df} is the
effective number of parameters, and n is the total number of
observed responses.
This criterion provides a more strongly penalized alternative to
pAIC, favoring more parsimonious models when comparing
mcglm fits to the same data.
Usage
pBIC(object, verbose = TRUE)
Arguments
object |
An object of class |
verbose |
Logical indicating whether the pBIC value should be
printed to the console. Defaults to |
Details
The sample size n used in the penalty term corresponds to the
total number of observed responses, obtained from the
observed component of the fitted mcglm object(s).
As the pBIC is based on a pseudo log-likelihood, it should be used
cautiously and only for relative comparisons among models fitted to
the same data set.
Value
An (invisible) named list with a single element:
- pBIC
A numeric value giving the pseudo Bayesian information criterion associated with the fitted model(s).
Author(s)
Wagner Hugo Bonat, wbonat@ufpr.br
Source
Bonat, W. H. (2018). Multiple Response Variables Regression Models in R: The mcglm Package. Journal of Statistical Software, 84(4), 1–30.
See Also
gof, plogLik, ESS, pAIC,
pKLIC, GOSHO, RJC
Pseudo Kullback-Leibler Information Criterion
Description
Extract the pseudo Kullback-Leibler information criterion
(pKLIC) for objects of mcglm class.
Usage
pKLIC(object, verbose = TRUE)
Arguments
object |
an object or a list of objects representing a model
of |
verbose |
logical. Print or not the pKLIC value. |
Value
An invisible list with a single numeric component:
- pKLIC
The pseudo Kullback–Leibler Information Criterion computed from the pseudo log-likelihood and a penalty term based on the sensitivity and variability matrices.
Author(s)
Wagner Hugo Bonat, wbonat@ufpr.br
Source
Bonat, W. H. (2018). Multiple Response Variables Regression Models in R: The mcglm Package. Journal of Statistical Software, 84(4):1–30.
See Also
gof, plogLik, ESS, pAIC,
GOSHO and RJC.
Gaussian Pseudo-Loglikelihood
Description
Computes the Gaussian pseudo-loglikelihood for fitted multivariate
covariance generalized linear models. The pseudo-loglikelihood is
obtained by assuming a multivariate normal distribution for the
stacked response vector, using the estimated mean vector and
covariance matrix from the fitted mcglm object.
Usage
plogLik(object, verbose = TRUE)
Arguments
object |
An object of class |
verbose |
Logical indicating whether the pseudo-loglikelihood value
should be printed to the console. Defaults to |
Details
The Gaussian pseudo-loglikelihood is computed as
\ell_p = -\frac{n}{2}\log(2\pi)
- \frac{1}{2}\log|\Sigma|
- \frac{1}{2}(y - \mu)^\top \Sigma^{-1} (y - \mu),
where y is the stacked vector of observed responses, \mu is
the stacked vector of fitted means, and \Sigma is the estimated
covariance matrix. For a list of mcglm objects, block-diagonal
covariance matrices are constructed internally.
This quantity is used mainly for model comparison purposes and as a
building block for pseudo-information criteria such as pAIC
and pBIC. It is not a true likelihood unless the Gaussian
assumption holds.
Value
An invisible list with the following components:
- plogLik
A numeric value giving the Gaussian pseudo-loglikelihood.
- df
An integer giving the total number of estimated parameters (degrees of freedom) used in the model.
Author(s)
Wagner Hugo Bonat, wbonat@ufpr.br
See Also
pAIC, pBIC, ESS, pKLIC, GOSHO,
RJC
Diagnostic Plots for mcglm Objects
Description
Produces diagnostic plots for fitted mcglm objects and returns
the data used to generate the plots. Available diagnostics include:
-
"residuals": plots of fitted values vs. Pearson residuals and Q-Q plots of residuals; -
"algorithm": plots of regression and covariance parameters and quasi-score iterations to inspect algorithm convergence; -
"partial_residuals": partial residual plots for each covariate in the model.
Usage
## S3 method for class 'mcglm'
plot(
x,
type = c("residuals", "algorithm", "partial_residuals"),
plot_graphics = TRUE,
...
)
Arguments
x |
A fitted object of class |
type |
Character string specifying the type of diagnostic plot to produce.
One of |
plot_graphics |
Logical; if |
... |
Additional arguments; currently ignored. Included for compatibility
with the generic |
Value
A list containing the data used to generate the plots:
For
"residuals":fittedvalues andresiduals.For
"algorithm":IterationRegression,IterationCovariance,ScoreRegression,ScoreCovariance.For
"partial_residuals":residuals,list_beta, andpartial_residualscontaining x and y values for each covariate.
Author(s)
Wagner Hugo Bonat, wbonat@ufpr.br
See Also
Examples
library(mcglm)
set.seed(123)
mydata <- data.frame(y = rnorm(10), x1 = rnorm(10),
x2 = rbinom(10, size = 1, prob = 0.5))
Z0 <- mc_id(mydata)
fit <- mcglm(c(y ~ x1 + x2), matrix_pred = list(Z0), data = mydata)
# Produce plots and get data
diag_data <- plot(fit, type = "residuals")
# Only get data without plotting
diag_data <- plot(fit, type = "partial_residuals", plot_graphics = FALSE)
Print Method for mcglm Objects
Description
Prints a concise summary of a fitted mcglm object, including
the model call, link and variance functions, regression coefficients
and dispersion parameters for each response variable.
Usage
## S3 method for class 'mcglm'
print(x, ...)
Arguments
x |
A fitted object of class |
... |
Further arguments passed to or from other methods. |
Value
No return value, called for its side effects.
Author(s)
Wagner Hugo Bonat, wbonat@ufpr.br
See Also
Residuals for mcglm Objects
Description
Computes residuals for a fitted mcglm object. Different types
of residuals can be extracted, depending on the specified argument
type.
Usage
## S3 method for class 'mcglm'
residuals(object, type = c("raw", "pearson", "standardized"), ...)
Arguments
object |
An object of class |
type |
A character string specifying the type of residuals to be returned. Options are:
|
... |
Further arguments passed to or from other methods. Currently ignored. |
Value
A numeric matrix of class Matrix with dimensions
n \times r, where n is the number of observations and
r is the number of response variables.
Author(s)
Wagner Hugo Bonat, wbonat@ufpr.br
See Also
Soil Chemistry Properties Dataset
Description
Soil chemistry properties measured on a regular grid of 10 × 25 points, spaced by 5 meters. Each record contains the chemical composition and coordinates of the soil sample.
Usage
data(soil)
Format
A data.frame with 250 observations and 9 variables:
COORD.XX coordinate of the sampling point.
COORD.YY coordinate of the sampling point.
SANDProportion of sand in the soil sample.
SILTProportion of silt in the soil sample.
CLAYProportion of clay in the soil sample.
PHWATERSoil pH measured in water.
CACalcium content of the soil.
MGMagnesium content of the soil.
KPotassium content of the soil.
Source
Bonat, W. H. (2018). "Multiple Response Variables Regression Models in R: The mcglm Package." Journal of Statistical Software, 84(4):1–30.
Examples
library(mcglm)
data(soil, package = "mcglm")
# Spatial model (tri2nb could be used but takes long)
Z1 <- mc_id(soil)
# Linear predictor example
form.ca <- CA ~ COORD.X*COORD.Y + SAND + SILT + CLAY + PHWATER
fit.ca <- mcglm(
linear_pred = c(form.ca),
matrix_pred = list(Z1),
link = "log",
variance = "tweedie",
covariance = "inverse",
power_fixed = TRUE,
data = soil,
control_algorithm = list(
max_iter = 1000,
tuning = 0.1,
verbose = FALSE,
tol = 1e-03
)
)
Soybeans Experiment Data
Description
Dataset from an experiment conducted in a vegetation house with soybeans.
Each plot contained two plants and the experiment involved three levels of
soil water (water) and five levels of potassium fertilization (pot),
arranged in five blocks (block). Three response variables are recorded:
grain yield, number of seeds, and number of viable peas per plant. The dataset
contains 75 observations and 7 variables.
Usage
data(soya)
Format
A data.frame with 75 observations and 7 variables:
potFactor with five levels of potassium fertilization.
waterFactor with three levels of amount of water in the soil.
blockFactor with five levels representing experimental blocks.
grainContinuous variable representing grain yield per plant.
seedsCount variable representing number of seeds per plant.
viablepeasBinomial variable representing number of viable peas per plant.
totalpeasBinomial variable representing total number of peas per plant.
Source
Bonat, W. H. (2018). Multiple Response Variables Regression Models in R: The mcglm Package. Journal of Statistical Software, 84(4):1–30.
Examples
library(mcglm)
library(Matrix)
data(soya, package = "mcglm")
# Linear predictor example
formu <- grain ~ block + factor(water) * factor(pot)
Z0 <- mc_id(soya)
fit <- mcglm(linear_pred = c(formu), matrix_pred = list(Z0),
data = soya)
anova(fit)
Summary for mcglm Objects
Description
Produces a comprehensive summary of a fitted mcglm object.
The summary includes estimates, standard errors, Wald Z statistics,
and p-values for regression, dispersion, power, and correlation parameters.
The function can either print the summary to the console or return it invisibly.
Usage
## S3 method for class 'mcglm'
summary(
object,
verbose = TRUE,
print = c("Regression", "power", "Dispersion", "Correlation"),
...
)
Arguments
object |
A fitted object of class |
verbose |
Logical; if |
print |
Character vector specifying which components of the summary to print.
Possible values are |
... |
Further arguments passed to or from other methods. Currently ignored. |
Details
The summary also prints information about the model fitting, including the link, variance, and covariance functions used, the algorithm method, any correction applied, and the number of iterations completed.
Value
Invisibly returns a list containing summary tables:
For each response:
Regression,Power,Dispersion.If applicable:
Correlationsummary.
Each table contains parameter estimates, standard errors, Wald Z values, and two-sided p-values.
Author(s)
Wagner Hugo Bonat, wbonat@ufpr.br
See Also
print.mcglm, coef.mcglm, residuals.mcglm
Examples
library(mcglm)
set.seed(123)
mydata <- data.frame(y = rnorm(10), x1 = rnorm(10),
x2 = rbinom(10, size = 1, prob = 0.5))
Z0 <- mc_id(mydata)
fit <- mcglm(c(y ~ x1 + x2), matrix_pred = list(Z0), data = mydata)
# Print full summary
summary(fit)
# Get summary invisibly
out <- summary(fit, verbose = FALSE)
Variance-Covariance Matrix for mcglm Objects
Description
Extracts the variance-covariance matrix of the estimated parameters
from a fitted mcglm object.
Usage
## S3 method for class 'mcglm'
vcov(object, ...)
Arguments
object |
An object of class |
... |
Further arguments passed to or from other methods. Currently ignored. |
Value
A numeric matrix representing the variance-covariance matrix of all estimated model parameters. Row and column names correspond to the parameter identifiers.
Author(s)
Wagner Hugo Bonat, wbonat@ufpr.br