% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/calibrate.R
\name{calibrate}
\alias{calibrate}
\title{Calibrate variant effect scores to ACMG/AMP evidence strength}
\usage{
calibrate(df, value = NULL, prior = 0.1, group = NULL, seed = 42)
}
\arguments{
\item{df}{A dataframe. Must contain a \code{class} column with the labels 'P'
(pathogenic) and 'B' (benign), and a numeric column containing the variant
effect scores. At least 10 occurrences of each class are required.}

\item{value}{(optional) A character string indicating the name of the numeric
column in \code{df} with the scores. If not provided, calibration will be
run on all numeric columns.}

\item{prior}{A scalar in the range 0-1 representing the prior probability
of pathogenicity. Default 0.1.}

\item{group}{(optional) A character string indicating the name of the column
with the grouping variable. Default NULL.}

\item{seed}{(optional) A single integer for the random seed. Note that this
argument is only provided for testing/experimental purposes. Users should not
change the default seed if results are to be used or reported.}
}
\value{
A named list of dataframes. When no grouping variable is provided, the list
has two elements. The first, 'likelihood_ratios', is the input dataframe with
added columns for the LR and its confidence bounds (\code{column_name_lr},
\code{column_name_lr_lower} and \code{column_name_lr_upper}), and the assigned
evidence classifications in the \code{evidence} column. The second,
'score_thresholds', contains the lower and upper bounds of the score interval
for each ACMG/AMP evidence level. When a grouping variable is provided, the
returned object is a nested list with one element per unique group level in
the input data. Each of these elements contains the 'likelihood_ratios' and
'score_thresholds' dataframes.
}
\description{
The function calculates the positive likelihood ratio (LR, equivalent to the
odds of pathogenicity) based on functional scores, e.g., from MAVEs or
computational predictors, and their truthset labels. Score intervals for
ACMG/AMP evidence levels are also computed. The input data requires at least
one numeric column with the score of interest and another column, named
\code{class}, with at least 10 pathogenic ('P') and 10 benign ('B') labels.
Other or missing labels are allowed, but are renamed to 'U'.
}
\details{
The function estimates the LR for each input score by resampling Gaussian
kernel density estimates of the pathogenic and benign score distributions.
Densities are mapped using linear interpolation and evaluated on a fixed-size
common grid. To stabilise the LRs in regions where densities approach zero, a
variance-based penalty is computed from log-LRs across 1,000 bootstrap
replicates. This penalty is used to regularise the log-LR matrix. The log-LRs
are monotonised in the principal direction of association with the input
scores. Final estimates for each score include the point estimate and its 95\%
confidence interval. Score intervals for the different ACMG/AMP evidence
levels are interpolated from the grid based upon the confidence bounds.
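
As an illustration of how \code{prior} relates to an LR, and not as a
statement of the package internals, Bayes' rule on the odds scale can be
used. Writing \eqn{\pi}{prior} for the prior probability of pathogenicity,
the posterior probability implied by a given LR is
\deqn{P_{post} = \frac{LR \cdot \pi}{LR \cdot \pi + (1 - \pi)}}{P_post = (LR * prior) / (LR * prior + (1 - prior))}
For example, with the default prior of 0.1, an LR of 1 leaves the probability
at 0.1, whereas an LR of 9 gives 0.9 / (0.9 + 0.9) = 0.5.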
}
\examples{
# load example data provided with the package
library(acmgscaler)
data(variant_data, package = 'acmgscaler')

# small-scale toy calibration
toy_df <- rbind(
  head(subset(variant_data, class == 'P'), 10),
  head(subset(variant_data, class == 'B'), 10)
)
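
# sketch of a sanity check (not part of the package API): calibrate() expects
# at least 10 'P' and 10 'B' labels, so tabulate the classes before calling it
table(toy_df$class)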

calibrate(
  df = toy_df,
  value = 'score',
  prior = 0.1
)
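
# sketch of inspecting the result, assuming the element names given in the
# Value section; 'res' is a name introduced here for illustration only
res <- calibrate(df = toy_df, value = 'score', prior = 0.1)
names(res)                   # expected to include both element names below
head(res$likelihood_ratios)  # input rows with LR columns and 'evidence'
res$score_thresholds         # score intervals per ACMG/AMP evidence level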

# full calibration grouped by gene
\donttest{
calibrate(
  df = variant_data,
  value = 'score',
  group = 'gene',
  prior = 0.1
)
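
# sketch of navigating the grouped result, which the Value section describes
# as a nested list with one element per group level; 'res_by_gene' is a name
# introduced here for illustration only
res_by_gene <- calibrate(
  df = variant_data,
  value = 'score',
  group = 'gene',
  prior = 0.1
)
length(res_by_gene)                 # number of group levels
res_by_gene[[1]]$score_thresholds   # thresholds for the first group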
}

}
\references{
Badonyi & Marsh, 2025. acmgscaler: An R package and Colab for
standardised gene-level variant effect score calibration within
the ACMG/AMP framework.
\emph{Bioinformatics}.
\doi{10.1093/bioinformatics/btaf503}

Richards et al., 2015. Standards and guidelines for the
interpretation of sequence variants: a joint consensus recommendation of
the American College of Medical Genetics and Genomics and the Association
for Molecular Pathology.
\emph{Genetics in Medicine}.
\doi{10.1038/gim.2015.30}

Tavtigian et al., 2018. Modeling the ACMG/AMP variant
classification guidelines as a Bayesian classification framework.
\emph{Genetics in Medicine}.
\doi{10.1038/gim.2017.210}

Brnich et al., 2019. Recommendations for application of the
functional evidence PS3/BS3 criterion using the ACMG/AMP sequence variant
interpretation framework.
\emph{Genome Medicine}.
\doi{10.1186/s13073-019-0690-2}

Pejaver et al., 2022. Calibration of computational tools for
missense variant pathogenicity classification and ClinGen recommendations
for PP3/BP4 criteria.
\emph{The American Journal of Human Genetics}.
\doi{10.1016/j.ajhg.2022.10.013}

van Loggerenberg et al., 2023. Systematically testing human HMBS
missense variants to reveal mechanism and pathogenic variation.
\emph{The American Journal of Human Genetics}.
\doi{10.1016/j.ajhg.2023.08.012}
}
