Helper Functions for Bayesian Kernel Machine Regression

Simulating Multivariate Environmental Exposure Data and Estimating Feature Selection Thresholds

Kazi Tanvir Hasan, Dr. Gabriel Odom, Cristian Guerini, Dr. Zoran Bursac, Dr. Roberto Lucchini, and Dr. Boubakari Ibrahimou

5/14/25

Abstract

In environmental health studies, the relationship between multiple environmental exposures and health outcomes, such as cognitive development, is often investigated using complex datasets that exhibit non-normality. This paper introduces an R package designed to simulate multivariate environmental exposure data and estimate feature selection thresholds to aid in research to apply and extend the Bayesian Kernel Machine Regression (BKMR) methodology. The package enables researchers to generate data from multivariate normal and multivariate skewed Gamma distributions, common in environmental exposure research, and calculate multivariate statistical features such as mean, variance, skewness, shape, and rate vectors, and correlation matrices. This package also facilitates the calculation of a Posterior Inclusion Probability (PIP) threshold for feature selection in BKMR, offering an approach that accounts for non-normal data (based on the forthcoming work of Hasan et al. (2025)). The effectiveness of the package is demonstrated through a real-world application using data from an adolescent environmental exposure study, where metal exposures are related to IQ scores.

Keywords

R package, Bayesian Kernel Machine Regression, multivariate simulation, feature selection, environmental exposure, statistical moments, multivariate Gamma distribution, skewed data

Introduction

Environmental exposure to metals such as Cadmium, Mercury, Arsenic, and Lead is linked to various adverse health outcomes, especially among children (Horton et al. 2013). In particular, these exposures may influence cognitive functions, with effects that often vary based on the concentration of metals in the environment (Sanders, Claus Henn, and Wright 2015; Trasande and Liu 2011; Tyler and Allan 2014; Tolins, Ruchirawat, and Landrigan 2014). Research into these relationships frequently involves complex, multivariate datasets where environmental factors are correlated, and data distributions often exhibit skewness rather than normality (Bobb et al. 2014, 2018; Hasan et al. 2024).

The Bayesian Kernel Machine Regression (BKMR) model is a powerful method for identifying relevant environmental factors and interactions (Bobb et al. 2014, 2018; Wilson et al. 2022; Devick et al. 2022; Gerwinn 2010; Ibrahimou et al. 2024). However, its effectiveness is typically constrained by its assumption of normality in the data, which is frequently violated in environmental health studies (Bobb et al. 2014, 2018; Hasan et al. 2024; Meulen EA 2021). To allow BKMR to flexibly handle multivariate skewed data, ongoing methodological innovation is needed, and synthetic data sets are a critical component of this research. To fill this need for realistic synthetic data sets, we have developed an R package that facilitates parametric bootstrap data simulation for multivariate environmental exposures. Further, this package includes computational results to estimate a Posterior Inclusion Probability (PIP) threshold which enables feature selection for BKMR via a hypothesis testing framework with controlled test size, which is particularly useful when data deviate from multivariate normality (Hasan et al. 2024). In this chapter, we show how to estimate key statistical moments from a real data set which includes effects of metals exposures on children’s IQ scores. Then, to facilitate future methodological innovation, we show how to use these moments as parameter estimates to simulate multiple synthetic environmental exposure data sets from multivariate normal and multivariate (skewed) Gamma distributions. Finally, for the real data analysis application examining exposure to mixture metals and IQ in a cohort of Italian children, we show how to apply our threshold equation and to use BKMR results for feature selection.

Materials and Methods

Functionality in this R Package

The package aims to provide a robust framework for simulating non-normal environmental exposure data and estimating feature selection thresholds in the context of BKMR. The function names were carefully chosen to reflect their purpose and actions within the package. We chose simple, descriptive names to maintain consistency with R’s convention of function names being indicative of their functionality, ensuring that users can quickly grasp their purpose without extensive documentation. The core functions of the package include:

  1. Data Simulation: Functions for generating multivariate data from Normal, Gamma, and skewed Gamma distributions with the simulate_group_data() function; this generates data using multivariate Normal or Gamma distributions.
  2. Statistical Parameters Estimation: The function calculate_stats_gaussian() calculates key statistical moments, including mean, variance, skewness, and correlation matrix, which are essential for understanding the underlying structure of the normally distributed data. This function computes the sample size, gamma distribution parameters (shape and rate), and Spearman correlation matrix for each group, based on the grouping column.
  3. Data Transformation Functions:
  4. Feature Selection Threshold Calculation: The PIP threshold for feature selection is computed using the calculate_pip_threshold() function. This function estimates the threshold based on a Four-Parameter Logistic Regression model (Richard’s curve), which adjusts for variations in data and sample size(RICHARDS 1959). The formula used to calculate the PIP threshold is:

In this equation:

A dynamic threshold function was derived from the logistic model to adjust thresholds based on CV and sample size(Hasan et al. 2024).

Software Implementation

The simBKMRdata software is implemented as an R package. It depends on the following packages: bkmr(Bobb 2022), fields( et al. 2021), base(2024) and MASS(Venables and Ripley 2002). The R software and these required packages can be obtained from the CRAN website. Furthermore, daily builds of package simBKMRdata are provided on the CRAN website [https://cran.r-project.org/web/packages/simBKMRdata/index.html]. It has been published under GPL version 3. The source code is available on GitHub at [https://github.com/khasa006/simBKMRdata].

To install the package from CRAN, use:

install.packages("simBKMRdata")

For GitHub installation of development version (potentially unstable), use:

devtools::install_github("khasa006/simBKMRdata")

Now we load the necessary libraries:

library(tidyverse)
library(simBKMRdata)
library(gt)

Example Analysis of Heavy Metal Exposure

We began by examining the real-world environmental exposure dataset derived from the cohort study led by Lucchini and colleagues. The dataset comprises measurements of five key metal concentrations in children—Cadmium, Mercury and Arsenic from urine, Lead from blood and Manganese from hair, along with corresponding intelligence quotient (IQ) scores, evaluated using the Wechsler Intelligence Scale for Children (WISC-IV), a standardized instrument for assessing children’s Full-Scale IQ (FSIQ)(Wechsler 2003; Orsini, Pezzuti, and Picone 2012; Grizzle 2011; Renzetti et al. 2021).

Data Exploration

Our example pediatric heavy metals exposure data set is included in the package as metalExposChildren_df. We include biological sex as a covariate. The data set includes other variables, but we ignore these for the scope of this work.

data("metalExposChildren_df")

analysisData_df <- metalExposChildren_df %>%
  select(QI, Cadmium:Manganese, Sex)

head(analysisData_df) %>%
  gt() %>%
  tab_header(
    title = "Top Rows of the Data"
  ) %>%
  fmt_number(
    decimals = 2
  ) %>%
  opt_table_outline()
Top Rows of the Data
QI Cadmium Mercury Arsenic Lead Manganese Sex
123.00 0.36 0.03 4.41 6.33 439.74 Female
123.00 1.01 4.12 6.21 10.12 119.55 Female
104.00 0.28 0.62 16.96 13.46 176.16 Female
135.00 0.27 0.53 2.69 2.94 110.49 Male
94.00 0.36 1.03 24.39 6.09 630.43 Male
119.00 0.44 0.03 5.68 8.22 170.04 Female

We show the density plots of these raw exposure levels:

ggplot(
  data = analysisData_df %>%
    select(Cadmium:Manganese) %>%
    pivot_longer(
      cols = everything(), names_to = "Metal", values_to = "Value"
    )
) + 
  aes(x = Value, fill = Metal) + 
  theme_minimal() + 
  theme(legend.position = "none") + 
  labs(
    title = "Histogram with Density Plot for Metals", 
    x = "Concentration", 
    y = "Density"
  ) +
  geom_histogram(
    aes(y = ..density..), bins = 30, alpha = 0.5, position = "identity"
  ) + 
  geom_density(aes(color = Metal), linewidth = 1) + 
  facet_wrap(~Metal, scales = "free")

Figure 1: Histogram with Density Plot for Metals

These exposure profiles are not Normally distributed, as we see in Figure 1. Visualization of the marginal distributions of these metal concentrations using histograms and kernel density plots revealed marked right-skewness across all analytes, suggesting the need for distribution-sensitive modeling approaches.

Parameter Estimation

To capture the empirical distributional features of the observed data, we implemented the calculate_stats_gamma() function to estimate parameters for Gamma distributions within male and female subgroups. This method-of-moments approach yielded group-specific descriptors, including sample sizes, means, Gamma shape and rate parameters, and Spearman correlation matrices among the metals. These parameters served as foundational elements for generating realistic synthetic data that preserve sex-specific distribution and correlation structures of the environmental exposures.

Given the skewness present in the metals data, the Multivariate Gamma distribution is a better fit if we plan to use parametric bootstrap techniques downstream in the analysis. The calculate_stats_gamma() function allows us to calculate group-specific estimators using the group_col argument (in our case, for Males and Females), and it has options with the using argument to estimate parameters via Method of Moments ("MoM") and generalized Maximum Likelihood Estimation ("gMLE")(Weiss and Wolfowitz 1966; El Adlouni et al. 2007). For this example, we calculate the Method of Moments parameter estimates of this estimated distribution via "MoM".

param_list <- calculate_stats_gamma(
  data_df = analysisData_df, 
  group_col = "Sex", 
  using = "MoM"
)

# Examine list contents
str(
  param_list,
  max.level = 2,
  give.attr = FALSE
)
List of 2
 $ Female:List of 5
  ..$ sampSize    : int 242
  ..$ mean_vec    : Named num [1:6] 101.893 0.554 0.701 24.566 10.439 ...
  ..$ sampCorr_mat: num [1:6, 1:6] 1 -0.0942 -0.0625 -0.114 -0.019 ...
  ..$ shape_num   : Named num [1:6] 44.706 0.343 0.242 0.897 1.033 ...
  ..$ rate_num    : Named num [1:6] 0.4388 0.6185 0.3444 0.0365 0.0989 ...
 $ Male  :List of 5
  ..$ sampSize    : int 186
  ..$ mean_vec    : Named num [1:6] 101.108 0.506 0.723 25.67 11.042 ...
  ..$ sampCorr_mat: num [1:6, 1:6] 1 -0.08159 -0.00122 -0.00365 0.16943 ...
  ..$ shape_num   : Named num [1:6] 46.1 0.757 0.282 0.174 0.544 ...
  ..$ rate_num    : Named num [1:6] 0.45595 1.49634 0.38964 0.00679 0.04928 ...

Data Simulation

With the estimated parameters, we used the simulate_group_data() function to generate synthetic multivariate Gamma-distributed exposure data and preserve the distributional properties of the original data. The simulated data maintained inter-variable correlations and group-specific skewness patterns, effectively replicating real-world exposure profiles. Based on the Method of Moments estimates that were calculated, we can draw a parametric bootstrap sample by passing in the list of group-specific parameter estimates found above:

gamma_samples <- simulate_group_data(
  param_list = param_list, 
  data_gen_fn = generate_mvGamma_data, 
  group_col_name = "Sex"
)

head(gamma_samples) %>% 
  gt() %>%
  tab_header(
    title = "Top Rows of the gamma Sample"
  ) %>%
  fmt_number(
    decimals = 2
  ) %>%
  opt_table_outline()
Top Rows of the gamma Sample
V1 V2 V3 V4 V5 V6 Sex
113.07 0.18 0.06 22.13 7.95 1,970.83 Female
88.04 0.03 0.08 21.92 3.51 1,025.22 Female
104.84 5.61 1.29 12.82 6.09 105.75 Female
113.67 0.01 0.06 19.65 9.84 635.99 Female
118.07 0.00 0.00 0.78 0.95 642.75 Female
97.72 0.05 0.02 6.14 0.68 836.97 Female

We account for the grouping variable column name with group_col_name, and we specify the Multivariate Gamma distribution by invoking the helper function generate_mvGamma_data(). This function can be repeatedly invoked to create hundreds of parametric bootstrap resamples to test future methodological developments. These simulation results are extensively discussed in Hasan et al. (2024) and Hasan et al. (2025).

Dynamic Threshold Calculation

As with any hypothesis-testing paradigm, we must set the threshold of rejection before we perform an experiment. For variable selection with BKMR, this is no different. In previous studies(Ibrahimou et al. 2024; Pesenti et al. 2023; Li et al. 2022; Zhang et al. 2019; Coker et al. 2018; Barbieri and Berger 2004; Zheng, Luo, and Xue 2024), a static threshold 50% was used to delineate which exposures were statistically related to the outcome. However, as shown in Hasan et al. (2024) and Hasan et al. (2025), using a static threshold often results in either highly conservative (underpowered) tests, or tests that have a poorly controlled α size (greater than the expected 5%).

To address the challenge of determining meaningful PIP thresholds under skewed data conditions, we applied the calculate_pip_threshold() function. This function dynamically estimates a threshold based on the dataset’s coefficient of variation (CV) and sample size using an estimate for the Four-Parameter Logistic Regression model (Richard’s Curve) described above.

pipThresh_num <- calculate_pip_threshold(
  y = na.omit(analysisData_df$QI)
)
pipThresh_num
[1] 0.11728

Therefore, based on this response vector’s distribution, we have shown that a threshold of approximately 0.117 should yield a better controlled, more powerful test than using a static threshold of 0.5.

Bayesian Kernel Machine Regression Analysis

To assess the relationship between metal exposures and cognitive outcomes, we will need to execute BKMR on the real data. To assess the relationship between metal exposures and cognitive outcomes, we employed BKMR using the kmbayes() function from the bkmr package. Prior to analysis, we omit the missing values and the metal concentration variables were log-transformed using the trans_log() function to mitigate skewness. Because we want to transform by log10(x+1), we will use the function trans_log() which uses base 10 and a one-unit positive shift by default. The exposure matrix thus comprised log-transformed values of the five metals, with IQ as the continuous outcome variable.

bkmrAnalysisData_df <- 
  analysisData_df %>%
  drop_na() %>%
  mutate_at(
    vars(Cadmium:Manganese), ~ trans_log(.) %>% as.vector
  )

The BKMR package expects the response as a vector and the exposure as a data.matrix() object. Covariates must be numeric, so we will recode Sex as a binary variable. To enhance computational efficiency, we select a matrix of knots that span the exposure space, allowing the Gaussian process to be approximated using a reduced set of representative points.

set.seed(2025)
expos <- bkmrAnalysisData_df %>%
  select(Cadmium:Manganese) %>% 
  data.matrix()
is_female <- (bkmrAnalysisData_df$Sex == "Female") + 0L

# Generate knot points using a cover design for Bayesian modeling
knots50 <- fields::cover.design(expos, nd = 50)$design

Now that we have set up our data and search grid, we can run BKMR using MCMC. The BKMR model was fit with 2,000 MCMC iterations, allowing for variable selection via computation of Posterior Inclusion Probabilities (PIPs).

library(bkmr)

set.seed(2025)

# Fit the BKMR model using MCMC
modelFit <- kmbayes(
  # Response variable
  y = bkmrAnalysisData_df$QI,
  # Exposure matrix (metal concentrations)
  Z = expos,     
  # Sex at birth covariate
  X = is_female,   
  # Number of MCMC iterations; set to at least 10000 for real analysis
  iter = 2000,         
  family = "gaussian", # Gaussian response
  verbose = FALSE,     # Output progress for each iteration
  varsel = TRUE,       # Perform variable selection
  knots = knots50      # Knot points generated earlier
)

To further explore the nature of these associations, we examined the exposure-response relationships between each metal and IQ using univariate predictor-response functions. The PredictorResponseUnivar() function generated marginal effects, where the value of each metal varied while keeping the remaining exposures fixed at the 75th percentile.

# Generate univariate predictor-response relationships
predRespUnivar <- PredictorResponseUnivar(fit = modelFit)

# Plot univariate predictor-response functions
predRespUnivarPlot <- ggplot(
  predRespUnivar, 
  aes(
    z, 
    est, 
    ymin = est - 1.96 * se, 
    ymax = est + 1.96 * se
  )
) + 
  geom_smooth(stat = "identity") +  # Add smooth lines with confidence intervals
  facet_wrap(~ variable) +          # Create separate plots for each variable
  ylab("h(z)")                      # Label the y-axis

Results

We want to know which metals are related to cognitive ability, as measured by IQ. To extraction the PIPs from the modelFit model object, we use the ExtractPIPs() function.

Table 1: Posterior Inclusion Probabilities (PIPs) for Metal Exposures
Posterior Inclusion Probabilities (PIPs)
variable PIP
Cadmium 0.993
Lead 0.611
Manganese 0.307
Mercury 0.276
Arsenic 0.151

Notably, Cadmium and Lead emerged as the strongest contributors to cognitive development outcomes, as indicated by elevated PIP values (Table 1) substantially above the baseline. Cadmium demonstrated a near-certain inclusion probability (PIP = 0.993), suggesting a strong association with IQ outcomes, while Lead also displayed a high PIP (0.611), reinforcing previous epidemiological findings on its neurotoxic effects(Bakulski et al. 2020; Sanders, Claus Henn, and Wright 2015; Henn et al. 2012; Ciesielski et al. 2012).

However, proper feature selection in the BKMR framework requires a calibrated, data-driven threshold rather than a fixed conventional cutoff (e.g., 0.5). Using the calculate_pip_threshold() function, we computed a dynamic PIP threshold of 0.117. This threshold adapts to sample size and characteristics and adjusts the selection criterion accordingly, making it especially suitable for non-normal data distributions.

By comparing each metal’s PIP against the calculated threshold, we identified a refined subset of significant exposures, which included not only cadmium and lead but also manganese, mercury and arsenic. This approach allows for a statistically principled method of feature selection within the BKMR framework, ensuring robust and interpretable results even when the data deviate from normality.

Figure 2 shows the relationship of each individual metals with the outcome, where all of the other exposures are fixed to a particular quantile (e.g., 75th percentile). The results indicate non-linear associations for several metals. For instance, Cadmium exhibited a consistently negative association with IQ across a wide range of concentrations, whereas the relationship for Lead suggested a decline at higher concentrations. Associations for Manganese, Mercury, and Arsenic were comparatively weaker but statistically distinguishable from zero, consistent with their PIP values exceeding the dynamic threshold. The consistent patterns identified between PIP-based feature selection and univariate exposure-response functions validate the efficacy of the dynamic threshold method for BKMR feature selection in practical, non-normal environmental datasets.

# Plot univariate predictor-response functions
predRespUnivarPlot <- ggplot(
  predRespUnivar, 
  aes(
    z, 
    est, 
    ymin = est - 1.96 * se, 
    ymax = est + 1.96 * se
  )
) + 
  geom_smooth(stat = "identity") +  # Add smooth lines with confidence intervals
  facet_wrap(~ variable) +          # Create separate plots for each variable
  ylab("h(z)")                      # Label the y-axis

predRespUnivarPlot

Figure 2: Univariate exposure-response function plot

Discussion

The R package we developed is a valuable tool for simulating multivariate environmental exposure data and estimating feature selection thresholds for BKMR. By supporting multivariate normal and multivariate skewed Gamma distributions, the package is particularly well-suited for handling non-normal environmental exposure data, which is common in health studies.

Implications of Pediatric Heavy Metal Exposure

In addition to describing the software package, we present an analysis of an anonymous but original real-world dataset (which includes measurements of metal exposure and children’s IQ scores). Using BKMR within this framework allowed us to rigorously assess the relationships between multiple environmental exposures and cognitive development. The analysis of the original dataset adds to the growing body of literature on the impacts of heavy metal exposure on cognitive outcomes(Porru et al. 2024; Karri, Schuhmacher, and Kumar 2016; Rafiee et al. 2020). However, supporting literature is sparse when applied to pediatric exposure cases(Ouyang et al. 2023; Saxena et al. 2022; Kim et al. 2023; Bora et al. 2019), so this analysis offers a valuable corroborating perspective.

The results highlighted cadmium, and lead as significant contributors to cognitive decline—findings well supported by prior epidemiological research highlighting their neurotoxic effects (Bakulski et al. 2020; Sanders, Claus Henn, and Wright 2015; Henn et al. 2012; Ciesielski et al. 2012). In addition, the dynamic, data-driven PIP threshold, included additional metals - manganese, mercury and arsenic, which showed strong evidence for neurotoxicity(Balachandran et al. 2020; Levin-Schwartz et al. 2021; Oliveira et al. 2018; Grandjean and Landrigan 2006; Tian et al. 2025; Rosado et al. 2007). Notably, if we relied on a static threshold (e.g., PIP > 0.5), these additional exposures would have been incorrectly excluded, potentially underestimating the true burden of environmental neurotoxicity. The univariate exposure-response plots further corroborated these associations, providing visualization of dose-response relationships for each metal with IQ outcomes. These results underscore the advantage of the dynamic threshold approach: it enhances the sensitivity of feature selection while maintaining control over false positive rates, thereby improving the robustness and interpretability of BKMR analyses in complex, real-world exposure settings.

Utility in Future Methodological Research

In this analysis, we show how to simulate multiple datasets to assess the robustness of new methodological findings under various conditions of environmental exposure. The purpose of generating these datasets is to test the applicability of the BKMR model and feature selection process in the presence of non-normal (skewed) data. By simulating multiple datasets from both multivariate normal and skewed Gamma distributions, we can compare how the model performs across different data distributions. This approach ensures that the conclusions drawn from a target (real-world) dataset are robust and generalizable, not merely an artifact of that specific dataset. By simulating multiple datasets and applying the same analysis framework, we test the reproducibility of the results and gain confidence in the model’s ability to handle different data conditions.

We showed how this package can be used to generate multiple new simulated datasets which allow us to explore the model’s performance under controlled, synthetic conditions that mimic real-world skewed exposure data. We could then analyze the multiple generated datasets using the same statistical and modeling procedures we would plan to apply to any new data. Specifically, we calculate the statistical moments for each simulated dataset, then use those to generate exposure data. The results from the multiple datasets can then be summarized by comparing the feature selection outcomes (PIPs) across different simulations. These PIPs would help identify which metals have the most significant association with cognitive outcomes. We could aggregate these findings and provide a summary of how consistent or variable the results are across different simulations, strengthening the validity of newly developed modeling algorithms.

Conclusion

This R package provides essential tools for simulating multivariate environmental exposure data, estimating statistical moments, and calculating PIP thresholds for feature selection in Bayesian Kernel Machine Regression. While BKMR is an excellent tool for feature selection, its reliance on normality can limit its application to real-world datasets. The ability to simulate data from multivariate normal and skewed Gamma distributions makes the package particularly valuable for environmental health studies where data often deviate from normality. Our package fills this gap by enabling researchers to model and analyze non-normally distributed data, ensuring that feature selection remains robust even in complex scenarios.

Bakulski, Kelly M., Young Ah Seo, Ruby C. Hickman, Daniel Brandt, Harita S. Vadari, Howard Hu, and Sung Kyun Park. 2020. “Heavy Metals Exposure and Alzheimers Disease and Related Dementias.” Journal of Alzheimer’s Disease 76 (4): 1215–42. https://doi.org/10.3233/jad-200282.
Balachandran, Rekha C., Somshuvra Mukhopadhyay, Danielle McBride, Jennifer Veevers, Fiona E. Harrison, Michael Aschner, Erin N. Haynes, and Aaron B. Bowman. 2020. “Brain Manganese and the Balance Between Essential Roles and Neurotoxicity.” Journal of Biological Chemistry 295 (19): 6312–29. https://doi.org/10.1074/jbc.rev119.009453.
Barbieri, Maria Maddalena, and James O. Berger. 2004. “Optimal Predictive Model Selection.” The Annals of Statistics 32 (3). https://doi.org/10.1214/009053604000000238.
Bobb, Jennifer F. 2022. “Bkmr: Bayesian Kernel Machine Regression.” https://CRAN.R-project.org/package=bkmr.
Bobb, Jennifer F., Birgit Claus Henn, Linda Valeri, and Brent A. Coull. 2018. “Statistical Software for Analyzing the Health Effects of Multiple Concurrent Exposures via Bayesian Kernel Machine Regression.” Environmental Health 17 (1). https://doi.org/10.1186/s12940-018-0413-y.
Bobb, Jennifer F., Linda Valeri, Birgit Claus Henn, David C. Christiani, Robert O. Wright, Maitreyi Mazumdar, John J. Godleski, and Brent A. Coull. 2014. “Bayesian Kernel Machine Regression for Estimating the Health Effects of Multi-Pollutant Mixtures.” Biostatistics 16 (3): 493–508. https://doi.org/10.1093/biostatistics/kxu058.
Bora, Béatrice Koba, Ana Luiza Ramos-Crawford, Alla Sikorskii, Michael Joseph Boivin, Didier Malamba Lez, Dieudonné Mumba-Ngoyi, Abdon Mukalay Wa Mukalay, Daniel Okitundu-Luwa, and Desiré Tshala-Katumbay. 2019. “Concurrent Exposure to Heavy Metals and Cognition in School-Age Children in Congo-Kinshasa: A Complex Overdue Research Agenda.” Brain Research Bulletin 145 (February): 81–86. https://doi.org/10.1016/j.brainresbull.2018.06.013.
Ciesielski, Timothy, Jennifer Weuve, David C. Bellinger, Joel Schwartz, Bruce Lanphear, and Robert O. Wright. 2012. “Cadmium Exposure and Neurodevelopmental Outcomes in U.S. Children.” Environmental Health Perspectives 120 (5): 758–63. https://doi.org/10.1289/ehp.1104152.
Coker, Eric, Jonathan Chevrier, Stephen Rauch, Asa Bradman, Muvhulawa Obida, Madelein Crause, Riana Bornman, and Brenda Eskenazi. 2018. “Association Between Prenatal Exposure to Multiple Insecticides and Child Body Weight and Body Composition in the VHEMBE South African Birth Cohort.” Environment International 113 (April): 122–32. https://doi.org/10.1016/j.envint.2018.01.016.
Devick, Katrina L., Jennifer F. Bobb, Maitreyi Mazumdar, Birgit Claus Henn, David C. Bellinger, David C. Christiani, Robert O. Wright, Paige L. Williams, Brent A. Coull, and Linda Valeri. 2022. “Bayesian Kernel Machine Regression-Causal Mediation Analysis.” Statistics in Medicine 41 (5): 860–76. https://doi.org/10.1002/sim.9255.
Douglas Nychka, Reinhard Furrer, John Paige, and Stephan Sain. 2021. “Fields: Tools for Spatial Data.” https://github.com/dnychka/fieldsRPackage.
El Adlouni, S., T. B. M. J. Ouarda, X. Zhang, R. Roy, and B. Bobée. 2007. “Generalized Maximum Likelihood Estimators for the Nonstationary Generalized Extreme Value Model.” Water Resources Research 43 (3). https://doi.org/10.1029/2005wr004545.
Gerwinn, Sebastian. 2010. “Bayesian Inference for Generalized Linear Models for Spiking Neurons.” Frontiers in Computational Neuroscience 4. https://doi.org/10.3389/fncom.2010.00012.
Grandjean, P, and PJ Landrigan. 2006. “Developmental Neurotoxicity of Industrial Chemicals.” The Lancet 368 (9553): 2167–78. https://doi.org/10.1016/s0140-6736(06)69665-7.
Grizzle, Renee. 2011. “Wechsler Intelligence Scale for Children, Fourth Edition.” In, 1553–55. Springer US. https://doi.org/10.1007/978-0-387-79061-9_3066.
Hasan, Kazi Tanvir, Gabriel Odom, Zoran Bursac, and Boubakari Ibrahimou. 2024. “The Sensitivity of Bayesian Kernel Machine Regression (BKMR) to Data Distribution: A Comprehensive Simulation Analysis.” https://doi.org/10.48550/ARXIV.2411.00286.
Hasan, Kazi Tanvir, Gabriel Odom, Zoran Bursac, Roberto Lucchini, and Boubakari Ibrahimou. 2025. “Enhancing Bayesian Kernel Machine Regression: A Dynamic Thresholding Framework to Address Variability and Skewness in High-Dimensional Environmental Health Data.” http://dx.doi.org/10.1101/2025.04.14.25325822.
Henn, Birgit Claus, Lourdes Schnaas, Adrienne S. Ettinger, Joel Schwartz, Héctor Lamadrid-Figueroa, Mauricio Hernández-Avila, Chitra Amarasiriwardena, et al. 2012. “Associations of Early Childhood Manganese and Lead Coexposure with Neurodevelopment.” Environmental Health Perspectives 120 (1): 126–31. https://doi.org/10.1289/ehp.1003300.
Horton, Lindsey M., Mary E. Mortensen, Yulia Iossifova, Marlena M. Wald, and Paula Burgess. 2013. “What Do We Know of Childhood Exposures to Metals (Arsenic, Cadmium, Lead, and Mercury) in Emerging Market Countries?” International Journal of Pediatrics 2013: 1–13. https://doi.org/10.1155/2013/872596.
Ibrahimou, Boubakari, Kazi Tanvir Hasan, Shelbie Burchfield, Hamisu Salihu, Yiliang Zhu, Getachew Dagne, Mario De La Rosa, Assefa Melesse, Roberto Lucchini, and Zoran Bursac. 2024. “Assessing the Risk of Heart Attack: A Bayesian Kernel Machine Regression Analysis of Heavy Metal Mixtures.” http://dx.doi.org/10.21203/rs.3.rs-4456611/v1.
Karri, Venkatanaidu, Marta Schuhmacher, and Vikas Kumar. 2016. “Heavy Metals (Pb, Cd, As and MeHg) as Risk Factors for Cognitive Dysfunction: A General Review of Metal Mixture Mechanism in Brain.” Environmental Toxicology and Pharmacology 48 (December): 203–13. https://doi.org/10.1016/j.etap.2016.09.016.
Kim, Minkeun, Chulyong Park, Joon Sakong, Shinhee Ye, So young Son, and Kiook Baek. 2023. “Association of Heavy Metal Complex Exposure and Neurobehavioral Function of Children.” Annals of Occupational and Environmental Medicine 35 (1). https://doi.org/10.35371/aoem.2023.35.e23.
Levin-Schwartz, Yuri, Birgit Claus Henn, Chris Gennings, Brent A. Coull, Donatella Placidi, Megan K. Horton, Donald R. Smith, Roberto G. Lucchini, and Robert O. Wright. 2021. “Integrated Measures of Lead and Manganese Exposure Improve Estimation of Their Joint Effects on Cognition in Italian School-Age Children.” Environment International 146 (January): 106312. https://doi.org/10.1016/j.envint.2020.106312.
Li, Haomin, Wenying Deng, Raphael Small, Joel Schwartz, Jeremiah Liu, and Liuhua Shi. 2022. “Health Effects of Air Pollutant Mixtures on Overall Mortality Among the Elderly Population Using Bayesian Kernel Machine Regression (BKMR).” Chemosphere 286 (January): 131566. https://doi.org/10.1016/j.chemosphere.2021.131566.
Meulen EA, van der Meulen PJ ( van der, Raymond K. 2021. “Consistent Confidence Limits, P Values, and Power of the Non-Conservative, Size – α Modified Fisher Exact Test.” Journal of Biostatistics and Biometric Applications 6 (1): 102. https://www.annexpublishers.com/articles/JBIA/6102-Consistent-Confidence-Limits-P-Values.pdf.
Oliveira, Cláudia S., Pablo A. Nogara, Daniel M. P. Ardisson-Araújo, Michael Aschner, João B.T. Rocha, and José G. Dórea. 2018. “Neurodevelopmental Effects of Mercury.” In, 27–86. Elsevier. https://doi.org/10.1016/bs.ant.2018.03.005.
Orsini, A., L. Pezzuti, and L. Picone. 2012. “Wechsler Intelligence Scale for Childrenfourth Edition; Italian Adaptation.” American Psychological Association (APA). https://doi.org/10.1037/t74943-000.
Ouyang, Lu, Qi Li, Shaoqi Rao, Rui Su, Yanhui Zhu, Guihua Du, Jie Xie, Fankun Zhou, Chang Feng, and Guangqin Fan. 2023. “Cognitive Outcomes Caused by Low-Level Lead, Cadmium, and Mercury Mixture Exposure at Distinct Phases of Brain Development.” Food and Chemical Toxicology 175 (May): 113707. https://doi.org/10.1016/j.fct.2023.113707.
Pesenti, Nicola, Piero Quatto, Elena Colicino, Raffaella Cancello, Massimo Scacchi, and Antonella Zambon. 2023. “Comparative Efficacy of Three Bayesian Variable Selection Methods in the Context of Weight Loss in Obese Women.” Frontiers in Nutrition 10 (July). https://doi.org/10.3389/fnut.2023.1203925.
Porru, Simona, Ana Esplugues, Sabrina Llop, and Juana María Delgado-Saborit. 2024. “The Effects of Heavy Metal Exposure on Brain and Gut Microbiota: A Systematic Review of Animal Studies.” Environmental Pollution 348 (May): 123732. https://doi.org/10.1016/j.envpol.2024.123732.
R Core Team. 2024. “R: A Language and Environment for Statistical Computing.” https://www.R-project.org/.
Rafiee, Ata, Juana Maria Delgado-Saborit, Peter D. Sly, Bernadette Quémerais, Fallah Hashemi, Sadaf Akbari, and Mohammad Hoseini. 2020. “Environmental Chronic Exposure to Metals and Effects on Attention and Executive Function in the General Population.” Science of The Total Environment 705 (February): 135911. https://doi.org/10.1016/j.scitotenv.2019.135911.
Renzetti, Stefano, Giuseppa Cagna, Stefano Calza, Michele Conversano, Chiara Fedrighi, Giovanni Forte, Augusto Giorgino, et al. 2021. “The Effects of the Exposure to Neurotoxic Elements on Italian Schoolchildren Behavior.” Scientific Reports 11 (1). https://doi.org/10.1038/s41598-021-88969-z.
RICHARDS, F. J. 1959. “A Flexible Growth Function for Empirical Use.” Journal of Experimental Botany 10 (2): 290–301. https://doi.org/10.1093/jxb/10.2.290.
Rosado, Jorge L., Dolores Ronquillo, Katarzyna Kordas, Olga Rojas, Javier Alatorre, Patricia Lopez, Gonzalo Garcia-Vargas, María del Carmen Caamaño, Mariano E. Cebrián, and Rebecca J. Stoltzfus. 2007. “Arsenic Exposure and Cognitive Performance in Mexican Schoolchildren.” Environmental Health Perspectives 115 (9): 1371–75. https://doi.org/10.1289/ehp.9961.
Sanders, Alison P., Birgit Claus Henn, and Robert O. Wright. 2015. “Perinatal and Childhood Exposure to Cadmium, Manganese, and Metal Mixtures and Effects on Cognition and Behavior: A Review of Recent Literature.” Current Environmental Health Reports 2 (3): 284–94. https://doi.org/10.1007/s40572-015-0058-8.
Saxena, Roheeni, Mary Gamble, Gail A. Wasserman, Xinhua Liu, Faruque Parvez, Ana Navas-Acien, Tariqul Islam, et al. 2022. “Mixed Metals Exposure and Cognitive Function in Bangladeshi Adolescents.” Ecotoxicology and Environmental Safety 232 (March): 113229. https://doi.org/10.1016/j.ecoenv.2022.113229.
Tian, Yumei, Qi Hou, Mingyue Zhang, Er Gao, and Yue Wu. 2025. “Exposure to Arsenic and Cognitive Impairment in Children: A Systematic Review.” Edited by Antonio Peña-Fernández. PLOS ONE 20 (2): e0319104. https://doi.org/10.1371/journal.pone.0319104.
Tolins, Molly, Mathuros Ruchirawat, and Philip Landrigan. 2014. “The Developmental Neurotoxicity of Arsenic: Cognitive and Behavioral Consequences of Early Life Exposure.” Annals of Global Health 80 (4): 303. https://doi.org/10.1016/j.aogh.2014.09.005.
Trasande, Leonardo, and Yinghua Liu. 2011. “Reducing The Staggering Costs Of Environmental Disease In Children, Estimated At $76.6 Billion In 2008.” Health Affairs 30 (5): 863–70. https://doi.org/10.1377/hlthaff.2010.1239.
Tyler, Christina R., and Andrea M. Allan. 2014. “The Effects of Arsenic Exposure on Neurological and Cognitive Dysfunction in Human and Rodent Studies: A Review.” Current Environmental Health Reports 1 (2): 132–47. https://doi.org/10.1007/s40572-014-0012-1.
Venables, W. N., and B. D. Ripley. 2002. “Modern Applied Statistics with s.” https://www.stats.ox.ac.uk/pub/MASS4/.
Wechsler, David. 2003. “Wechsler Intelligence Scale for Children, Fourth Edition.” American Psychological Association (APA). https://doi.org/10.1037/t15174-000.
Weiss, L., and J. Wolfowitz. 1966. “Generalized Maximum Likelihood Estimators.” Theory of Probability & Its Applications 11 (1): 58–81. https://doi.org/10.1137/1111003.
Wilson, Ander, Hsiao-Hsien Leon Hsu, Yueh-Hsiu Mathilda Chiu, Robert O. Wright, Rosalind J. Wright, and Brent A. Coull. 2022. “Kernel Machine and Distributed Lag Models for Assessing Windows of Susceptibility to Environmental Mixtures in Childrens Health Studies.” The Annals of Applied Statistics 16 (2). https://doi.org/10.1214/21-aoas1533.
Zhang, Yuqing, Tianyu Dong, Weiyue Hu, Xu Wang, Bo Xu, Zhongning Lin, Tim Hofer, et al. 2019. “Association Between Exposure to a Mixture of Phenols, Pesticides, and Phthalates and Obesity: Comparison of Three Statistical Models.” Environment International 123 (February): 325–36. https://doi.org/10.1016/j.envint.2018.11.076.
Zheng, Zitian, Huanhuan Luo, and Qingyun Xue. 2024. “The Association of Urinary Heavy Metal Exposure with Frailty Susceptibility and Mortality in Middle-Aged and Older Adults: A Population-Based Study.” Archives of Public Health 82 (1). https://doi.org/10.1186/s13690-024-01275-8.