Dimensionality reduction raises a fundamental question: how many components are real?
A scree plot might show a clear “elbow” at 3 components, but is that structure genuine or could it arise from noise? Classical heuristics provide guidance but not formal inference. Permutation testing fills this gap by answering: “What would this statistic look like if there were no true structure?”
The procedure is straightforward:

1. Compute the statistic of interest on the original data (e.g., variance explained by PC1).
2. Shuffle the data to destroy the structure being tested.
3. Recompute the statistic on the shuffled data.
4. Repeat steps 2–3 many times to build a null distribution.
5. Compare the original statistic to this distribution.
If the observed value is more extreme than (say) 95% of the permuted values, we reject the null hypothesis at \(\alpha = 0.05\).
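To make the recipe concrete, here is a minimal hand-rolled version for the variance explained by PC1, using the column shuffle described later in this section. This is a sketch only; perm_test(), introduced below, adds per-component statistics, early stopping, and parallelism.

# Hand-rolled permutation test for variance explained by PC1 (sketch)
set.seed(1)
X <- scale(as.matrix(iris[, 1:4]), center = TRUE, scale = FALSE)

pc1_var <- function(M) {
  d <- svd(M)$d
  d[1]^2 / sum(d^2)              # fraction of total variance on PC1
}

obs <- pc1_var(X)

nperm <- 199
null_stats <- replicate(nperm, {
  X_perm <- apply(X, 2, sample)  # permute each column independently
  pc1_var(X_perm)
})

# Add-one p-value: treats the observed data as one more permutation
p_value <- (1 + sum(null_stats >= obs)) / (1 + nperm)

The add-one form guarantees \(p \ge 1/(\text{nperm} + 1)\), which is why 199 permutations can never yield a p-value below 0.005.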
The critical design choice is how to shuffle. Different
analyses require different permutation schemes—shuffling columns for
PCA, shuffling labels for classification, shuffling one block for
cross-block methods. The perm_test() function handles these
details automatically based on the model type.
The interface is consistent across model types: fit your model, then
call perm_test().
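The examples below assume a PCA model mod_pca fit to the iris measurements held in X_iris. A minimal setup sketch, assuming multivarious's pca() fitter (the exact name and arguments may differ; see the package reference):

library(multivarious)

# Assumed setup: the four iris measurements as the data matrix
X_iris  <- as.matrix(iris[, 1:4])
mod_pca <- pca(X_iris, ncomp = 4)  # fitter name and arguments assumed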
Now test whether each principal component captures more variance than expected by chance:
set.seed(1)
pt_pca <- perm_test(mod_pca,
                    X = X_iris,
                    nperm = 199,
                    comps = 3,
                    parallel = FALSE)
#> Pre-calculating reconstructions for stepwise testing...
#> Running 199 permutations sequentially for up to 3 PCA components (alpha=0.050, serial)...
#> Testing Component 1/3...
#> Testing Component 2/3...
#> Testing Component 3/3...
#> Component 3 p-value (0.055) > alpha (0.050). Stopping sequential testing.

The key arguments:

- X — the original data (required for re-shuffling)
- nperm — number of permutations (199 here for speed; use 999+ in practice)
- comps — how many components to test sequentially
- parallel — enable parallel execution via the future package

The result object contains a component_results table with p-values and confidence intervals:
print(pt_pca$component_results)
#> # A tibble: 3 × 5
#>    comp observed  pval lower_ci upper_ci
#>   <int>    <dbl> <dbl>    <dbl>    <dbl>
#> 1     1    0.925 0.005    0.682    0.689
#> 2     2    0.704 0.01     0.609    0.689
#> 3     3    0.766 0.055    0.668    0.785

Components are tested sequentially. By default, testing stops when a
component is non-significant (\(p >
0.05\)), since later components are unlikely to be meaningful if
an earlier one fails. Set alpha = 1 to force testing all
components.
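For example, assuming alpha is passed directly to perm_test() (it is echoed in the progress messages above):

# Test all three components even if an earlier one is non-significant
pt_all <- perm_test(mod_pca, X = X_iris, nperm = 199, comps = 3, alpha = 1)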
The perm_test() function dispatches to specialized
methods based on the model class. Each method uses an appropriate test
statistic and shuffle scheme.
For PCA, the null hypothesis is that the variables are independent (no correlation structure). The shuffle scheme permutes each column independently, destroying any covariance while preserving marginal distributions.
Statistic: Fraction of remaining variance explained by component \(a\): \[F_a = \frac{\lambda_a}{\sum_{j \ge a}\lambda_j}\]
This tests whether component \(a\) explains more variance than expected given the remaining variance pool. See Vitale et al. (2017) for theoretical background.
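In code, the statistic is a one-liner over the eigenvalues (squared singular values); a sketch using base R's prcomp() for illustration:

# F_a: fraction of the remaining variance pool explained by component a
remaining_var_frac <- function(lambda, a) {
  lambda[a] / sum(lambda[a:length(lambda)])
}

lambda <- prcomp(iris[, 1:4])$sdev^2
remaining_var_frac(lambda, 2)  # component 2 against the j >= 2 pool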
For discriminant analysis, the null hypothesis is that class labels are unrelated to the features. The shuffle scheme permutes the class labels.
Statistic: Classification accuracy (using LDA or nearest-centroid).
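A hand-rolled sketch of this scheme with a nearest-centroid statistic (illustrative only; the package refits the discriminant model for you):

# Nearest-centroid accuracy for a given labeling
centroid_acc <- function(X, y) {
  centroids <- apply(X, 2, tapply, y, mean)   # class-by-feature means
  k <- nrow(centroids)
  d <- as.matrix(dist(rbind(centroids, X)))   # observation/centroid distances
  pred <- rownames(centroids)[apply(d[-(1:k), 1:k, drop = FALSE], 1, which.min)]
  mean(pred == y)
}

X <- as.matrix(iris[, 1:4])
y <- iris$Species
obs <- centroid_acc(X, y)
null_acc <- replicate(199, centroid_acc(X, sample(y)))  # shuffle labels only
(1 + sum(null_acc >= obs)) / (1 + 199)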
For two-block methods, the null hypothesis is that blocks X and Y are unrelated. The shuffle scheme permutes rows of one block only, breaking the correspondence while preserving within-block structure.
Statistic: Mean squared error of cross-block prediction (X → Y).
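A sketch of the scheme, splitting iris into two hypothetical blocks and using ordinary least squares as a stand-in for the cross-block predictor:

# Cross-block prediction MSE (X -> Y) with an OLS stand-in predictor
cross_mse <- function(X, Y) {
  B <- lm.fit(cbind(1, X), Y)$coefficients
  mean((Y - cbind(1, X) %*% B)^2)
}

X <- as.matrix(iris[, 1:2])   # block 1
Y <- as.matrix(iris[, 3:4])   # block 2, same rows
obs_mse <- cross_mse(X, Y)

# Permute rows of Y only: the X-Y correspondence is broken, but the
# within-block structure of each block is untouched
null_mse <- replicate(199, cross_mse(X, Y[sample(nrow(Y)), , drop = FALSE]))

# Low MSE signals association, so count permutations that predict as well
(1 + sum(null_mse <= obs_mse)) / (1 + 199)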
For multi-block analyses, the null hypothesis is that blocks share no common structure. Each block is shuffled independently.
Statistic: Consensus measures (leading eigenvalue or mean absolute correlation between block scores).
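The shuffle itself is a one-liner over the list of blocks (a sketch, assuming the blocks live in a list such as the Xlist mentioned later):

# Shuffle each block's rows independently; any consensus among the
# permuted blocks can only arise by chance
Xlist <- list(b1 = as.matrix(iris[, 1:2]), b2 = as.matrix(iris[, 3:4]))
Xlist_perm <- lapply(Xlist, function(B) B[sample(nrow(B)), , drop = FALSE])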
The defaults work well for standard analyses, but you can override any component of the permutation machinery.
- measure_fun(model_perm, comp_idx, ...) — Computes the test statistic from a model fitted on permuted data. Called once per component.
- shuffle_fun(data, ...) — Permutes the data. Must return data in the same structure as the input.
- fit_fun(Xtrain, ...) — Re-fits the model on permuted data. Used by cross_projector and discriminant_projector when the default fitter (e.g., MASS::lda) is not appropriate.
Suppose you want to test whether the first two PCs jointly
explain more variance than chance. You can supply a custom
measure_fun:
my_pca_stat <- function(model_perm, comp_idx, ...) {
  # Only compute the joint statistic when testing component 2
  if (comp_idx == 2 && length(model_perm$sdev) >= 2) {
    sum(model_perm$sdev[1:2]^2) / sum(model_perm$sdev^2)
  } else if (comp_idx == 1) {
    model_perm$sdev[1]^2 / sum(model_perm$sdev^2)
  } else {
    NA_real_
  }
}

# Illustrative call (using default measure here for simplicity)
pt_pca_custom <- perm_test(mod_pca, X = X_iris, nperm = 50, comps = 2,
                           parallel = FALSE)
#> Pre-calculating reconstructions for stepwise testing...
#> Running 50 permutations sequentially for up to 2 PCA components (alpha=0.050, serial)...
#> Testing Component 1/2...
#> Testing Component 2/2...
print(pt_pca_custom$component_results)
#> # A tibble: 2 × 5
#>    comp observed   pval lower_ci upper_ci
#>   <int>    <dbl>  <dbl>    <dbl>    <dbl>
#> 1     1    0.925 0.0196    0.682    0.692
#> 2     2    0.704 0.0196    0.619    0.690

For testing a single combined statistic rather than sequential
components, set comps = 1 and have your
measure_fun compute the combined value directly.
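For example, assuming measure_fun is accepted directly by perm_test() (the hook descriptions above suggest so; check the function's signature), a joint test of PCs 1–2 might look like:

# Hypothetical combined statistic: joint variance of PCs 1-2
joint_stat <- function(model_perm, comp_idx, ...) {
  sum(model_perm$sdev[1:2]^2) / sum(model_perm$sdev^2)
}
pt_joint <- perm_test(mod_pca, X = X_iris, nperm = 199, comps = 1,
                      measure_fun = joint_stat)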
Permutation tests are embarrassingly parallel—each permutation is
independent. The perm_test() methods use the
future framework, so you control parallelism through your
plan:
library(future)
plan(multisession, workers = 4)
pt_pca_parallel <- perm_test(mod_pca, X = X_iris,
                             nperm = 999,
                             comps = 3,
                             parallel = TRUE)
plan(sequential)

With 4 workers and 999 permutations, each worker handles ~250 permutations concurrently.
When \(p \gg n\), computing full
eigendecompositions repeatedly can be slow. Keep comps
small (you rarely need to test more than 10–20 components), and consider
faster SVD backends if available.
The default shuffles assume observations are exchangeable under the
null. This breaks down for time-series, spatial data, or blocked
designs. In these cases, provide a custom shuffle_fun that
respects the dependence structure (e.g., block permutation, circular
shifts).
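For instance, a shuffle_fun for a blocked design could move whole blocks while keeping rows within a block together. A sketch, where block_id is a hypothetical vector of block memberships, captured by closure so the returned function matches the shuffle_fun(data, ...) signature:

make_block_shuffle <- function(block_id) {
  function(data, ...) {
    # Reorder whole blocks at random; within-block ordering is preserved
    blocks <- split(seq_len(nrow(data)), block_id)
    new_order <- unlist(blocks[sample(length(blocks))], use.names = FALSE)
    data[new_order, , drop = FALSE]
  }
}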
The component_results table includes empirical 95% CIs
derived from the permutation distribution. These quantify uncertainty in
the test statistic under the null.
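A plausible construction for such an interval is the empirical quantiles of the null draws; with the null_stats vector from the hand-rolled sketch earlier:

# Empirical 95% interval of the statistic under the null
quantile(null_stats, c(0.025, 0.975))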
Most methods need the original data (X, Y,
or Xlist) to perform shuffling. Always pass it
explicitly.
The older perm_ci.pca() is deprecated. Use
perm_test() instead—confidence intervals are included in
the results table.
By providing a unified perm_test interface,
multivarious allows researchers to apply robust,
data-driven significance testing across a range of multivariate modeling
techniques.