---
title: SVD wrapper, PCA and the bi_projector
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{SVD wrapper, PCA and the bi_projector}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
params:
family: red
css: albers.css
resource_files:
- albers.css
- albers.js
includes:
in_header: |-
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(collapse = TRUE, comment = "#>", fig.width = 6, fig.height = 4)
library(multivarious) # Ensure pca(), svd_wrapper(), bi_projector, etc. are available
library(ggplot2) # for plots below
```
# 1. Why wrap SVD at all?
There are six popular SVD engines in R (base::svd, corpcor, RSpectra,
irlba, rsvd, svd (PROPACK)) – each with its own argument list,
naming conventions and edge-cases (some refuse to return the full rank,
others crash on tall-skinny matrices).
`svd_wrapper()` smooths that out:
* identical call-signature no matter the backend,
* automatic pre-processing (centre / standardise) via the same pipeline
interface shown in the previous vignette,
* returns a `bi_projector` – an S3 class that stores loadings `v`,
scores `s`, singular values `sdev` plus the fitted pre-processor.
That means immediate access to verbs such as `project()`,
`reconstruct()`, `truncate()`, `partial_project()`.
```{r svd_wrapper_example}
set.seed(1)
X <- matrix(rnorm(35*10), 35, 10) # 35 obs × 10 vars
sv_fast <- svd_wrapper(X, ncomp = 5, preproc = center(), method = "fast")
# irlba backend (if installed) gives identical results
sv_irlba <- if (requireNamespace("irlba", quietly = TRUE)) {
suppressWarnings(svd_wrapper(X, ncomp = 5, preproc = center(), method = "irlba"))
}
# Same downstream code works for both objects:
head(scores(sv_fast)) # 35 × 5
if (!is.null(sv_irlba)) {
all.equal(scores(sv_fast), scores(sv_irlba))
}
```
# 2. A one-liner `pca()`
Most people really want PCA, so `pca()` is a thin wrapper that
1. calls `svd_wrapper()` with sane defaults,
2. adds the S3 class "pca" (printing, screeplot, biplot, permutation test, …).
```{r pca_example}
data(iris)
X_iris <- as.matrix(iris[, 1:4])
pca_fit <- pca(X_iris, ncomp = 4) # defaults to method = "fast", preproc=center()
print(pca_fit)
```
## 2.1 Scree-plot and cumulative variance
```{r pca_screeplot}
screeplot(pca_fit, type = "lines", main = "Iris PCA – scree plot")
```
## 2.2 Quick biplot
```{r pca_biplot, warning=FALSE, message=FALSE}
# Requires ggrepel for repulsion, but works without it
biplot(pca_fit, repel_points = TRUE, repel_vars = TRUE)
```
(If you do not have ggrepel installed the text is placed without repulsion.)
# 3. What is a `bi_projector`?
Think bidirectional mapping:
```
data space (p variables) ↔ component space (d ≤ p)
new samples: project() ← scores
new variables: project_vars() ← loadings
reconstruction ↔ (scores %*% t(loadings))
```
A `bi_projector` therefore carries
| slot | shape | description |
|-----------|-------|----------------------------------------------------|
| `v` | p × d | component loadings (columns) |
| `s` | n × d | score matrix (rows = observations) |
| `sdev` | d | singular values (or SDs related to components) |
| `preproc` | – | fitted transformer so you never leak training stats |
Because `pca()` returns a `bi_projector`, you get other methods for free:
```{r biprojector_methods}
# rank-2 reconstruction of the iris data
Xhat2 <- reconstruct(pca_fit, comp = 1:2)
print(paste("MSE (rank 2):", round(mean((X_iris - Xhat2)^2), 4))) # MSE ~ 0.076
# drop to 2 PCs everywhere
pca2 <- truncate(pca_fit, 2)
shape(pca2) # 4 vars × 2 comps
```
# 4. Fast code-coverage cameo
The next chunk quietly touches a few more branches used in the unit tests
(`std_scores()`, `perm_test()`, `rotate()`), but keeps printing to a minimum:
```{r code_coverage}
# std scores
head(std_scores(svd_wrapper(X, ncomp = 3))) # Use the earlier X data
# tiny permutation test (10 perms; obviously too few for science)
# This requires perm_test.pca method
# Make sure X_iris is centered if perm_test needs centered data
perm_res <- perm_test(pca_fit, X_iris, nperm = 10, comps = 2)
print(perm_res$component_results)
# quick varimax rotation
if (requireNamespace("GPArotation", quietly = TRUE)) {
pca_rotated <- rotate(pca_fit, ncomp = 3, type = "varimax")
print(pca_rotated)
} else {
cat("GPArotation not installed, skipping rotation example.\n")
}
```
(Running these once in the vignette means they are also executed by `R CMD check`, bumping test-coverage without extra scaffolding.)
# 5. Take-aways
* `svd_wrapper()` gives you a unified front end to half-a-dozen SVD engines.
* `pca()` piggy-backs on that, returning a fully featured `bi_projector`.
* The `bi_projector` contract means the same verbs & plotting utilities
work for any decomposition you wrap into the framework later.
---
# Session info
```{r session_info_svd}
sessionInfo()
```