---
title: 'Multiblock basics: one projector, many tables'
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Multiblock basics: one projector, many tables}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
params:
  family: red
css: albers.css
resource_files:
  - albers.css
  - albers.js
includes:
  in_header: |-
---
```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 7,
  fig.height = 4
)
library(dplyr)
library(multivarious)
# The multiblock functions used below are assumed to be available from
# multivarious (or from a development build loaded with devtools::load_all())
```
# 1. Why multiblock?
Many studies collect several tables on the same samples – e.g.
transcriptomics + metabolomics, or multiple sensor blocks.
Most single-table reductions (PCA, ICA, NMF, …) ignore that structure.
`multiblock_projector` is a thin wrapper that keeps track of which
original columns belong to which block, so you can:
* drop in any existing decomposition (PCA, SVD, NMF, …)
* still know "these five loadings belong to block A, those three to block B"
* project or reconstruct per block without manual index bookkeeping.
We demonstrate with a minimal two-block toy data set.
```{r data_multiblock}
set.seed(1)
n <- 100
pA <- 7; pB <- 5 # two blocks, different widths
XA <- matrix(rnorm(n * pA), n, pA)
XB <- matrix(rnorm(n * pB), n, pB)
X <- cbind(XA, XB) # global data matrix
blk_idx <- list(A = 1:pA, B = (pA + 1):(pA + pB)) # named list: global column indices per block
```
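For more than a couple of blocks, the index list can be derived from the block widths instead of being typed by hand. The following is a minimal base-R sketch, assuming the blocks are bound column-wise in the order given; `make_block_indices()` is a hypothetical helper for illustration, not a function of the package.

```{r sketch_block_indices}
# Hypothetical helper (not part of multivarious): convert a named vector of
# block widths into a named list of global column indices.
make_block_indices <- function(widths) {
  ends   <- cumsum(widths)          # last global column of each block
  starts <- ends - widths + 1       # first global column of each block
  idx <- Map(function(s, w) s + seq_len(w) - 1, starts, widths)
  names(idx) <- names(widths)
  idx
}

demo_idx <- make_block_indices(c(A = 7, B = 5))
demo_idx
```

The result has the same shape as `blk_idx` above: one integer-valued vector per block, covering columns 1 to 12 without gaps or overlaps.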
# 2. Wrap a single PCA as a multiblock projector
```{r build_multiblock}
# 2-component centred PCA (using base SVD for brevity)
preproc_fitted <- fit(center(), X)
Xc <- transform(preproc_fitted, X) # Centered data
svd_res <- svd(Xc, nu = 0, nv = 2) # only V (loadings)
mb <- multiblock_projector(
  v = svd_res$v,               # p × k loadings
  preproc = preproc_fitted,    # remembers centering
  block_indices = blk_idx
)
print(mb)
```
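The SVD shortcut used here gives the same loadings as a standard PCA routine. As a quick check in base R only (independent of the package, with demo variable names chosen for this sketch), the right singular vectors of the centred matrix can be compared with `prcomp()`; the columns may differ in sign, so absolute values are compared.

```{r sketch_svd_vs_prcomp}
set.seed(42)
X_demo  <- matrix(rnorm(50 * 6), 50, 6)
Xc_demo <- scale(X_demo, center = TRUE, scale = FALSE)  # centre columns, no scaling

V_svd <- svd(Xc_demo, nu = 0, nv = 2)$v                 # first two right singular vectors
V_pca <- prcomp(X_demo, center = TRUE, scale. = FALSE)$rotation[, 1:2]

# Largest absolute difference between the two loading matrices, ignoring signs
max_diff <- max(abs(abs(V_svd) - abs(unname(V_pca))))
max_diff
```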
## 2.1 Project the whole data
```{r project_multiblock_all}
scores_all <- project(mb, X) # n × 2
head(round(scores_all, 3))
```
## 2.2 Project one block only
```{r project_multiblock_block}
# Project using only data from block A (requires original columns)
scores_A <- project_block(mb, XA, block = 1)
# Project using only data from block B
scores_B <- project_block(mb, XB, block = 2)
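cor(scores_all[, 1], scores_A[, 1]) # agreement between full-data and block-A-only scores
```
Projecting block A alone uses only that block's rows of the global loading
matrix, so the result is an estimate of the full-data scores rather than a
copy of them. The estimate is close when block A carries most of a
component's signal and degrades when the component depends mainly on other
blocks, as can happen with independent noise blocks like this toy set. It is
still useful when a block is missing at prediction time.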
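How close block-only scores come to the full-data scores can be explored in base R. The sketch below estimates scores from one block by least squares on that block's loadings. Whether `project_block()` uses exactly this formula internally is an assumption; `ls_block_scores()` and all `*_demo` names are hypothetical and exist only for this illustration. The data are treated as already centred.

```{r sketch_block_ls}
set.seed(7)
n_demo <- 60; k_demo <- 2
idx_A <- 1:4; idx_B <- 5:9                                  # two hypothetical blocks
V_demo <- qr.Q(qr(matrix(rnorm(9 * k_demo), 9, k_demo)))    # orthonormal loadings, 9 x 2
S_demo <- matrix(rnorm(n_demo * k_demo), n_demo, k_demo)    # latent scores, n x 2

X_lowrank <- S_demo %*% t(V_demo)                           # exactly rank-2 data, no noise
X_noisy   <- X_lowrank + matrix(rnorm(n_demo * 9), n_demo, 9)

# Least-squares scores from a single block: solve X_block ~ S %*% t(V_block) for S
ls_block_scores <- function(X_block, V_block) {
  X_block %*% V_block %*% solve(crossprod(V_block))
}

V_A <- V_demo[idx_A, , drop = FALSE]
S_hat_exact <- ls_block_scores(X_lowrank[, idx_A], V_A)
S_hat_noisy <- ls_block_scores(X_noisy[, idx_A], V_A)

max(abs(S_hat_exact - S_demo))       # numerically zero: noise-free block A determines the scores
cor(S_hat_noisy[, 1], S_demo[, 1])   # below 1 once noise enters
```

In the noise-free case block A alone recovers the latent scores; with noise the block-only scores are an approximation whose quality depends on how much of each component the block carries.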
## 2.3 Partial feature projection
Need to use just three variables from block B?
```{r project_multiblock_partial}
# Get the global indices for the first 3 columns of block B
sel_cols_global <- blk_idx[["B"]][1:3]
# Extract the corresponding data columns from the full matrix or block B
part_XB_data <- X[, sel_cols_global, drop = FALSE] # Data must match global indices
scores_part <- partial_project(mb, part_XB_data,
                               colind = sel_cols_global) # Use global indices
head(round(scores_part, 3))
```
# 3. Adding scores → multiblock_biprojector
If you also keep the sample scores from the original fit, you get two-way functionality:
reconstruct data, measure reconstruction error, run permutation tests, and so on.
It takes only two extra arguments, `s` and `sdev`, when creating the object:
```{r build_biprojector}
bi <- multiblock_biprojector(
  v = svd_res$v,
  s = Xc %*% svd_res$v,                  # scores: centred data times loadings
  sdev = svd_res$d[1:2] / sqrt(n - 1),   # singular values / sqrt(n - 1) = score standard deviations
  preproc = preproc_fitted,
  block_indices = blk_idx
)
print(bi)
```
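The two quantities that the biprojector stores in addition to the loadings can be checked by hand. This base-R sketch (independent of the package, with `*_rec` names chosen only for this illustration) confirms that singular values divided by `sqrt(n - 1)` equal the standard deviations of the score columns, and shows what a rank-2 reconstruction from scores and loadings looks like.

```{r sketch_reconstruct}
set.seed(3)
n_rec  <- 80
X_rec  <- matrix(rnorm(n_rec * 6), n_rec, 6)
mu_rec <- colMeans(X_rec)
Xc_rec <- sweep(X_rec, 2, mu_rec)               # centre columns

sv    <- svd(Xc_rec, nu = 0, nv = 2)
S_rec <- Xc_rec %*% sv$v                        # scores, n x 2

# Singular values / sqrt(n - 1) versus the sample sd of each score column
sdev_from_d      <- sv$d[1:2] / sqrt(n_rec - 1)
sdev_from_scores <- apply(S_rec, 2, sd)
cbind(sdev_from_d, sdev_from_scores)

# Rank-2 reconstruction: scores times transposed loadings, then undo the centring
X_hat   <- sweep(S_rec %*% t(sv$v), 2, mu_rec, "+")
mse_rec <- mean((X_rec - X_hat)^2)              # mean squared reconstruction error
mse_rec
```

The reconstruction error equals the variance left in the discarded components, that is, the sum of the squared remaining singular values divided by the number of matrix entries.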
Now you can, for instance, test whether the component-wise consensus
between blocks is stronger than expected by chance.
```{r perm_test_multiblock}
# Quick permutation test (use more permutations for real analyses)
# use_rspectra=FALSE needed for this 2-block example; larger problems can use TRUE
perm_res <- perm_test(bi, Xlist = list(A = XA, B = XB), nperm = 99, use_rspectra = FALSE)
print(perm_res$component_results)
```
The `perm_test` method for `multiblock_biprojector` uses an eigen-based score consensus
statistic to assess whether blocks share more variance than expected by chance.
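To make the idea of a score-consensus permutation test concrete, here is a small base-R sketch. It is not the statistic that `perm_test()` implements; the exact definition used by the package is not given here, so `consensus_stat()` is a hypothetical stand-in: the leading eigenvalue of the correlation matrix of first-component scores computed separately from each block. The null distribution comes from shuffling the rows of one block, which breaks the sample alignment between blocks.

```{r sketch_perm_consensus}
set.seed(11)
n_p  <- 100
z    <- rnorm(n_p)                                        # latent signal shared by both blocks
XA_p <- outer(z, rep(1, 4)) + matrix(rnorm(n_p * 4), n_p, 4)
XB_p <- outer(z, rep(1, 3)) + matrix(rnorm(n_p * 3), n_p, 3)

# Hypothetical consensus statistic (an assumption, not the package's definition)
consensus_stat <- function(blocks) {
  Xc <- scale(do.call(cbind, blocks), center = TRUE, scale = FALSE)
  v1 <- svd(Xc, nu = 0, nv = 1)$v[, 1]                    # first global loading vector
  widths <- vapply(blocks, ncol, integer(1))
  ends   <- cumsum(widths)
  starts <- ends - widths + 1
  # First-component scores computed from each block's own columns
  block_scores <- sapply(seq_along(blocks), function(b) {
    cols <- starts[b]:ends[b]
    drop(Xc[, cols, drop = FALSE] %*% v1[cols])
  })
  eigen(cor(block_scores), symmetric = TRUE, only.values = TRUE)$values[1]
}

obs <- consensus_stat(list(A = XA_p, B = XB_p))

n_perm <- 199
perm <- replicate(n_perm, {
  # Shuffle the rows of block B to destroy the alignment with block A
  consensus_stat(list(A = XA_p, B = XB_p[sample(n_p), , drop = FALSE]))
})

p_val <- (1 + sum(perm >= obs)) / (n_perm + 1)            # one-sided permutation p-value
c(observed = obs, p_value = p_val)
```

With two blocks the statistic reduces to one plus the absolute correlation between the two block-score vectors, so a strong shared signal pushes the observed value well above the permuted ones.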
# 4. Take-aways
* Any decomposition that delivers a loading matrix `v` (and
optionally scores `s`) can become multiblock-aware by supplying
`block_indices`.
* The wrapper introduces zero new maths – it only remembers the column
grouping and plugs into the common verbs:
| Verb | What it does in multiblock context |
|-----------------------|--------------------------------------------------------|
| `project()` | whole-matrix projection (uses preprocessing) |
| `project_block()` | scores based on one block's data |
| `partial_project()` | scores from an arbitrary subset of global columns |
| `coef(..., block=)` | retrieve loadings for a specific block |
| `perm_test()` | permutation test for block consensus (biprojector) |
This light infrastructure lets you prototype block-aware analyses
quickly, while still tapping into the rest of the `multivarious` toolkit
(cross-validation, reconstruction metrics, composition with
`compose_projector`, etc.).
```{r sessionInfo}
sessionInfo()
```