https://doi.org/10.32614/CRAN.package.bigalgebra
bigalgebra provides fast linear algebra primitives that operate seamlessly on base matrix objects and bigmemory::big.matrix containers. The package wraps BLAS and LAPACK routines with R-friendly helpers so that vector updates, matrix products, and classic decompositions work the same way in memory or on disk.
* dset(), dsub() and ddot() extend familiar vector algebra to big.matrix inputs. See the Level 1 BLAS-Style Helpers vignette.
* dgemm() and dsymm() expose Level 3 BLAS routines for dense matrix multiplication with optional file-backed outputs. Explore the Matrix Wrapper Helpers vignette.
* dgeqrf(), dpotrf(), dgeev() and dgesdd() bring advanced factorisations to large datasets. Walk through the LAPACK Decompositions vignette.

The package defines a number of global options that begin with bigalgebra:
* bigalgebra.temp_pattern with default matrix_
* bigalgebra.tempdir with default tempdir
* bigalgebra.mixed_arithmetic_returns_R_matrix with default TRUE
* bigalgebra.DEBUG with default FALSE
The bigalgebra.tempdir option must be a function that returns the path of a temporary directory used to store big matrix results of BLAS and LAPACK operations. The default value is the base R tempdir() function.
The bigalgebra.temp_pattern option is the prefix used for the file names of big matrix objects generated as results of BLAS and LAPACK operations.
The bigalgebra.mixed_arithmetic_returns_R_matrix option determines whether arithmetic operations that combine an R matrix or vector with a big.matrix return a big matrix (when the option is FALSE) or a normal R matrix (when it is TRUE).
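As a rough sketch of how these options might be set for a session, the snippet below uses only base R's options() and getOption(). The prefix "scratch_" and the directory name "bigalgebra-scratch" are hypothetical values chosen for illustration, not package defaults.

```r
# Hypothetical scratch directory inside the session temp dir (illustrative name).
scratch <- file.path(tempdir(), "bigalgebra-scratch")
dir.create(scratch, showWarnings = FALSE)

# options() returns the previous values, so they can be restored later.
old <- options(
  bigalgebra.temp_pattern = "scratch_",                  # hypothetical prefix
  bigalgebra.tempdir = function() scratch,               # must be a function
  bigalgebra.mixed_arithmetic_returns_R_matrix = FALSE   # ask for big matrix results
)

# Read the settings back; the tempdir option is called to obtain the path.
pattern <- getOption("bigalgebra.temp_pattern")
dir_fn <- getOption("bigalgebra.tempdir")
pattern
dir_fn()

# Restore whatever was set before.
options(old)
```

Saving the return value of options() and restoring it afterwards keeps the change local to one block of work.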
The package is built, by default, with R’s native BLAS libraries, which use 32-bit signed integer indexing. The default build is limited to vectors of at most 2^31 − 1 entries and matrices with at most 2^31 − 1 rows and 2^31 − 1 columns (note that standard R matrices are limited to 2^31 − 1 total entries).
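To make the limit concrete, the base R sketch below shows where 32-bit signed indexing runs out. It does not call bigalgebra, and the 50000 x 50000 shape is only an illustrative size.

```r
# Largest value of a 32-bit signed integer, as reported by R itself.
int_max <- .Machine$integer.max
int_max == 2^31 - 1            # TRUE

# A 50000 x 50000 matrix (illustrative size) stays within the row and column
# limits, but its total entry count exceeds 2^31 - 1.
n_side <- 50000
n_entries <- n_side * n_side   # computed in double precision
n_entries > int_max            # TRUE

# Integer arithmetic cannot represent the next value and yields NA.
overflow <- suppressWarnings(int_max + 1L)
is.na(overflow)                # TRUE
```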
The package includes a reference BLAS implementation that supports 64-bit integer indexing, relaxing the limits on vector length and on matrix row and column counts. To install the package with the 64-bit reference BLAS implementation, run the following from the command line:
REFBLAS=1 R CMD INSTALL bigalgebra
where bigalgebra is the source package (for example, bigalgebra_0.9.0.tar.gz).
The package may also be built with user-supplied external BLAS and LAPACK libraries, in either 32- or 64-bit varieties. This is an advanced topic that requires additional Makevars modification, and may include adjustment of the low-level calling syntax depending on the library used.
Feel free to contact us for help installing and running the package.
This website, the unit tests, some C code fixes and improvements as well as these examples were created by F. Bertrand.
Maintainer: Frédéric Bertrand frederic.bertrand@lecnam.net.
You can install the released version of bigalgebra from CRAN with:
install.packages("bigalgebra")
You can install the development version of bigalgebra from GitHub with:
devtools::install_github("fbertran/bigalgebra")
The snippets below mirror the worked examples in the vignettes and show how the helpers behave with in-memory and file-backed matrices.
These helpers cover vector updates, reductions, and element-wise transforms such as the in-place square root provided by dsqrt().
library(bigmemory)
library(bigalgebra)
x <- bigmemory::big.matrix(5, 1, init = 0)
dset(ALPHA = 9, X = x)
dsqrt(X = x)
x[]
#> [1] 3 3 3 3 3
y <- bigmemory::big.matrix(5, 1, init = 1)
dvcal(ALPHA = 0.5, X = x, BETA = 2, Y = y)
y[]
#> [1] 3.5 3.5 3.5 3.5 3.5
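The printed values are consistent with dvcal() computing ALPHA * X + BETA * Y and storing the result in Y. That reading is inferred from the output above rather than quoted from the package documentation. A base R sketch of the same arithmetic on ordinary vectors (the names x_ref and y_ref are illustrative):

```r
# Plain R vectors standing in for the big.matrix columns above.
x_ref <- sqrt(rep(9, 5))           # fill with 9, then take the square root
y_ref <- rep(1, 5)

# Assumed update rule: Y <- ALPHA * X + BETA * Y with ALPHA = 0.5, BETA = 2.
y_ref <- 0.5 * x_ref + 2 * y_ref
y_ref
```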
dgemm()
A <- bigmemory::big.matrix(5, 4, init = 1)
B <- bigmemory::big.matrix(4, 4, init = 2)
C <- bigmemory::big.matrix(5, 4, init = 0)

dgemm(A = A, B = B, C = C, ALPHA = 1, BETA = 0)
C[]
#>      [,1] [,2] [,3] [,4]
#> [1,]    8    8    8    8
#> [2,]    8    8    8    8
#> [3,]    8    8    8    8
#> [4,]    8    8    8    8
#> [5,]    8    8    8    8
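Each entry is 8 because the product sums four terms of 1 * 2. The base R sketch below reproduces the standard BLAS GEMM update, C <- ALPHA * A %*% B + BETA * C, on ordinary matrices as a cross-check; the names A_ref, B_ref and C_ref are illustrative and the snippet does not call bigalgebra.

```r
# Ordinary R matrices with the same shapes and fill values as above.
A_ref <- matrix(1, nrow = 5, ncol = 4)
B_ref <- matrix(2, nrow = 4, ncol = 4)
C_ref <- matrix(0, nrow = 5, ncol = 4)

# GEMM update with ALPHA = 1 and BETA = 0.
C_ref <- 1 * (A_ref %*% B_ref) + 0 * C_ref
C_ref
```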
set.seed(1)
M <- matrix(rnorm(9), 3)
SPD <- crossprod(M)
SPD_big <- as.big.matrix(SPD)
dpotrf(A = SPD_big)
#> [1] 0
chol_factor <- SPD_big[,]
chol_factor[lower.tri(chol_factor)] <- 0
chol_factor
#>          [,1]       [,2]       [,3]
#> [1,] 1.060398 -0.2388263 -0.6138286
#> [2,] 0.000000  1.8082109  0.2222424
#> [3,] 0.000000  0.0000000  0.8294922
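One way to sanity-check a factor like this is to compare it with base R's chol(), which also returns an upper-triangular R satisfying t(R) %*% R = SPD. The sketch below rebuilds the same input with base R only; R_ref is an illustrative name, and the snippet assumes the upper triangle extracted above plays the same role as R.

```r
# Rebuild the same symmetric positive definite input.
set.seed(1)
M <- matrix(rnorm(9), 3)
SPD <- crossprod(M)

# Base R Cholesky factor (upper triangular).
R_ref <- chol(SPD)

# The factor should reproduce the input: t(R_ref) %*% R_ref equals SPD.
all.equal(crossprod(R_ref), SPD)
```

R_ref can then be compared with the upper triangle taken from the big.matrix result.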
big.matrix workflows
tmpdir <- tempdir()
file_big <- filebacked.big.matrix(3, 3, init = diag(3),
                                  backingpath = tmpdir,
                                  backingfile = "example.bin")
#> Warning in filebacked.big.matrix(3, 3, init = diag(3), backingpath = tmpdir, : No
#> descriptor file given, it will be named example.bin.desc
file_big[1, 3] <- 5
file_big[]
#>      [,1] [,2] [,3]
#> [1,]    1    1    5
#> [2,]    1    1    1
#> [3,]    1    1    1
rm(file_big)
gc()
#>           used (Mb) gc trigger (Mb) limit (Mb) max used (Mb)
#> Ncells  900616 48.1    1699095 90.8         NA  1377768 73.6
#> Vcells 2253326 17.2    8388608 64.0      65536  3247956 24.8
The full vignette set expands on the topics above and demonstrates how the routines interact: