The development of cpp11armadillo emerges from the desire to follow a simplified approach towards R and C++ integration by building on top of cpp11, a ground up rewrite of C++ bindings to R with different design trade-offs and features. cpp11armadillo aims at providing an additional layer to put the end-user focus on the computation instead of configuration (Vaughan, Hester, and François 2023).
Armadillo is a linear algebra library for the C++ language, aiming towards a good balance between speed and ease of use. It is justified in the fact that C++, in its current form, is very valuable to address bottlenecks that we find with interpreted languages such as R and Python but it does not provide data structures nor functions for linear algebra (Sanderson and Curtin 2016).
RcppArmadillo was first published to CRAN in 2010, and it allows to use Armadillo via Rcpp, a widely extended R package to call C++ functions from R (Eddelbuettel and Sanderson 2014).
The design choices in cpp11armadillo are:
These ideas reflect a comprehensive effort to provide an efficient interface for integrating C++ and R that aligns with the Tidy philosophy (Wickham et al. 2019), addressing both the technical and community-driven aspects that influence software evolution.
These choices have advantages and disadvantages. A disadvantage is that cpp11armadillo will not convert data types automatically, the user must be explicit about data types, especially when passing data from R to C++ and then exporting the final computation back to R. An advantage is that cpp11armadillo codes, including its internal templates, can be adapted to work with Python via pybind11 (Jakob, Rhinelander, and Moldovan 2016).
cpp11armadillo uses Hansen (2022) notation, meaning that matrices are column-major and vectors are expressed as column vectors (i.e., \(N\times1\) matrices).
Convention: input R matrices are denoted by x
, y
, z
, and output or intermediate C++ matrices are denoted by X
, Y
, Z
. The example functions can be called from R scripts and should have proper headers as in the following code:
#include <armadillo.hpp>
#include <cpp11.hpp>
#include <cpp11armadillo.hpp>
using namespace arma;
using namespace cpp11;
cpp11::register]] // allows using the function in R
[[
doubles_matrix<> solve_mat(doubles_matrix<> x) {double> Y = as_Mat(x); // convert from R to C++
Mat<double> Yinv = inv(Y); // Y^(-1)
Mat<return as_doubles_matrix(Yinv); // convert from C++ to R
}
This example includes the Armadillo, cpp11 and cpp11armadillo libraries, and allows interfacing C++ with R (i.e., the #include <cpp11.hpp>
). It also loads the corresponding namespaces (i.e., the using namespace cpp11
) in order to simplify the notation (i.e., using Mat
instead of arma::Mat
).
The as_Mat()
function is provided by cpp11armadillo to pass a matrix
object from R to C++ and that Armadillo can read.
The as_doubles_matrix()
function is also provided by cpp11armadillo to pass a Mat<double>
or Mat<int>
object from C++ to R.
Given a design matrix \(X\) and and outcome vector \(y\), one function to obtain the OLS estimator \(\hat{\beta} = (X^tX)^{-1}(X^tY)\) as a matrix (i.e., column vector) is:
const doubles_matrix<>& y, const doubles_matrix<>& x) {
doubles_matrix<> ols_mat(double> Y = as_Mat(y);
Mat<double> X = as_Mat(x);
Mat<
double> XtX = X.t() * X; // X'X
Mat<double> XtX_inv = inv(XtX); // (X'X)^(-1)
Mat<double> beta = XtX_inv * X.t() * Y; // (X'X)^(-1)(X'Y)
Mat<
return as_doubles_matrix(beta);
}
// or
const doubles& y, const doubles_matrix<>& x) {
doubles ols_dbl(// Y either as Mat or Col
double> Y = as_Mat(y);
Mat<// Col<double> Y = as_Col(y);
double> X = as_Mat(x);
Mat<
double> XtX = X.t() * X;
Mat<double> XtX_inv = inv(XtX);
Mat<double> beta = XtX_inv * X.t() * Y;
Mat<
return as_doubles(beta);
}
The ols_mat()
function receives inputs from R, does the computation on C++ side. The use of const
and &
are specific to the C++ language and allow to pass data from R to C++ while avoiding copying the data, therefore saving time and memory.
Another option is to express the same computation as a vector:
const doubles_matrix<>& y, const doubles_matrix<>& x) {
doubles ols_dbl(double> Y = as_Mat(y);
Mat<double> X = as_Mat(x);
Mat<
double> XtX = X.t() * X;
Mat<double> XtX_inv = inv(XtX);
Mat<double> beta = XtX_inv * X.t() * Y;
Mat<
return as_doubles(beta);
}
Given a matrix \(X\), an eigenvalue of \(X\) is any number \(t\) such that \(Xv = tv\), where \(v\) is a non-zero vector.
Two functions to obtain the eigenvalues of a symmetric matrix are:
const doubles_matrix<>& x) {
doubles_matrix<> eigen_sym_mat(double> X = as_Mat(x);
Mat<double> y = eig_sym(X);
Mat<return as_doubles_matrix(y);
}
const doubles_matrix<>& x) {
doubles eigen_sym_dbl(double> X = as_Mat(x);
Mat<double> y = eig_sym(X);
Mat<return as_doubles(y);
}
When the input matrix is not symmetric, the eigenvalues are complex numbers. cpp11armadillo provides a wrapper to return the real and imaginary parts of the eigenvalues as a list:
const doubles_matrix<>& x) {
list eigen_gen_mat(double> X = as_Mat(x);
Mat<double>> y = eig_gen(X);
Mat<complex<
list out = as_complex_matrix(y);return out;
}
The idea of cpp11armadillo is to integrate with cpp11 to provide flexibility, and the previous result can be obtained without using the as_complex_matrix()
wrapper:
const doubles_matrix<>& x) {
list eigen_gen_mat(double> X = as_Mat(x);
Mat<
double>> y = eig_gen(X);
Mat<complex<
double> y_real = real(y);
Mat<double> y_imag = imag(y);
Mat<
writable::list out;"real"_nm = as_doubles_matrix(y_real)});
out.push_back({"imag"_nm = as_doubles_matrix(y_imag)});
out.push_back({
return out;
}
As with OLS, in this case it is possible to return a list of vectors instead of a matrices:
cpp11::register]]
[[const doubles_matrix<>& x) {
list eigen_gen_dbl(double> X = as_Mat(x);
Mat<double>> y = eig_gen(X);
Col<complex<
list out = as_complex_doubles(y);return out;
}
One function to decompose a matrix \(X\) into an upper (lower) triangular matrix \(R\) such that \(X = R^tR\) (\(X = RR^t\)) is:
const doubles_matrix<>& x, std::string type) {
doubles_matrix<> chol_mat(double> X = as_Mat(x);
Mat<
double> res;
Mat<
if (type == "upper") {
res = chol(X);else if (type == "lower") {
} "lower");
res = chol(X, else {
} "Invalid type");
stop(
}
return as_doubles_matrix(res);
}
This function uses the second argument to avoid creating a new function for the lower triangular decomposition.
One function to decompose a matrix \(X\) into a product \(X=QR\) where \(Q\) is an orthogonal matrix and \(R\) is an upper triangular matrix is:
const doubles_matrix<>& x, bool econ) {
list qr_mat(double> X = as_Mat(x);
Mat<
double> Q;
Mat<double> R;
Mat<
bool computable;
if (!econ) {
computable = qr(Q, R, X);else {
}
computable = qr_econ(Q, R, X);
}
if (!computable) {
"QR decomposition failed");
stop(else {
}
writable::list out;"Q"_nm = as_doubles_matrix(Q)});
out.push_back({"R"_nm = as_doubles_matrix(R)});
out.push_back({return out;
} }
In this case, the function returns a list of matrices after using the second argument to use two different Armadillo implementations of the QR decomposition.
Suppose there are two risky stocks, Pepsi and Coca-Cola, with returns for the last three days given by a matrix \(R\) in a market with a risk-free bond with a return \(f=2\%\) and a risky market portfolio with returns for the last three days given by a vector \(m\). The CAPM equation is given by \(R_i = f + \beta_i(E(m) - f)\), where \(\beta_i = \frac{\text{cov}(R_i, m)}{\text{var}(m)}\) (Bodie, Kane, and Marcus 2010).
The simulated returns are given by:
set.seed(200100)
0.02
f <- matrix(rnorm(3, 0, 0.1), nrow = 3, ncol = 1)
m <- matrix(rnorm(6, 0, 0.1), nrow = 3, ncol = 2) r <-
One function to obtain the CAPM betas for the two stocks is:
const doubles_matrix<>& r, const doubles_matrix<>& m,
doubles_matrix<> capm(double f) {
double> R = as_Mat(r);
Mat<double> M = as_Mat(m);
Mat<double> F = ones<Mat<double>>(R.n_cols, 1) * f;
Mat<
// Market average return and CAPM betas
double> M_avg = ones<Mat<double>>(R.n_cols, 1) * as_scalar(mean(M, 0));
Mat<double> beta = cov(R, M) / as_scalar(var(M));
Mat<
// Expected returns (% = pairwise multiplication)
double> out = F + beta % (M_avg - F);
Mat<
return as_doubles_matrix(out);
}
The Leontief Input-Output model posits that the equilibrium production in an economy (i.e., country) with \(N\) industries (i.e., sectors) is given by \((I-A)^{-1}\), where \(I\) is the identity matrix and \(A\) contains the input requirement for each industry \(1\leq i \leq N\) from any other \(1 \leq j \leq N-1\) industry. \(A\) is determined by a transaction matrix \(X\) that reflects the final transactions between industries and an aggregate demand vector \(d\) (Leontief 1986).
We can use the data from the leontief package as data inputs for the example.
library(leontief)
transaction_matrix
X <- wage_demand_matrix[, "final_total_demand"] d <-
One function to obtain the Leontief inverse is:
const doubles_matrix<>& x,
doubles_matrix<> leontief_inverse(const doubles_matrix<>& d) {
double> X = as_Mat(x);
Mat<double> D = as_Mat(d);
Mat<
// Check dimensions
bool X_square = (X.n_rows == X.n_cols);
bool compatible_dimensions = (X.n_rows == D.n_rows);
if (X_square == false) {
"x must be square.");
stop(
}if (compatible_dimensions == false) {
"d must have the same number of elements as the number of rows in x.");
stop(
}
// Input requirement matrix
double> Y = zeros(D.n_rows, D.n_rows);
Mat<
Y.each_col() += D;double> A = X / Y.t();
Mat<
double> I = eye(X.n_rows, X.n_cols); // Identity matrix
Mat<
double> out = inv(I - A); // Leontief inverse
Mat<
return as_doubles_matrix(out);
}
A proper benchmark is to compute eigenvalues for large matrices. Both cpp11armadillo and RcppArmadillo use Armadillo as a backend, and the marginal observed differences are because of how cpp11 and Rcpp pass data from R to C++ and viceversa. The computation times are identical.
Input | Median time cpp11armadillo | Median time RcppArmadillo |
---|---|---|
500x500 | 35.07ms | 36.4ms |
1000x1000 | 260.28ms | 263.21ms |
1500x1500 | 874.62ms | 857.31ms |
2000x2000 | 2.21s | 2.21s |
Input | Memory allocation cpp11armadillo | Memory allocation RcppArmadillo |
---|---|---|
500x500 | 17.1KB | 4.62MB |
1000x1000 | 21KB | 4.62MB |
1500x1500 | 24.9KB | 4.63MB |
2000x2000 | 28.8KB | 4.63MB |
The cpp11armadillo computation was obtained with the eigen_sym_mat()
function already shown.
The RcppArmadillo computation was obtained with the following function:
#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]
using namespace Rcpp;
// [[Rcpp::export]]
const arma::mat& x) {
arma::mat eigen_sym_mat(
arma::mat y = eig_sym(x);return y;
}
In order to get the RcppArmadillo function to work, we had to dedicate time to search online about the error function 'enterRNGScope' not provided by package 'Rcpp'
, which required to load additional headers in our code on R side.
RcppArmadillo has been and will continue to be widely successful. cpp11armadillo is a alternative templated implementation with different design trade-offs and features. Both packages can co-exist and continue to enrich the R community.
Bodie, Z., A. Kane, and A. Marcus. 2010. Investments. McGraw-Hill Education.
Eddelbuettel, Dirk, and Conrad Sanderson. 2014. “RcppArmadillo: Accelerating R with High-Performance C++ Linear Algebra.” Computational Statistics & Data Analysis 71 (March): 1054–63. https://doi.org/10.1016/j.csda.2013.02.005.
Hansen, Bruce. 2022. Econometrics. Princeton University Press.
Jakob, Wenzel, Jason Rhinelander, and Dean Moldovan. 2016. Pybind11 — Seamless Operability Between C++11 and Python. https://github.com/pybind/pybind11.
Leontief, Wassily. 1986. Input-Output Economics. Oxford University Press.
Sanderson, Conrad, and Ryan Curtin. 2016. “Armadillo: A Template-Based C++ Library for Linear Algebra.” Journal of Open Source Software 1 (2): 26. https://doi.org/10.21105/joss.00026.
Vaughan, Davis, Jim Hester, and Romain François. 2023. Cpp11: A C++11 Interface for R’s c Interface. https://CRAN.R-project.org/package=cpp11.
Wickham, Hadley, Mara Averick, Jennifer Bryan, Winston Chang, Lucy D’Agostino McGowan, Romain François, Garrett Grolemund, et al. 2019. “Welcome to the Tidyverse.” Journal of Open Source Software 4 (43): 1686. https://doi.org/10.21105/joss.01686.