Demo with scanpy

We’ve found that by using anndata for R, interacting with other anndata-based Python packages becomes super easy!

WARNING: The outputs of this vignette are not rendered on CRAN due to package size limitations. Please check the Demo with scanpy vignette in the package documentation.

Set up

To use another Python package (e.g. scanpy), you need to make sure that it is installed in the same ephemeral Python environment that anndata uses. You can let reticulate handle this for you by using the py_require() function:

library(anndata)
library(reticulate)
py_require("scanpy")

TIP: Check out the vignette on setting up Python package environments with reticulate: https://rstudio.github.io/reticulate/articles/python_packages.html.

Download and load dataset

Let’s use a 10x dataset from the 10x genomics website. You can download it to an anndata object with scanpy as follows:

sc <- import("scanpy")

url <- "https://cf.10xgenomics.com/samples/cell-exp/6.0.0/SC3_v3_NextGem_DI_CellPlex_CSP_DTC_Sorted_30K_Squamous_Cell_Carcinoma/SC3_v3_NextGem_DI_CellPlex_CSP_DTC_Sorted_30K_Squamous_Cell_Carcinoma_count_sample_feature_bc_matrix.h5"

ad <- sc$read_10x_h5("dataset.h5", backup_url = url)

ad

Preprocessing dataset

The resuling dataset is a wrapper for the Python class but behaves very much like an R object:

ad[1:5, 3:5]
dim(ad)

But you can still call scanpy functions on it, for example to perform preprocessing.

sc$pp$filter_cells(ad, min_genes = 200)
sc$pp$filter_genes(ad, min_cells = 3)
sc$pp$normalize_per_cell(ad)
sc$pp$log1p(ad)

Analysing your dataset in R

You can seamlessly switch back to using your dataset with other R functions, for example by calculating the rowMeans of the expression matrix.

library(Matrix)
rowMeans(ad$X[1:10, ])