Most RNA-seq experiments use bulk RNA-seq methods. However, a close look at single-cell datasets shows that single-cell experiments can shed light on a variety of underlying biological processes. Here, I downloaded a publicly available PBMC dataset to show how this package can be used to run Seurat analyses with a minimal list of parameters. The package was created to arrange Seurat functionality into modules for easy preprocessing and cell clustering.
In addition, the modules can be used for batch correction, cell-type annotation, and transferring annotations from a reference dataset to the current dataset (using SingleR and celldex annotations).
The 10X PBMC data can be downloaded from: https://cf.10xgenomics.com/samples/cell/pbmc3k/pbmc3k_filtered_gene_bc_matrices.tar.gz
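For convenience, the download step can be scripted. The following is a minimal sketch in base R; the helper name `fetch_pbmc3k` and the destination directory are hypothetical and not part of this package.

```r
# Hypothetical helper (not part of this package): download and unpack
# the 10X PBMC 3k archive using base R only.
fetch_pbmc3k <- function(dest_dir = "pbmc3k_data") {
  url <- "https://cf.10xgenomics.com/samples/cell/pbmc3k/pbmc3k_filtered_gene_bc_matrices.tar.gz"
  dir.create(dest_dir, showWarnings = FALSE, recursive = TRUE)
  tarball <- file.path(dest_dir, basename(url))
  # mode = "wb" keeps the binary archive intact on Windows
  download.file(url, destfile = tarball, mode = "wb")
  untar(tarball, exdir = dest_dir)
  # Return the extraction directory so it can be passed on to a reader
  invisible(dest_dir)
}

# Usage (requires network access):
# data_dir <- fetch_pbmc3k()
```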
Using SeuratPreprocess and SeuratLowDim, the user can go directly from count data to identified clusters (visualized with t-SNE or UMAP).
## Normalizing layer: counts
## Finding variable features for layer counts
## Regressing out percent.mt
## Centering and scaling data matrix
## PC_ 1
## Positive: RPS27A, MALAT1, IL32, LDHB, LTB
## Negative: CST3, FTL, AIF1, TYROBP, FTH1
## PC_ 2
## Positive: RPS2, RPS5, RPL10A, LTB, LDHB
## Negative: NKG7, GZMB, PRF1, GZMA, CST7
## PC_ 3
## Positive: S100A4, IL7R, CD3E, TMSB4X, LDHB
## Negative: CD79A, MS4A1, HLA-DQA1, CD79B, HLA-DQB1
## PC_ 4
## Positive: FCGR3A, MS4A7, CDKN1C, CKB, RP11-290F20.3
## Negative: S100A8, LGALS2, CD14, MS4A6A, S100A9
## PC_ 5
## Positive: ACTG1, PTTG1, TPM4, GAPDH, STMN1
## Negative: MALAT1, SPON2, FGFBP2, FCGR3A, PLAC8
## Computing nearest neighbor graph
## Computing SNN
## Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck
##
## Number of nodes: 1471
## Number of edges: 56832
##
## Running Louvain algorithm...
## Maximum modularity in 10 random starts: 0.5831
## Number of communities: 15
## Elapsed time: 0 seconds
## Warning: The default method for RunUMAP has changed from calling Python UMAP via reticulate to the R-native UWOT using the cosine metric
## To use Python UMAP via reticulate, set umap.method to 'umap-learn' and metric to 'correlation'
## This message will be shown once per session
## 14:53:49 UMAP embedding parameters a = 0.2734 b = 1.622
## 14:53:49 Read 1471 rows and found 12 numeric columns
## 14:53:49 Using Annoy for neighbor search, n_neighbors = 30
## 14:53:49 Building Annoy index with metric = cosine, n_trees = 50
## 0% 10 20 30 40 50 60 70 80 90 100%
## [----|----|----|----|----|----|----|----|----|----|
## **************************************************|
## 14:53:49 Writing NN index file to temp file C:\Users\vishs\AppData\Local\Temp\RtmpQrtlR8\file14ec3cda3ed2
## 14:53:49 Searching Annoy index using 1 thread, search_k = 3000
## 14:53:49 Annoy recall = 100%
## 14:53:50 Commencing smooth kNN distance calibration using 1 thread with target n_neighbors = 30
## 14:53:51 Initializing from normalized Laplacian + noise (using RSpectra)
## 14:53:51 Commencing optimization for 500 epochs, with 59214 positive edges
## 14:53:55 Optimization finished
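The messages above match the standard Seurat workflow steps (normalization, variable feature selection, scaling with regression of percent.mt, PCA, neighbor graph, Louvain clustering, UMAP). As a rough sketch of what such a run looks like with plain Seurat calls, and not the actual internals of SeuratPreprocess or SeuratLowDim, one could write the following. The function name `run_basic_seurat` is hypothetical. `dims = 1:12` is taken from the "12 numeric columns" line in the log; all other parameters are Seurat defaults and are assumptions, so cell counts and cluster numbers will not necessarily match the output shown above.

```r
# Sketch only: plain Seurat calls, not this package's internal code.
# Functions are namespaced so the definition does not require library(Seurat).
run_basic_seurat <- function(data_dir, dims = 1:12) {
  counts <- Seurat::Read10X(data.dir = data_dir)
  obj <- Seurat::CreateSeuratObject(counts = counts, project = "pbmc3k")

  # Percentage of counts from mitochondrial genes, stored as "percent.mt"
  obj <- Seurat::PercentageFeatureSet(obj, pattern = "^MT-", col.name = "percent.mt")

  obj <- Seurat::NormalizeData(obj)
  obj <- Seurat::FindVariableFeatures(obj)
  obj <- Seurat::ScaleData(obj, vars.to.regress = "percent.mt")

  # Print 5 genes for the first 5 PCs, as in the log above
  obj <- Seurat::RunPCA(obj, ndims.print = 1:5, nfeatures.print = 5)

  obj <- Seurat::FindNeighbors(obj, dims = dims)
  obj <- Seurat::FindClusters(obj)   # default resolution (assumption)
  obj <- Seurat::RunUMAP(obj, dims = dims)
  obj
}

# Usage, assuming the archive was extracted in the working directory:
# pbmc <- run_basic_seurat("filtered_gene_bc_matrices/hg19")
```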
The output seurat_markers is a list containing both the full marker list and a high-confidence subset of it.
Markers are plotted by cluster, using a threshold of log2FC greater than 1 and the top 10 genes for each cluster.
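The selection rule described above (log2FC greater than 1, then the top 10 genes per cluster) can be illustrated in base R. This is a sketch on a toy table, not the package's code: the column names `cluster`, `gene`, and `avg_log2FC` follow the output of Seurat's `FindAllMarkers`, the helper `top_markers` is hypothetical, and the fold-change values are made up for illustration.

```r
# Toy marker table with made-up values; columns mimic FindAllMarkers output.
markers <- data.frame(
  cluster    = c("0", "0", "0", "1", "1"),
  gene       = c("LDHB", "IL32", "LTB", "CST3", "FTL"),
  avg_log2FC = c(1.8, 0.6, 1.2, 2.5, 0.9),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep markers above the threshold, then take the
# n_top genes with the largest log2FC within each cluster.
top_markers <- function(markers, min_log2fc = 1, n_top = 10) {
  kept <- markers[markers$avg_log2FC > min_log2fc, , drop = FALSE]
  kept <- kept[order(kept$cluster, -kept$avg_log2FC), , drop = FALSE]
  per_cluster <- split(kept, kept$cluster)
  out <- do.call(rbind, lapply(per_cluster, head, n = n_top))
  rownames(out) <- NULL
  out
}

top <- top_markers(markers)
print(top)
```

With real data, the `markers` argument would be the marker table stored in seurat_markers, assuming it has the same columns.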
seurat_reactome is a list with 3 items: list(gsva_result = gsva_result, pathway_expression = pathway_expression, max_difference = max_difference)
Expression of different pathways, e.g.,
Minimum, maximum, and differential expression of different pathways, e.g.,
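To make the max_difference item more concrete, here is a small base R sketch of how a per-pathway minimum, maximum, and difference across clusters can be derived from a pathway-by-cluster expression matrix. It assumes that max_difference is defined as maximum minus minimum per pathway; the helper `summarise_range`, the pathway names, and the values are all hypothetical, and the package's actual computation may differ.

```r
# Toy pathway-by-cluster matrix with made-up values.
pathway_expression <- matrix(
  c(0.10, 0.40, -0.20,
    0.30, 0.25,  0.35),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("Pathway A", "Pathway B"),
                  c("cluster_0", "cluster_1", "cluster_2"))
)

# Hypothetical helper: per-pathway min, max, and their difference,
# sorted so the most variable pathways come first.
summarise_range <- function(expr) {
  mins <- apply(expr, 1, min)
  maxs <- apply(expr, 1, max)
  out <- data.frame(
    name = rownames(expr),
    min  = mins,
    max  = maxs,
    diff = maxs - mins,
    row.names = NULL
  )
  out[order(-out$diff), ]
}

res <- summarise_range(pathway_expression)
print(res)
```

Pathways with the largest difference are the ones whose expression varies most between clusters, which makes them natural candidates for plotting.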