library(cinaR)
data("atac_seq_consensus_bm")
Bed formatted consensus matrix (chr, start, end and samples)
dim(bed)
## [1] 1000 25
# bed formatted file
head(bed[,1:4])
## Chr Start End B6-18mo-M-BM-47-GT18-01783
## 52834 chr5 24841478 24845196 1592
## 29780 chr17 8162955 8164380 109
## 67290 chr8 40577584 40578029 72
## 51295 chr4 145277698 145278483 110
## 4267 chr1 180808752 180815472 2452
## 45102 chr3 88732151 88732652 49
Create the contrasts you want to compare, here we create contrasts for 22 mice samples from different strains.
# create contrast vector which will be compared.
<- c("B6", "B6", "B6", "B6", "B6", "NZO", "NZO", "NZO", "NZO", "NZO", "NZO",
contrasts"B6", "B6", "B6", "B6", "B6", "NZO", "NZO", "NZO", "NZO", "NZO", "NZO")
cinaR
function directly computes the differentially accessible peaks.
# If reference genome is not set hg38 will be used!
<- cinaR(bed, contrasts, reference.genome = "mm10") results
## >> preparing features information... 2022-05-17 10:45:12
## >> identifying nearest features... 2022-05-17 10:45:12
## >> calculating distance from peak to TSS... 2022-05-17 10:45:13
## >> assigning genomic annotation... 2022-05-17 10:45:13
## >> assigning chromosome lengths 2022-05-17 10:45:27
## >> done... 2022-05-17 10:45:27
Now, you can access differential accessibility (DA) and enrichment results.
names(results)
## [1] "DA.results" "Enrichment.Results"
Inside DA.results
, you have the consensus peaks (cp) and differentially accessible (DA) peaks. If batch correction was run, then cp
will be a batch-corrected consensus matrix, otherwise it is the filtered and normalized version of original consensus peaks you provided.
names(results$DA.results)
## [1] "cp" "DA.peaks"
There are many information cinaR
provides such as adjusted p value, log fold-changes, gene names etc for each peak:
colnames(results$DA.results$DA.peaks$B6_NZO)
## [1] "Row.names" "seqnames" "start" "end"
## [5] "width" "strand" "annotation" "geneChr"
## [9] "geneStart" "geneEnd" "geneLength" "geneStrand"
## [13] "geneId" "transcriptId" "distanceToTSS" "gene_name"
## [17] "logFC" "FDR"
Here is an overview of those DA peaks:
head(results$DA.results$DA.peaks$B6_NZO[,1:5])
## Row.names seqnames start end width
## 1 chr10_105840598_105842176 chr10 105840598 105842176 1579
## 2 chr10_59950325_59952673 chr10 59950325 59952673 2349
## 3 chr10_63176490_63176839 chr10 63176490 63176839 350
## 4 chr10_77220928_77221910 chr10 77220928 77221910 983
## 5 chr10_79751429_79751786 chr10 79751429 79751786 358
## 6 chr10_86021157_86023861 chr10 86021157 86023861 2705
Since the comparison is
B6_NZO
, if fold-changes are positive it means they are more accesible in B6 compared to NZO and vice versa for negative values!
and here is a little overview for enrichment analyses results:
head(results$Enrichment.Results$B6_NZO[,c("module.name","overlapping.genes", "adj.p")])
## module.name overlapping.genes adj.p
## 1 Myeloid lineage 1 TFEB,FBXL5,PLXNC1,GM2A,AGTPBP1,CTSB 0.05914491
## 2 U_metabolism/replication SLC2A6,GM2A,CTSB,PECAM1 0.05914491
## 3 U_mitochondrial proteins PIK3R1,PAQR3,UBE3A,MAP4K4,PTPRC 0.32816305
## 4 U_proteasome/ubiquitin cx PIK3R1,IREB2,PTPRC 0.39112517
## 5 U_Immunity/cytoskeleton RPS6,RPS19 0.66488512
## 6 Myeloid lineage 2 RNF157,MTUS1 0.66488512
You can easily get the PCA plots of the samples:
pca_plot(results, contrasts, show.names = F)
You can overlay different information onto PCA plots as well!
# Overlaid information
<- c("B6-18mo", "B6-18mo", "B6-18mo", "B6-18mo", "B6-18mo",
overlaid.info "NZO-18mo", "NZO-18mo", "NZO-18mo", "NZO-18mo", "NZO-18mo", "NZO-18mo",
"B6-3mo", "B6-3mo", "B6-3mo", "B6-3mo", "B6-3mo",
"NZO-3mo", "NZO-3mo", "NZO-3mo", "NZO-3mo", "NZO-3mo", "NZO-3mo")
# Sample IDs
<- c("S01783", "S01780", "S01781", "S01778", "S01779",
sample.names "S03804", "S03805", "S03806", "S03807", "S03808",
"S03809", "S04678", "S04679", "S04680", "S04681",
"S04682", "S10918", "S10916", "S10919", "S10921",
"S10917", "S10920")
pca_plot(results, overlaid.info, sample.names)
## Warning: ggrepel: 7 unlabeled data points (too many overlaps). Consider
## increasing max.overlaps
You can see the available comparisons using:
show_comparisons(results)
## [1] "B6_NZO"
Then, plot the differential peaks for a selected contrast using:
heatmap_differential(results, comparison = "B6_NZO")