Software used to design guide RNA sequences for CRISPR/Cas9 genome editing
This package aims to provide scientifically pertinent information when designing guide RNA sequences for Cas9 genome editing. When provided with a target DNA sequence for editing, a genome to check for off-targets in, and a genome annotation file (.gtf), it outputs two separate data tables. The first table contains information on the generated sgRNA (sgRNA sequence, PAM, Direction, Start, End, GC content, Presence of Homopolymers, Regions of Self Complementarity, Efficiency Score (Doench 2016), Number of Potential Off-Target sequences, and Notes on the sgRNA). The second table contains information on the off-target sequences found (Original sgRNA Sequence, Chromosome, Start, End, Number of Mismatches, Direction, CFD Scores, Matched Sequence, Gene ID, Gene Name, Sequence Type, and Exon Number). This data can be generated through the command line or through crispRdesignR’s GUI. Additionally, a user may provide their own DNA libraries and genome annotation files when searching for off-targets.
For more information, please see our article in the Journal of Genomics: http://www.jgenomics.com/v08p0062.htm
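Two of the sgRNA table columns listed above, GC content and Presence of Homopolymers, are simple enough to illustrate in a few lines of base R. The sketch below is only meant to show what those columns describe. It is not taken from the crispRdesignR source, the helper names gc_content and has_homopolymer are hypothetical, and the run length of 4 used to flag a homopolymer is an assumed threshold.

```r
# Illustrative helpers only; not the crispRdesignR implementation.

# Fraction of G and C bases in a DNA string.
gc_content <- function(seq) {
  bases <- strsplit(toupper(seq), "")[[1]]
  sum(bases %in% c("G", "C")) / length(bases)
}

# TRUE if any single base repeats at least `min_run` times in a row.
# The default of 4 is an assumed threshold for this sketch.
has_homopolymer <- function(seq, min_run = 4) {
  runs <- rle(strsplit(toupper(seq), "")[[1]])
  any(runs$lengths >= min_run)
}

# A 20-base candidate taken from the start of the example sequence in this README.
guide <- "GGCAGAGCTTCGTATGTCGG"
print(gc_content(guide))
print(has_homopolymer(guide))
```

For real designs, rely on the values reported in the package's own sgRNA table rather than on these helpers.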
install.packages("crispRdesignR")
Example data is located in the /inst/ folder.
The DAK1.fasta and DAK1_short.txt files each contain a DNA sequence native to the DAK1 gene that can be copied and pasted into crispRdesignR or uploaded as a file (in the GUI version).
The “Saccharomyces_cerevisiae.R64-1-1.92.gtf.gz” file is the compressed genome annotation file for Saccharomyces cerevisiae. Both compressed and uncompressed .gtf files can be used.
Using the GUI version:
library(crispRdesignR)
crispRdesignRUI()
Click on the “Use FASTA or txt file as target sequence” button and choose the DAK1.fasta or DAK1_short.txt file, or copy and paste the sequence into the box.
Select the Saccharomyces cerevisiae genome.
Browse to choose the .gtf file Saccharomyces_cerevisiae.R64-1-1.92.gtf.gz.
Click on the Find sgRNA button.
Additional genome annotation files can be found here: https://useast.ensembl.org/info/data/ftp/index.html
Note: Even though it might be possible to select them in the GUI, genomes must be installed (with install.packages(BSgenome.yourgenome)) before they can be used in the Shiny app.
Using the command line version:
Load the package with library(crispRdesignR) and ensure that your genome of interest is installed. The following commands can be used to output the same data tables as in the GUI version. All data can be generated without the graphical interface by using a single function:
sgRNA_design(userseq, genomename, gtfname, userPAM, calloffs = TRUE, annotateoffs = TRUE)
userseq: The target sequence to generate sgRNA guides for. Can be either a character string containing DNA bases or the name of a fasta/text file in the working directory.
genomename: The name of a genome (in BSgenome format) to check for off-targets in. These genomes can be downloaded through BSgenome or compiled by the user.
gtfname: The name of a genome annotation file (.gtf) in the working directory to check off-target sequences against.
userPAM: An optional argument used to set a custom PAM for the sgRNA. If not set, the function will default to the “NGG” PAM. Warning: Doench efficiency scores are only accurate for the “NGG” PAM.
calloffs: If TRUE, the function will search for off-targets in the genome specified by the genomename argument. If FALSE, off-target calling will be skipped.
annotateoffs: If TRUE, the function will provide annotations for the off-targets called, using the genome annotation file specified by the gtfname argument. If FALSE, off-target annotation will be skipped.
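To make the role of the userPAM argument more concrete, here is a minimal base R sketch of how a PAM pattern such as “NGG” can be used to locate candidate guide sites. It is an illustration under stated assumptions, not the package's actual search: the function names pam_to_regex and find_forward_sites are hypothetical, the guide length of 20 is assumed, only a few IUPAC codes are handled, and only the forward strand is scanned (the package also reports a Direction for each sgRNA).

```r
# Illustrative sketch; not the crispRdesignR implementation.

# Translate an IUPAC PAM such as "NGG" into a regular expression.
# Only a small subset of IUPAC codes is covered here.
pam_to_regex <- function(pam) {
  iupac <- c(A = "A", C = "C", G = "G", T = "T",
             N = "[ACGT]", R = "[AG]", Y = "[CT]")
  chars <- strsplit(toupper(pam), "")[[1]]
  stopifnot(all(chars %in% names(iupac)))
  paste(iupac[chars], collapse = "")
}

# Find forward-strand sites: `guide_len` bases followed directly by the PAM.
find_forward_sites <- function(seq, pam = "NGG", guide_len = 20) {
  seq <- toupper(seq)
  # A zero-width lookahead lets overlapping sites be reported.
  pattern <- paste0("(?=[ACGT]{", guide_len, "}", pam_to_regex(pam), ")")
  starts <- as.integer(gregexpr(pattern, seq, perl = TRUE)[[1]])
  if (starts[1] == -1) {
    return(data.frame(start = integer(0), guide = character(0),
                      pam = character(0), stringsAsFactors = FALSE))
  }
  data.frame(
    start = starts,
    guide = substring(seq, starts, starts + guide_len - 1),
    pam   = substring(seq, starts + guide_len,
                      starts + guide_len + nchar(pam) - 1),
    stringsAsFactors = FALSE
  )
}

testseq <- "GGCAGAGCTTCGTATGTCGGCGATTCATCTCAAGTAGAAGATCCTGGTGCAGTAGG"
print(find_forward_sites(testseq))          # default "NGG" PAM
print(find_forward_sites(testseq, "NAG"))   # custom PAM
```

Changing the PAM changes which sites are found, which is why scores calibrated on “NGG” guides should be treated with caution for other PAMs.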
The data on the generated sgRNA sequences can be retrieved with:
getsgRNAdata(x)
x: The raw data generated by sgRNA_design()
The additional off-target data can be retrieved with:
getofftargetdata(x)
x: The raw data generated by sgRNA_design()
# Target sequence, genome, and annotation file for the example
testseq <- "GGCAGAGCTTCGTATGTCGGCGATTCATCTCAAGTAGAAGATCCTGGTGCAGTAGG"
usergenome <- BSgenome.Scerevisiae.UCSC.sacCer2::BSgenome.Scerevisiae.UCSC.sacCer2
gtfname <- "Saccharomyces_cerevisiae.R64-1-1.92.gtf.gz"
annotation_file <- system.file("example_data", gtfname, package = "crispRdesignR")
# Design the sgRNAs, then extract the two data tables
alldata <- sgRNA_design(testseq, usergenome, annotation_file)
sgRNAdata <- getsgRNAdata(alldata)
offtargetdata <- getofftargetdata(alldata)
exampledata <- sgRNA_design("DAK1.fasta", BSgenome.Scerevisiae.UCSC.sacCer2, "Saccharomyces_cerevisiae.R64-1-1.92.gtf.gz", "NAG", calloffs = TRUE, annotateoffs = FALSE)
Example data is located in the /inst/ folder. To use the DAK1.fasta file, place it in the working directory and refer to it by name in crispRdesignR. The DAK1_short.txt file contains a short DNA sequence that can be copied and pasted into crispRdesignR. Both sequences are native to the DAK1 gene in Saccharomyces cerevisiae. The “Saccharomyces_cerevisiae.R64-1-1.92.gtf.gz” file is a genome annotation file for Saccharomyces cerevisiae and must also be placed in the working directory (when using the command line version).
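One way to place the example files in the working directory is to copy them from the installed package. The sketch below assumes the files live in an "example_data" folder of the installed package, as suggested by the system.file() call in the example above. The helper name copy_example_file is hypothetical and is not part of crispRdesignR.

```r
# Sketch only: the "example_data" folder name is taken from the
# system.file() call earlier in this README; adjust it if your
# installation stores the files elsewhere.
copy_example_file <- function(filename, dest_dir = getwd()) {
  src <- system.file("example_data", filename, package = "crispRdesignR")
  if (!nzchar(src)) {
    # system.file() returns "" when the package or file cannot be found.
    message("Could not find ", filename, " in an installed crispRdesignR package")
    return(FALSE)
  }
  # Returns TRUE on success; an existing file is left untouched.
  file.copy(src, file.path(dest_dir, filename), overwrite = FALSE)
}

copy_example_file("DAK1.fasta")
copy_example_file("Saccharomyces_cerevisiae.R64-1-1.92.gtf.gz")
```

After both calls succeed, the file names can be passed directly to sgRNA_design() as in the command line example.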