What does CRISPR stand for?
“CRISPR” (pronounced “crisper”) stands for Clustered Regularly Interspaced Short Palindromic Repeats, the hallmark of a bacterial defense system that forms the basis for the popular CRISPR-Cas9 genome editing technology.
The ability to precisely edit the genome of a living cell holds enormous potential to accelerate life science research, improve biotechnology, and even treat human disease. Methods for genome editing — primarily zinc finger nucleases and Transcription Activator-Like Effector (TALE) Nucleases — have existed for several years, but in 2013 these were quickly eclipsed by the efficiency, effectiveness and precision of the engineered CRISPR-Cas9 system that was first harnessed for mammalian genome editing by Feng Zhang of the Broad Institute and MIT.
The CRISPR system
Like zinc fingers and TALEs, CRISPR systems are natural products. However, CRISPR-Cas differs from zinc fingers and TALEs in one crucial aspect that makes it superior for genome editing applications: whereas zinc fingers and TALEs bind to DNA through a direct protein-DNA interaction, requiring the protein to be redesigned for each new target DNA site, CRISPR-Cas achieves target specificity through a small RNA that can easily be swapped for other RNAs targeting new sites.
In nature, CRISPR-Cas systems help bacteria defend against attacking viruses (known as bacteriophage or just phage). They consist of two components, the CRISPR (clustered regularly interspaced short palindromic repeats) array and Cas (CRISPR-associated) proteins. CRISPR sequences bookend short stretches of DNA that bacteria have copied from invading phages, preserving a memory of the viruses that have attacked them in the past. These sequences are then transcribed into short RNAs that guide Cas proteins to matching viral sequences. The Cas proteins destroy the matching viral DNA by cutting it. There are a number of different types of CRISPR-Cas systems in nature, which vary in their components; the CRISPR-Cas9 system uses just a single protein, Cas9, to find and destroy target DNA. In 2015, Zhang and colleagues successfully harnessed a second system, called CRISPR-Cpf1, which has the potential for even simpler and more precise genome engineering.
Engineering the CRISPR toolbox
In early 2011, Feng Zhang was just starting his own research group at the Broad Institute and MIT, where he is an investigator at the McGovern Institute for Brain Research and a faculty member in the departments of Brain and Cognitive Sciences and Biological Engineering. After learning about existing CRISPR research at a scientific meeting at the Broad, he quickly realized that the system, with a single RNA-guided protein, could be a game changer in genome editing technology. He was already working on DNA targeting methods, having helped to develop the TALE system as a Junior Fellow at Harvard. This system could target and activate genes in mammalian genomes.
Zhang and his team focused on harnessing CRISPR-Cas9 for use in human cells. In January 2013, he reported the first successful demonstration of Cas9-based genome editing in human cells in what has become the most-cited CRISPR paper (Cong et al., Science, 2013). Researchers from George Church’s lab at Harvard University reported similar findings in the same issue of Science (Mali et al., Science, 2013). The Zhang and Church papers showed that Cas9 could be targeted to a specific location in the human genome and cut the DNA there. The cut DNA was then repaired by inserting a new stretch of DNA, supplied by the researchers, essentially achieving “find and replace” functionality in the human genome.
In September 2015, Zhang and colleagues described a different system, Cpf1, which appears to have significant implications for research and therapeutics. The Cpf1 system is simpler in that it requires only a single RNA. The Cpf1 enzyme is also smaller than the standard SpCas9, making it easier to deliver into cells and tissues.
The CRISPR toolbox is continuing to expand rapidly, opening new avenues for biomedical research. Since the first publications in early 2013, the Zhang lab and other researchers have engineered a number of improvements to the system. For example, Cas9 has been modified so that instead of cutting the target DNA, it can turn gene expression on by recruiting transcriptional activators to its genomic location (Konermann, et al., Nature, 2014).
At the Broad Institute, the system has also been used for genome-wide screens to identify genes involved in resistance to cancer drugs and dissect immune regulatory networks. CRISPR has been used to rapidly create mouse models of cancer that arise from multiple gene alterations (Platt et al., Cell, 2014). In 2015, Zhang and his team reported success with Cas9 derived from a different bacterium, Staphylococcus aureus (SaCas9). SaCas9 is smaller than the original Cas9, which has advantages for gene therapy (Ran et al., Nature, 2015).
The Zhang lab has trained thousands of researchers in the use of CRISPR-Cas9 genome editing technology through direct education and by sharing more than 37,000 CRISPR-Cas9 components with academic laboratories around the world to help accelerate global research that will benefit human health. In September 2015, the Zhang lab also began to share Cpf1 components.
Users can obtain guide sequences for knock-outs and transcriptional activation as well as information about genome-wide libraries for CRISPR-based screening. To learn more, visit the Zhang Lab CRISPR Resources at http://www.genome-engineering.org/.
Q: What is “CRISPR”?
A: “CRISPR” (pronounced “crisper”) stands for Clustered Regularly Interspaced Short Palindromic Repeats, the hallmark of a bacterial defense system that forms the basis for the popular CRISPR-Cas9 genome editing technology. In the field of genome engineering, the term “CRISPR” is often used loosely to refer to the entire CRISPR-Cas9 system, which can be programmed to target specific stretches of genetic code and to edit DNA at precise locations. These tools allow researchers to permanently modify genes in living cells and organisms and, in the future, may make it possible to correct mutations at precise locations in the human genome to treat genetic causes of disease. In September 2015, the Zhang lab demonstrated successful harnessing of a different CRISPR system for genome editing, called CRISPR-Cpf1, which has the potential for even simpler and more precise genome engineering.
Q: Where do CRISPRs come from?
A: CRISPRs were first discovered in archaea (and later in bacteria) by Francisco Mojica, a scientist at the University of Alicante in Spain. He proposed that CRISPRs serve as part of the bacterial immune system, defending against invading viruses. They consist of repeating sequences of genetic code, interrupted by “spacer” sequences – remnants of genetic code from past invaders. The system serves as a genetic memory that helps the cell detect and destroy invaders (called “bacteriophage”) when they return. Mojica’s theory was experimentally demonstrated in 2007 by a team of scientists led by Philippe Horvath.
In January 2013, Feng Zhang at the Broad Institute and MIT published the first method to engineer CRISPR to edit the genome in mouse and human cells.
Q: How does the system work?
A: CRISPR “spacer” sequences are transcribed into short RNA sequences (“CRISPR RNAs” or “crRNA”) capable of guiding the system to matching sequences of DNA. When the target DNA is found, Cas9 – one of the enzymes produced by the CRISPR system – binds to the DNA and cuts it, shutting the targeted gene off. Using modified versions of Cas9, researchers can activate gene expression instead of cutting the DNA. These techniques allow researchers to study the gene’s function.
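The guiding step described above can be illustrated with a minimal Python sketch. It is not how the cell or any published tool implements matching: it simply writes a spacer as RNA and looks for exact matches on either DNA strand. The function names and the example sequences are hypothetical, chosen only for illustration.

```python
def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(dna.upper()))


def transcribe(dna: str) -> str:
    """Write a DNA sequence as RNA by replacing T with U."""
    return dna.upper().replace("T", "U")


def find_matches(spacer: str, genome: str) -> list[tuple[int, str]]:
    """Locate exact copies of a spacer on either strand of a genome.

    Returns (0-based start on the top strand, strand) for each match.
    """
    spacer = spacer.upper()
    genome = genome.upper()
    hits = []
    for strand, query in (("+", spacer), ("-", reverse_complement(spacer))):
        start = genome.find(query)
        while start != -1:
            hits.append((start, strand))
            start = genome.find(query, start + 1)
    return sorted(hits)


# Hypothetical spacer and "phage" sequence, for illustration only.
spacer = "GATTACAGATTACAGATTAC"
phage = "CCC" + spacer + "TTT" + reverse_complement(spacer) + "AA"
print(transcribe(spacer))
print(find_matches(spacer, phage))
```

Real target recognition tolerates some mismatches and depends on additional sequence context, so exact string matching is only a first approximation.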
Research also suggests that CRISPR-Cas9 can be used to target and modify “typos” in the three-billion-letter sequence of the human genome in an effort to treat genetic disease.
Q: How does CRISPR-Cas9 compare to other genome editing tools?
A: CRISPR-Cas9 is proving to be an efficient and customizable alternative to other existing genome editing tools. Since the CRISPR-Cas9 system itself is capable of cutting DNA strands, CRISPRs do not need to be paired with separate cleaving enzymes as other tools do. They can also easily be matched with tailor-made “guide” RNA (gRNA) sequences designed to lead them to their DNA targets. Tens of thousands of such gRNA sequences have already been created and are available to the research community. CRISPR-Cas9 can also be used to target multiple genes simultaneously, which is another advantage that sets it apart from other gene-editing tools.
Q: How does CRISPR-Cpf1 differ from CRISPR-Cas9?
A: CRISPR-Cpf1 differs in several important ways from the previously described Cas9, with significant implications for research and therapeutics.
First: In its natural form, the DNA-cutting enzyme Cas9 forms a complex with two small RNAs, both of which are required for the cutting activity. The Cpf1 system is simpler in that it requires only a single RNA. The Cpf1 enzyme is also smaller than the standard SpCas9, making it easier to deliver into cells and tissues.
Second, and perhaps most significantly: Cpf1 cuts DNA in a different manner than Cas9. When the Cas9 complex cuts DNA, it cuts both strands at the same place, leaving ‘blunt ends’ that often undergo mutations as they are rejoined. With the Cpf1 complex the cuts in the two strands are offset, leaving short overhangs on the exposed ends. This is expected to help with precise insertion, allowing researchers to integrate a piece of DNA more efficiently and accurately.
Third: Cpf1 cuts far away from the recognition site, meaning that even if the targeted gene becomes mutated at the cut site, it can likely still be re-cut, allowing multiple opportunities for correct editing to occur.
Fourth: the Cpf1 system provides new flexibility in choosing target sites. Like Cas9, the Cpf1 complex must first attach to a short sequence known as a PAM, and targets must be chosen that are adjacent to naturally occurring PAM sequences. The Cpf1 complex recognizes very different PAM sequences from those of Cas9. This could be an advantage in targeting some genomes, such as in the malaria parasite as well as in humans.
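The point about PAM flexibility can be made concrete with a small Python sketch that counts candidate PAM sites on both strands of a sequence. The PAM patterns are assumptions not given in the text above: NGG is the PAM usually cited for SpCas9, and a T-rich TTTN motif is the one commonly cited for Cpf1. The example sequence is hypothetical.

```python
import re

# Assumed PAM patterns (not stated in the text above).
CAS9_PAM = "[ACGT]GG"   # NGG, commonly cited for SpCas9
CPF1_PAM = "TTT[ACGT]"  # TTTN, commonly cited for Cpf1


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(dna.upper()))


def count_pam_sites(genome: str, pam_regex: str) -> int:
    """Count PAM occurrences on both strands, overlapping ones included."""
    pattern = re.compile(f"(?={pam_regex})")  # lookahead keeps overlaps
    top = genome.upper()
    bottom = reverse_complement(top)
    return sum(1 for _ in pattern.finditer(top)) + sum(
        1 for _ in pattern.finditer(bottom)
    )


# Hypothetical AT-rich stretch, loosely in the spirit of an AT-rich genome.
at_rich = "ATTTATATTTTAAATATTTAATTTTATAAATTTA"
print("NGG sites: ", count_pam_sites(at_rich, CAS9_PAM))
print("TTTN sites:", count_pam_sites(at_rich, CPF1_PAM))
```

A sequence with few G or C bases offers few NGG motifs, which is one way to see why a T-rich PAM could open up additional target sites in AT-rich genomes.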
CRISPR/Cas9 and Targeted Genome Editing: A New Era in Molecular Biology
The development of efficient and reliable ways to make precise, targeted changes to the genome of living cells is a long-standing goal for biomedical researchers. Recently, a new tool based on a bacterial CRISPR-associated protein-9 nuclease (Cas9) from Streptococcus pyogenes has generated considerable excitement (1). This follows several attempts over the years to manipulate gene function, including homologous recombination (2) and RNA interference (RNAi) (3). RNAi, in particular, became a laboratory staple enabling inexpensive and high-throughput interrogation of gene function (4, 5), but it is hampered by providing only temporary inhibition of gene function and by unpredictable off-target effects (6). Other recent approaches to targeted genome modification – zinc-finger nucleases [ZFNs (7)] and transcription activator-like effector nucleases [TALENs (8)] – enable researchers to generate permanent mutations by introducing double-stranded breaks to activate repair pathways. These approaches are costly and time-consuming to engineer, limiting their widespread use, particularly for large-scale, high-throughput studies.
The Biology of Cas9
The functions of CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and CRISPR-associated (Cas) genes are essential in adaptive immunity in select bacteria and archaea, enabling the organisms to respond to and eliminate invading genetic material. These repeats were initially discovered in the 1980s in E. coli (9), but their function wasn’t confirmed until 2007 by Barrangou and colleagues, who demonstrated that S. thermophilus can acquire resistance against a bacteriophage by integrating a genome fragment of an infectious virus into its CRISPR locus (10).
Three types of CRISPR mechanisms have been identified, of which type II is the most studied. In this case, invading DNA from viruses or plasmids is cut into small fragments and incorporated into a CRISPR locus amidst a series of short repeats (around 20 bp). The loci are transcribed, and transcripts are then processed to generate small RNAs (crRNA – CRISPR RNA), which are used to guide effector endonucleases that target invading DNA based on sequence complementarity (Figure 1) (11).
One Cas protein, Cas9 (also known as Csn1), has been shown, through knockdown and rescue experiments, to be a key player in certain CRISPR mechanisms (specifically type II CRISPR systems). The type II CRISPR mechanism is unique compared to other CRISPR systems, as only one Cas protein (Cas9) is required for gene silencing (12). In type II systems, Cas9 participates in the processing of crRNAs (12), and is responsible for the destruction of the target DNA (11). Cas9’s function in both of these steps relies on the presence of two nuclease domains, a RuvC-like nuclease domain located at the amino terminus and an HNH-like nuclease domain that resides in the mid-region of the protein (13).
To achieve site-specific DNA recognition and cleavage, Cas9 must be complexed with both a crRNA and a separate trans-activating crRNA (tracrRNA or trRNA), that is partially complementary to the crRNA (11). The tracrRNA is required for crRNA maturation from a primary transcript encoding multiple pre-crRNAs. This occurs in the presence of RNase III and Cas9 (12).
During the destruction of target DNA, the HNH and RuvC-like nuclease domains cut both DNA strands, generating double-stranded breaks (DSBs) at sites defined by a 20-nucleotide target sequence within an associated crRNA transcript (11, 14). The HNH domain cleaves the complementary strand, while the RuvC domain cleaves the noncomplementary strand.
The double-stranded endonuclease activity of Cas9 also requires that a short conserved sequence (2–5 nt), known as the protospacer adjacent motif (PAM), lies immediately 3´ of the crRNA-complementary sequence (15). In fact, even fully complementary sequences are ignored by Cas9-RNA in the absence of a PAM sequence (16).
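The two requirements described above, a 20-nucleotide target sequence and a PAM immediately 3´ of it, can be sketched as a simple site search. This is a minimal Python illustration, not a published design tool. The PAM pattern NGG is an assumption (it is the motif usually cited for S. pyogenes Cas9 and is not given in the text), only the top strand is scanned, and the names `TargetSite` and `find_target_sites` are hypothetical.

```python
import re
from typing import NamedTuple


class TargetSite(NamedTuple):
    start: int        # 0-based index of the first protospacer base
    protospacer: str  # sequence matched by the guide
    pam: str          # motif found immediately 3' of the protospacer


def find_target_sites(
    dna: str,
    protospacer_len: int = 20,
    pam_regex: str = "[ACGT]GG",  # assumed NGG PAM
) -> list[TargetSite]:
    """Return every top-strand window that is directly followed by a PAM."""
    dna = dna.upper()
    # A lookahead lets overlapping candidate sites all be reported.
    pattern = re.compile(
        rf"(?=([ACGT]{{{protospacer_len}}})({pam_regex}))"
    )
    return [
        TargetSite(m.start(), m.group(1), m.group(2))
        for m in pattern.finditer(dna)
    ]


# Hypothetical sequence: 20 bases followed by TGG, then filler.
example = "ACGTTGCATGCATGCAAGCT" + "TGG" + "ATATAT"
for site in find_target_sites(example):
    print(site)
```

A fuller search would also scan the reverse complement, since a target can lie on either strand.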
Cas9 and CRISPR as a New Tool in Molecular Biology
The simplicity of the type II CRISPR nuclease, with only three required components (Cas9 along with the crRNA and trRNA), makes this system amenable to adaptation for genome editing. This potential was realized in 2012 by the Doudna and Charpentier labs (11). Based on the type II CRISPR system described previously, the authors developed a simplified two-component system by combining trRNA and crRNA into a single synthetic single guide RNA (sgRNA). sgRNA-programmed Cas9 was shown to be as effective as Cas9 programmed with separate trRNA and crRNA in guiding targeted gene alterations (Figure 2A).
To date, three different variants of the Cas9 nuclease have been adopted in genome-editing protocols. The first is wild-type Cas9, which can site-specifically cleave double-stranded DNA, resulting in the activation of the double-strand break (DSB) repair machinery. DSBs can be repaired by the cellular Non-Homologous End Joining (NHEJ) pathway (17), resulting in insertions and/or deletions (indels) which disrupt the targeted locus. Alternatively, if a donor template with homology to the targeted locus is supplied, the DSB may be repaired by the homology-directed repair (HDR) pathway, allowing for precise replacement mutations to be made (Figure 2A) (17, 18).
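The difference between the two repair outcomes can be shown with a toy string model in Python. It is a deliberately simplified sketch: NHEJ is modeled as a small deletion at the cut, and HDR as replacement of the sequence between two homology arms with a donor insert. All function names and sequences are hypothetical, and real repair outcomes are far more varied.

```python
def nhej_deletion(seq: str, cut: int, deleted: int) -> str:
    """Model one NHEJ outcome: lose `deleted` bases starting at the cut."""
    return seq[:cut] + seq[cut + deleted:]


def hdr_replace(seq: str, left_arm: str, right_arm: str, insert: str) -> str:
    """Model HDR: replace whatever lies between the homology arms."""
    left = seq.find(left_arm)
    if left == -1:
        raise ValueError("left homology arm not found")
    left_end = left + len(left_arm)
    right = seq.find(right_arm, left_end)
    if right == -1:
        raise ValueError("right homology arm not found")
    return seq[:left_end] + insert + seq[right:]


def is_frameshift(original: str, edited: str) -> bool:
    """An indel shifts the reading frame unless its length is a multiple of 3."""
    return (len(edited) - len(original)) % 3 != 0


locus = "AAACCCGGGTTT"  # hypothetical 12-base locus
after_nhej = nhej_deletion(locus, cut=6, deleted=2)
after_hdr = hdr_replace(locus, left_arm="AAACCC", right_arm="TTT", insert="TAG")
print(after_nhej, is_frameshift(locus, after_nhej))
print(after_hdr, is_frameshift(locus, after_hdr))
```

The sketch shows why an indel from NHEJ tends to disrupt a gene, while a templated HDR edit can change specific bases and leave the reading frame intact.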
Cong and colleagues (1) took the Cas9 system a step further towards increased precision by developing a mutant form, known as Cas9D10A, with only nickase activity. This means it cleaves only one DNA strand, and does not activate NHEJ. Instead, when provided with a homologous repair template, DNA repairs are conducted via the high-fidelity HDR pathway only, resulting in reduced indel mutations (1, 11, 19). Cas9D10A is even more appealing in terms of target specificity when loci are targeted by paired Cas9 complexes designed to generate adjacent DNA nicks (20) (see further details about “paired nickases” in Figure 2B).
The third variant is a nuclease-deficient Cas9 (dCas9, Figure 2C) (21). Mutations H840A in the HNH domain and D10A in the RuvC domain inactivate cleavage activity, but do not prevent DNA binding (11, 22). Therefore, this variant can be used to sequence-specifically target any region of the genome without cleavage. Instead, by fusing with various effector domains, dCas9 can be used either as a gene silencing or activation tool (21, 23–26). Furthermore, it can be used as a visualization tool. For instance, Chen and colleagues used dCas9 fused to Enhanced Green Fluorescent Protein (EGFP) to visualize repetitive DNA sequences with a single sgRNA or nonrepetitive loci using multiple sgRNAs (27).
Targeting Efficiency and Off-target Mutations
Targeting efficiency, or the percentage of desired mutation achieved, is one of the most important parameters by which to assess a genome-editing tool. The targeting efficiency of Cas9 compares favorably with more established methods, such as TALENs or ZFNs (8). For example, in human cells, custom-designed ZFNs and TALENs could only achieve efficiencies ranging from 1% to 50% (29–31). In contrast, the Cas9 system has been reported to have efficiencies of more than 70% in zebrafish (32) and plants (33), and of 2–5% in induced pluripotent stem cells (34). In addition, Zhou and colleagues were able to improve genome targeting up to 78% in one-cell mouse embryos, and achieved effective germline transmission through the use of dual sgRNAs to simultaneously target an individual gene (35).
A widely used method to identify mutations is the T7 Endonuclease I mutation detection assay (36, 37) (Figure 3). This assay detects heteroduplex DNA that results from the annealing of a DNA strand, including desired mutations, with a wildtype DNA strand (37).
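Band intensities from such an assay are often converted into an approximate indel frequency. The Python sketch below uses an estimate that is common in CRISPR protocols but is not given in this article: if f is the fraction of PCR product that was cleaved, the indel frequency is taken as 1 − √(1 − f), on the assumption that strands reanneal at random and that every heteroduplex is cut. The function and parameter names are hypothetical.

```python
import math


def estimate_indel_percent(
    uncut: float, cleaved_1: float, cleaved_2: float
) -> float:
    """Estimate indel frequency (%) from gel band intensities.

    uncut      intensity of the undigested PCR product band
    cleaved_1  intensity of the first cleavage product band
    cleaved_2  intensity of the second cleavage product band
    """
    total = uncut + cleaved_1 + cleaved_2
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    fraction_cleaved = (cleaved_1 + cleaved_2) / total
    # Random reannealing leaves a (1 - p)^2 share of uncut homoduplex,
    # so p = 1 - sqrt(1 - fraction_cleaved).
    return 100.0 * (1.0 - math.sqrt(1.0 - fraction_cleaved))


# Hypothetical densitometry readings in arbitrary units.
print(round(estimate_indel_percent(uncut=64, cleaved_1=18, cleaved_2=18), 1))
```

Because the estimate rests on idealized reannealing and complete digestion, it is best read as a rough comparison between samples rather than an exact mutation rate.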
Another important parameter is the incidence of off-target mutations. Such mutations are likely to appear in sites that have differences of only a few nucleotides compared to the original sequence, as long as they are adjacent to a PAM sequence. This occurs as Cas9 can tolerate up to 5 base mismatches within the protospacer region (36) or a single base difference in the PAM sequence (38). Off-target mutations are generally more difficult to detect, requiring whole-genome sequencing to rule them out completely.
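A first-pass screen for such sites can be sketched in Python by sliding the guide along a sequence and keeping windows that sit next to a PAM and differ from the guide at only a few positions. This is an illustrative sketch only: it assumes an NGG PAM (not stated in the text), scans the top strand alone, and ignores the tolerated PAM variation mentioned above. The function names and sequences are hypothetical.

```python
def count_mismatches(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a.upper(), b.upper()))


def find_candidate_sites(
    guide: str, genome: str, max_mismatches: int = 5
) -> list[tuple[int, str, int]]:
    """Return (start, site, mismatches) for windows followed by an NGG PAM.

    The on-target site appears with zero mismatches; any other entries
    are potential off-target sites under this simplified model.
    """
    guide = guide.upper()
    genome = genome.upper()
    n = len(guide)
    hits = []
    for i in range(len(genome) - n - 3 + 1):
        pam = genome[i + n:i + n + 3]
        if pam[1:] != "GG":  # assumed NGG PAM; first base is unconstrained
            continue
        site = genome[i:i + n]
        mismatches = count_mismatches(guide, site)
        if mismatches <= max_mismatches:
            hits.append((i, site, mismatches))
    return hits


# Hypothetical guide, plus a near-identical site with two changed bases.
guide = "ACGT" * 5
near_copy = "ACGTACGTTCGTACGTACGA"
genome = guide + "AGG" + "TT" + near_copy + "TGG"
for hit in find_candidate_sites(guide, genome):
    print(hit)
```

Dedicated design tools weigh mismatch position and other factors as well, so a plain mismatch count should be treated as a coarse filter.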
Recent improvements to the CRISPR system for reducing off-target mutations have been made through the use of truncated gRNA (truncated within the crRNA-derived sequence) or by adding two extra guanine (G) nucleotides to the 5´ end (28, 37). Another way researchers have attempted to minimize off-target effects is with the use of “paired nickases” (20). This strategy uses D10A Cas9 and two sgRNAs complementary to the adjacent area on opposite strands of the target site (Figure 2B). While this induces DSBs in the target DNA, it is expected to create only single nicks in off-target locations and, therefore, result in minimal off-target mutations.
By leveraging computation to reduce off-target mutations, several groups have developed web-based tools to facilitate the identification of potential CRISPR target sites and assess their potential for off-target cleavage. Examples include the CRISPR Design Tool (38) and the ZiFiT Targeter, Version 4.2 (39, 40).
Applications as a Genome-editing and Genome Targeting Tool
Following its initial demonstration in 2012 (11), the CRISPR/Cas9 system has been widely adopted. It has already been used successfully to target important genes in many cell lines and organisms, including human (34), bacteria (41), zebrafish (32), C. elegans (42), plants (34), Xenopus tropicalis (43), yeast (44), Drosophila (45), monkeys (46), rabbits (47), pigs (42), rats (48) and mice (49). Several groups have now taken advantage of this method to introduce single point mutations (deletions or insertions) in a particular target gene, via a single gRNA (14, 21, 29). Using a pair of gRNA-directed Cas9 nucleases instead, it is also possible to induce large deletions or genomic rearrangements, such as inversions or translocations (50). A recent exciting development is the use of the dCas9 version of the CRISPR/Cas9 system to target protein domains for transcriptional regulation (26, 51, 52), epigenetic modification (25), and microscopic visualization of specific genome loci (27).
The CRISPR/Cas9 system requires only the redesign of the crRNA to change target specificity. This contrasts with other genome editing tools, including zinc-finger nucleases and TALENs, where redesign of the protein-DNA interface is required. Furthermore, CRISPR/Cas9 enables rapid genome-wide interrogation of gene function by generating large gRNA libraries (51, 53) for genomic screening.
The future of CRISPR/Cas9
The rapid progress in developing Cas9 into a set of tools for cell and molecular biology research has been remarkable, likely due to the simplicity, high efficiency and versatility of the system. Of the designer nuclease systems currently available for precision genome engineering, the CRISPR/Cas system is by far the most user friendly. It is now also clear that Cas9’s potential reaches beyond DNA cleavage, and its usefulness for genome locus-specific recruitment of proteins will likely only be limited by our imagination.
1. Cong, L., et al. (2013) Science, 339, 819–823.
2. Capecchi, M.R. (2005) Nat. Rev. Genet. 6, 507–512.
3. Fire, A., et al. (1998) Nature, 391, 806–811.
4. Elbashir, S.M., et al. (2002) Methods, 26, 199–213.
5. Martinez, J., et al. (2003) Nucleic Acids Res. Suppl. 333.
6. Alic, N., et al. (2012) PLoS One, 7, e45367.
7. Miller, J., et al. (2005) Mol. Ther. 11, S35–S35.
8. Mussolino, C., et al. (2011) Nucleic Acids Res. 39, 9283–9293.
9. Ishino, Y., et al. (1987) J. Bacteriol. 169, 5429–5433.
10. Barrangou, R., et al. (2007) Science, 315, 1709–1712.
11. Jinek, M., et al. (2012) Science, 337, 816–821.
12. Deltcheva, E., et al. (2011) Nature, 471, 602–607.
13. Sapranauskas, R., et al. (2011) Nucleic Acids Res. 39, 9275–9282.
14. Nishimasu, H., et al. (2014) Cell, doi:10.1016/j.cell.2014.02.001.
15. Swarts, D.C., et al. (2012) PLoS One, 7, e35888.
16. Sternberg, S.H., et al. (2014) Nature, doi:10.1038/nature13011.
17. Overballe-Petersen, S., et al. (2013) Proc. Natl. Acad. Sci. U.S.A. 110, 19860–19865.
18. Gong, C., et al. (2005) Nat. Struct. Mol. Biol. 12, 304–312.
19. Davis, L., Maizels, N. (2014) Proc. Natl. Acad. Sci. U.S.A. 111, E924–932.
20. Ran, F.A., et al. (2013) Cell, 154, 1380–1389.
21. Qi, L.S., et al. (2013) Cell, 152, 1173–1183.
22. Gasiunas, G., et al. (2012) Proc. Natl. Acad. Sci. U.S.A. 109, E2579–2586.
23. Maeder, M.L., et al. (2013) Nat. Methods, 10, 977–979.
24. Gilbert, L.A., et al. (2013) Cell, 154, 442–451.
25. Hu, J., et al. (2014) Nucleic Acids Res. doi:10.1093/nar/gku109.
26. Perez-Pinera, P., et al. (2013) Nat. Methods, 10, 239–242.
27. Chen, B., et al. (2013) Cell, 155, 1479–1491.
28. Seung, W., et al. (2014) Genome Res. 24, 132–141.
29. Miller, J.C., et al. (2011) Nat. Biotechnol. 29, 143–148.
30. Mussolino, C., et al. (2011) Nucleic Acids Res. 39, 9283–9293.
31. Maeder, M.L., et al. (2008) Mol. Cell, 31, 294–301.
32. Hwang, W.Y., et al. (2013) PLoS One, 8, e68708.
33. Feng, Z., et al. (2013) Cell Res. 23, 1229–1232.
34. Mali, P., et al. (2013) Science, 339, 823–826.
35. Zhou, J., et al. (2014) FEBS J. doi:10.1111/febs.12735.
36. Fu, Y., et al. (2013) Nat. Biotechnol. 31, 822–826.
37. Fu, Y., et al. (2014) Nat. Biotechnol. doi:10.1038/nbt.2808.
38. Hsu, P.D., et al. (2013) Nat. Biotechnol. 31, 827–832.
39. Sander, J.D., et al. (2007) Nucleic Acids Res. 35, W599–605.
40. Sander, J.D., et al. (2010) Nucleic Acids Res. 38, W462–468.
41. Fabre, L., et al. (2014) PLoS Negl. Trop. Dis. 8, e2671.
42. Hai, T., et al. (2014) Cell Res. doi:10.1038/cr.2014.11.
43. Guo, X., et al. (2014) Development, 141, 707–714.
44. DiCarlo, J.E., et al. (2013) Nucleic Acids Res. 41, 4336–4343.
45. Gratz, S.J., et al. (2014) Genetics, doi:10.1534/genetics.113.160713.
46. Niu, Y., et al. (2014) Cell, 156, 836–843.
47. Yang, D., et al. (2014) J. Mol. Cell Biol. 6, 97–99.
48. Ma, Y., et al. (2014) Cell Res. 24, 122–125.
49. Mashiko, D., et al. (2014) Dev. Growth Differ. 56, 122–129.
50. Gratz, S.J., et al. (2013) Fly, 249.
51. Mali, P., et al. (2013) Nat. Biotechnol. 31, 833–838.
52. Cheng, A.W., et al. (2013) Cell Res. 23, 1163–1171.
53. Koike-Yusa, H., et al. (2013) Nat. Biotechnol. doi:10.1038/nbt.2800.
54. Sander, J.D., and Joung, J.K. (2014) Nat. Biotechnol. doi:10.1038/nbt.2842.
From NEB expressions Issue I, 2014
Article by Alex Reis, Ph.D., Bitesize Bio
Breton Hornblower, Ph.D., Brett Robb, Ph.D. and George Tzertzinis, Ph.D., New England