History of DNA Sequencing Methods and Current Status
What is sequencing?
The first method for determining DNA sequences involved a location-specific primer extension strategy established by Ray Wu at Cornell University in 1970. DNA polymerase catalysis and specific nucleotide labeling, both of which figure prominently in current sequencing schemes, were used to sequence the cohesive ends of lambda phage DNA.
Between 1970 and 1973, Wu, R. Padmanabhan and colleagues demonstrated that this method can be employed to determine any DNA sequence using synthetic location-specific primers. Frederick Sanger then adopted this primer-extension strategy to develop more rapid DNA sequencing methods at the MRC Centre, Cambridge, UK, and published a method for “DNA sequencing with chain-terminating inhibitors” in 1977. Walter Gilbert and Allan Maxam at Harvard also developed sequencing methods, including one for “DNA sequencing by chemical degradation”. In 1973, Gilbert and Maxam reported the sequence of 24 base pairs using a method known as wandering-spot analysis. Advancements in sequencing were aided by the concurrent development of recombinant DNA technology, which allowed DNA samples to be isolated from sources other than viruses.
Early DNA Sequencing Methods:-
1. Whole Transcriptome Shotgun Sequencing:
RNA sequencing, which is also known as whole transcriptome shotgun sequencing, is one of the earliest forms of nucleotide sequencing. As technology became more advanced, biologists were able to study larger and more complex genomes. Some scientists concluded that more data could be obtained by sequencing both ends of a fragment of DNA. This process was longer and more complicated, but the information gathered was invaluable.
A milestone in the sequencing of full genomes came in 1984 at the Medical Research Council, where a team of scientists deciphered the complete sequence of the Epstein-Barr virus. The researchers found that the viral genome contained 172,282 nucleotides. The result marked the start of a new age in DNA sequencing, because it was the first of its kind: the sequence was obtained with no prior genetic profile of the virus.
In the early 1980s, a non-radioactive method was developed for transferring the DNA molecules of sequencing reaction mixtures onto an immobilizing matrix during electrophoresis. This led to the commercialization of DNA sequencing technology by various biotech companies. With sequencing technology on the market, many countries started their own genome sequencing programs. As the years went by, more and more genomes were completely sequenced, and before long the entire process of DNA sequencing became automated.
2. High-Throughput (HTP) Methods:
In the mid-1990s, new methods of genome sequencing were developed and built into commercial DNA sequencing machines. These new approaches are known as next-generation sequencing methods. One common use is re-sequencing, in which the genome of an individual is compared with another genome of the same species to find the variations between them. As with anything that is commercialized, high demand for cost-effective sequencing technology has driven the development of more than eight high-throughput sequencing methods.
HTP sequencing is extremely cost-effective because it can run as many as 500,000 sequencing reactions at once. By breaking each genome down, fragments of DNA can be read, interpreted, and compared with hundreds of thousands of genomes from members of the same species. Sequencing technology, combined with the brightest minds in the fields of biology and chemistry, also made the Human Genome Project possible. This was an international research project with the objective of identifying, mapping, and sequencing all of the genes in the human genome. The project was completed in 2003 after 13 years of continuous research.
If you take a step back and appreciate all of the brainpower, passion, and determination that went into these projects, you can begin to see how incredible nature is. Each of us has our own unique DNA, inherited through the generations since the dawn of humanity. DNA sequencing has allowed us to understand who we are down to the most basic level, and to truly appreciate how precious life is.
3. Wandering Spot Analysis:
Sanger et al. (1973) introduced this method. It comprises four steps. First, the DNA is labelled with 32P at the 5′ end of the molecule. The fragment is then partially digested with snake venom phosphodiesterase, which sequentially removes nucleotides from the 3′ end. For example, the DNA piece 5′-*ACTTAG-3′ would give five fragments upon digestion: 5′-*ACTTA-3′, 5′-*ACTT-3′, 5′-*ACT-3′, 5′-*AC-3′, and 5′-*A-3′. All of these fragments carry the radioactive label. Next, the fragments are subjected to electrophoresis, and the strip that results from the electrophoresis is applied to a sheet of chromatography medium for homochromatography. Homochromatography is a process in which a solvent moves up the chromatography medium, separating the labelled oligonucleotide fragments according to length: the shorter the fragment, the faster it migrates. The resulting two-dimensionally separated fragments are then autoradiographed. When the X-ray film is developed, dark spots indicate the locations of the radioactive fragments. Finally, the sequence is read off as shown in the figure.
A partial digest of 5′ 32P-ACTTAG is applied to a cellulose acetate strip in pH 3.5 buffer, and an electric field is applied across the strip. Nucleotides G and T tend to carry negative charges at this pH, with T>G; A and C tend to carry positive charges, with C>A. Fragments that are shorter by loss of a T move more slowly towards the anode, since they have lost some of the negative charge that attracts them to it, and so on. The result is that loss of T or G causes a mobility shift to the left, while loss of A or C causes a mobility shift to the right.
The strip from the steps above is then attached to a chromatogram. In homochromatography with added unlabelled nucleotide fragments of various lengths, all four bases migrate at equal speed: the longer the fragment, the slower the migration. Fragments are located by autoradiographic detection of radioactivity, which appears as dark spots. A line is drawn connecting the longest fragment (the spot closest to the bottom of the sheet) to the next longest, and so on. Numbers 1-5 on the line indicate the increasing number of nucleotides removed from the original fragment. Line 1, which indicates loss of G from the 3′ end, is quite different in length from lines 3 and 4, which indicate loss of T from the 3′ end. Line 2, which indicates loss of an A from the 3′ end, shows a small shift in the opposite direction, and line 5, which indicates loss of a C from the 3′ end, shows a larger shift in the same direction as loss of A. Wandering spot analysis was applied by Khorana and his co-workers to obtain the sequences of the borders of an E. coli tRNATyr suppressor mutant gene.
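The digestion and the direction of each mobility shift can be sketched in a few lines of Python. This is only an illustration of the description above: the function names are hypothetical, and the shift rule encodes direction only (left for loss of T or G, right for loss of A or C), not the size of the shift that distinguishes T from G or C from A on a real autoradiograph.

```python
def exonuclease_ladder(seq):
    """Labelled fragments left after stepwise removal from the 3' end.

    seq is written 5'->3'; the 5' label stays on every fragment.
    """
    return [seq[:n] for n in range(len(seq) - 1, 0, -1)]


def removed_bases(seq):
    """Bases lost at steps 1, 2, 3, ... counted from the 3' end."""
    return list(reversed(seq[1:]))


def shift_direction(base):
    """Direction of the mobility shift caused by losing this base."""
    return "left" if base in "TG" else "right"


fragment = "ACTTAG"
for step, (frag, base) in enumerate(
        zip(exonuclease_ladder(fragment), removed_bases(fragment)), start=1):
    print(step, frag, "lost", base, "-> shift", shift_direction(base))
```

Reading the steps in reverse order of removal, together with the base implied by each shift, gives back the sequence from the second nucleotide onward.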
4. Chemical Degradation Method (Maxam and Gilbert):
DNA can be sequenced by a chemical procedure that partially breaks a terminally labelled DNA molecule at each occurrence of a given base. The lengths of the labelled fragments then identify the positions of that base. Maxam and Gilbert (1977) described reactions that cleave DNA preferentially at guanine, at adenine, at cytosine and thymine equally, and at cytosine alone. When the products of these reactions are resolved by size by electrophoresis on a polyacrylamide gel, the DNA sequence can be read from the pattern of radioactive bands. The technique permits sequencing of at least 100 bases from the point of labelling.
In this method, many copies of a DNA fragment of uniform length are isolated by restriction endonuclease digestion. The copies are labelled at their 3′ or 5′ end with radioactive 32P. These fragments are then broken at various points by four separate chemical treatments, each treatment removing on average one purine or one pyrimidine from any particular chain. Each such break generates a 32P-labelled fragment of a specific length, which bands at a specific position on a gel subjected to an electric gradient. Gel electrophoresis can separate molecules that differ in length by a single nucleotide, and the 32P-labelled DNA fragments on the gel can then be detected by autoradiography. For the chemical treatments, the radioactive DNA molecules are divided into four groups, which are treated as shown in Table 3.1. Fig. 3.2 illustrates the sequencing of an 11-nucleotide-long DNA fragment that has been labelled at its 5′ end with 32P.
Table 3.1: Breakage of DNA at different nucleotides by chemical treatments.
|Treatment|Effect|
|Methylation|breaks DNA at G|
|Acid (pH 2.0)|breaks DNA at A and G|
|Hydrazine|breaks DNA at T and C|
|Hydrazine in high salt|breaks DNA at C|
These treatments are adjusted so that, on average, only one nucleotide is removed from each DNA strand. A particular treatment generates fragments of varying lengths, depending on which nucleotide was removed. The products of the four different treatments are placed in parallel lanes on a polyacrylamide gel at the negative pole of an electrophoretic apparatus. The pore size of the gel governs the mobility of the DNA molecules: the smaller the fragment, the faster it migrates towards the positive pole. After electrophoresis, the gel is covered with a sensitized film and an autoradiograph is taken. From the banding pattern, the nucleotide sequence of the DNA is then read in ascending order from the 5′ terminal end at the positive pole.
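How the four lanes combine into a sequence can be sketched as follows. This is a simplified model, not a description of the laboratory protocol: it assumes a 5′-labelled strand, treats every possible single cleavage as producing a visible band, and uses a made-up 11-nucleotide sequence (the one in Fig. 3.2 is not reproduced here). The lane names and function names are hypothetical.

```python
# Bases cleaved by each treatment, following the table above
REACTIONS = {"G": "G", "A+G": "AG", "C+T": "CT", "C": "C"}


def cleavage_lanes(seq):
    """Map each lane to the set of labelled fragment lengths it contains.

    Cleavage at 0-based position i destroys that base and leaves a
    5'-labelled fragment of length i. Position 0 is skipped because it
    would leave no labelled DNA fragment to detect.
    """
    return {
        lane: {i for i, base in enumerate(seq) if i > 0 and base in bases}
        for lane, bases in REACTIONS.items()
    }


def read_gel(lanes, length):
    """Read bases 2..length from the shortest band to the longest."""
    calls = []
    for i in range(1, length):
        if i in lanes["G"] and i in lanes["A+G"]:
            calls.append("G")   # band in both purine lanes
        elif i in lanes["A+G"]:
            calls.append("A")   # band in A+G only
        elif i in lanes["C"] and i in lanes["C+T"]:
            calls.append("C")   # band in both pyrimidine lanes
        elif i in lanes["C+T"]:
            calls.append("T")   # band in C+T only
        else:
            calls.append("N")   # no band: base cannot be called
    return "".join(calls)


example = "GATCCGTAGCA"  # hypothetical 11-nucleotide fragment
print(read_gel(cleavage_lanes(example), len(example)))
```

The key step is the comparison between lanes: a band in the A+G lane without a matching band in the G lane is an A, and a band in the C+T lane without one in the C lane is a T.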
5. Chain-Termination Method (Sanger):
Sanger et al. (1977) developed a different sequencing method. It is similar to the “plus and minus” method of Sanger and Coulson (1975), but makes use of 2′,3′-dideoxy and arabinonucleoside analogues of the normal deoxynucleoside triphosphates, which act as specific inhibitors of DNA polymerase. Sanger and colleagues also determined the complete sequence of φX174 DNA, approximately 5,375 nucleotides that code for 9 proteins in this virus, using the rapid and simple “plus and minus” method.
The scheme comprises the following steps. DNA polymerase-generated single-stranded oligonucleotides are electrophoresed on polyacrylamide gels. Conditions are arranged so that the nucleotide sequence can be read directly from an autoradiograph of the gel. The primer is elongated at its 3′ end, towards the region of bases ATGCTG, using one nucleotide radioactively labelled with 32P. This comprises the plus series of four experiments. In the chain-termination method, synthesis of the new strand is halted at various points by insertion of dideoxy analogues, which lack the 3′-OH group of the deoxyribose sugar (Fig. 3.3).
In the minus series of experiments, the oligonucleotide primer is elongated using only three of the deoxynucleoside triphosphates (e.g., dGTP, dCTP, and dTTP are added, but dATP is missing). Each primer is elongated until the missing nucleotide is specified by the template, so in the experiment shown the termination point is just before dA. A minus experiment is performed for each of the four nucleotides in turn. The DNA products of the four minus experiments are denatured and electrophoresed on a polyacrylamide gel, and autoradiography is used to locate the radioactive DNA bands. The shortest fragments move the fastest in this analysis. The sequence 3′-TACGAC-5′ can be read directly from the left group of four minus lanes in the autoradiograph.
The sequence is simultaneously confirmed using the plus series of experiments. T4 DNA polymerase degrades double-stranded DNA from its 3′ end. If one type of deoxynucleoside triphosphate (for example, dATP) is added during this process, the exonuclease action of the T4 DNA polymerase stops at nucleotides containing the same base as the free nucleotide (here, at nucleotides with the base A). The minus series stops just before an A, whereas the plus series degrades until an A is reached.
Enzymatic method of DNA sequencing.
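The relationship between the minus and plus lanes can be illustrated with a small idealized model. It assumes that every possible stopping point yields a visible band and measures fragment length as the number of nucleotides added to the primer; the function names are hypothetical. The newly synthesized strand 5′-ATGCTG-3′ is taken from the example in the text.

```python
BASES = "ACGT"
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def minus_lanes(new_strand):
    """Minus-X lane: extension stops just before each X is required."""
    return {x: {i for i, b in enumerate(new_strand) if b == x} for x in BASES}


def plus_lanes(new_strand):
    """Plus-X lane: degradation stops once a 3'-terminal X is reached."""
    return {x: {i + 1 for i, b in enumerate(new_strand) if b == x} for x in BASES}


def read_minus(lanes, n):
    """A band of length L in the minus-X lane means base L+1 is X."""
    calls = []
    for length in range(n):
        hits = [x for x in BASES if length in lanes[x]]
        calls.append(hits[0] if len(hits) == 1 else "N")
    return "".join(calls)


def template_3_to_5(new_strand):
    """Template strand, read 3'->5', that pairs with the new strand."""
    return "".join(COMPLEMENT[b] for b in new_strand)


new_strand = "ATGCTG"
print(read_minus(minus_lanes(new_strand), len(new_strand)))
print(template_3_to_5(new_strand))
```

In this model each plus-lane band is exactly one nucleotide longer than the corresponding minus-lane band, which is why the two series confirm each other.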
Current Status: Modern DNA Sequencing Methods:
New and important issues emerge as the technology develops, such as questions related to public safety and health, as well as biotechnical questions. The best-known DNA sequencing techniques are:
1.The Sanger Method:-
Fred Sanger developed the DNA sequencing method that forms the basis of automated cycle-sequencing reactions. A major drawback of early cycle sequencing was the polymerase's preference for incorporating normal deoxynucleotides over dideoxynucleotide terminators. Substituting a single amino acid in the enzyme's primary sequence removed this bias and equalized the rates of incorporation. Some enzymes used in the polymerase chain reaction (PCR) also have a feature called proofreading, which enables the enzyme to correct mistakes during incorporation.
In sequencing reactions, suppression of this proofreading activity is necessary to avoid uninterpretable data. Even with these improvements, peak intensities of fluorescently labelled dye-terminators still varied significantly. The termination pattern was predictable and reproducible, but the variation made automated base calling difficult. A modified set of fluorescent labels made the signal more even, and automated base calling improved significantly.
A strategy for large-scale sequencing projects involved amplification, purification, and selection of the template. The main innovation in amplification was the use of PCR. Innovations in purification included the use of commercially available agarose. The demand for a sequencing method capable of providing long read lengths, short analysis times, high accuracy, and low cost led to modifications of the Sanger method.
2.Maxam & Gilbert Method:-
The method breaks an end-labelled DNA strand at specific bases through the use of specific conditions and reagents. The methodology of PCR product sequencing involves the use of a template generated by PCR with non-biotinylated forward and reverse primers. Dye-terminator cycle sequencing of the non-purified product uses the same primers as those used for the PCR.
Chain-delimiters offered a convenient route to direct PCR sequencing. Substituting chain-delimiters for chain-terminators gave a heat-stable chemistry that allowed the delimiters to be incorporated into DNA during PCR. Once incorporated, they block exonuclease action, so exonuclease digestion reveals their positions and generates a series of fragments with borane at the end. The fragments are then separated by gel electrophoresis, as in a standard sequencing reaction.
The enzymic strategy resulted in increased band separation resolution and band sharpness on autoradiography. As an alternative to radioisotopes, lab technicians use the biotin-streptavidin system, which is based on chemiluminescence detection. In this system, an oligonucleotide linked to biotin is the primer for the sequencing reaction.
Two approaches to enzymic sequencing are used as tools for genomic research: shotgun sequencing and primer walking. Shotgun sequencing is a random approach, with no control over which region is sequenced. Genomic DNA is randomly fragmented into smaller pieces by scission methods such as nebulization and sonication.
The process is highly automated. The automation covers cloning into vectors, colony selection, and base calling. Fragments inserted into a vector replicate in a bacterial culture. Several positive clones are then selected and the DNA is extensively sequenced.
Due to the randomness of the process, generated sequences overlap. Sequence assembly is the result of sequence alignment or overlapping. Shotgun sequencing typically produces a high redundancy level which affects the cost of the process. J. Craig Venter introduced a method of shotgunning that involved the whole genome at one time.
The strategy is enormously dependent on computational resources to align all the generated sequences. Haemophilus influenzae sequencing in 1995 and more recent human genome sequencing are rewards of Venter’s efforts. Shotgun is a well-established method of sequencing.
It is readily available with optimized cloning, universal primers labelled fluorescently, and software used for base calling and sequence assembly. The random approach produces gaps in the sequence that only direct sequencing is capable of completing.
The other approach to genomic sequencing, known as primer walking, directly sequences unknown DNA starting from sites where the sequence is known. The first sequencing reaction uses a primer that hybridizes to the vector sequence and polymerizes the strand complementary to the template.
A second priming site is then chosen inside the newly generated sequence, and sequencing continues in the same direction as before. The major advantage is reduced redundancy. To avoid mispriming, lab technicians use a single-stranded DNA-binding protein or the stacking effect of selected modular primers.
The cost and time of synthesizing a new primer for each step once made the method less appealing; improvements in primer synthesis technology have reduced that drawback.
3.Shotgun sequencing:-
Shotgun sequencing is a sequencing method designed for analysis of DNA sequences longer than 1000 base pairs, up to and including entire chromosomes. This method requires the target DNA to be broken into random fragments. After sequencing individual fragments, the sequences can be reassembled on the basis of their overlapping regions.
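Reassembly from overlapping regions can be illustrated with a small greedy merger. This is a toy sketch under strong assumptions (error-free reads, all from the same strand, no repeats longer than the minimum overlap); production assemblers use far more elaborate methods. The function names, the `min_len` parameter, and the example reads are all hypothetical.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the longest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    # Drop reads that are wholly contained in another read
    reads = [r for r in reads if not any(r != o and r in o for o in reads)]
    while len(reads) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # no overlaps left: the remaining contigs are separated by gaps
        merged = reads[best_i] + reads[best_j][best_n:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads


fragments = ["TAGCCGATAG", "ATGCGTACG", "TACGTTAGC"]
print(greedy_assemble(fragments))
```

When no overlap of at least `min_len` remains, the function returns several contigs rather than one sequence, which mirrors the gaps that random fragmentation leaves behind.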
4.Bridge PCR:-
One method for in vitro clonal amplification is bridge PCR, in which fragments are amplified on primers attached to a solid surface and form “DNA colonies” or “DNA clusters”. This method is used in the Illumina Genome Analyzer sequencers. Single-molecule methods, such as that developed by Stephen Quake’s laboratory (later commercialized by Helicos), are an exception: they use bright fluorophores and laser excitation to detect base addition events from individual DNA molecules fixed to a surface, eliminating the need for molecular amplification.
5.Massively parallel signature sequencing (MPSS):-
The first of the high-throughput sequencing technologies, massively parallel signature sequencing (or MPSS), was developed in the 1990s at Lynx Therapeutics, a company founded in 1992 by Sydney Brenner and Sam Eletr. MPSS was a bead-based method that used a complex approach of adapter ligation followed by adapter decoding, reading the sequence in increments of four nucleotides. This method made it susceptible to sequence-specific bias or loss of specific sequences. Because the technology was so complex, MPSS was only performed ‘in-house’ by Lynx Therapeutics and no DNA sequencing machines were sold to independent laboratories. Lynx Therapeutics merged with Solexa (later acquired by Illumina) in 2004, leading to the development of sequencing-by-synthesis, a simpler approach acquired from Manteia Predictive Medicine, which rendered MPSS obsolete. However, the essential properties of the MPSS output were typical of later high-throughput data types, including hundreds of thousands of short DNA sequences. In the case of MPSS, these were typically used for sequencing cDNA for measurements of gene expression levels.
6.454 pyrosequencing:-
A parallelized version of pyrosequencing was developed by 454 Life Sciences, which has since been acquired by Roche Diagnostics. The method amplifies DNA inside water droplets in an oil solution (emulsion PCR), with each droplet containing a single DNA template attached to a single primer-coated bead that then forms a clonal colony. The sequencing machine contains many picoliter-volume wells each containing a single bead and sequencing enzymes. Pyrosequencing uses luciferase to generate light for detection of the individual nucleotides added to the nascent DNA, and the combined data are used to generate sequence reads. This technology provides intermediate read length and price per base compared to Sanger sequencing on one end and Solexa and SOLiD on the other.
7.Illumina (Solexa) sequencing:-
Solexa, now part of Illumina, was founded by Shankar Balasubramanian and David Klenerman in 1998, and developed a sequencing method based on reversible dye-terminator technology and engineered polymerases. The reversible terminated chemistry concept was invented by Bruno Canard and Simon Sarfati at the Pasteur Institute in Paris. It was developed internally at Solexa by those named on the relevant patents. In 2004, Solexa acquired the company Manteia Predictive Medicine in order to gain a massively parallel sequencing technology invented in 1997 by Pascal Mayer and Laurent Farinelli. It is based on “DNA clusters” or “DNA colonies”, which involves the clonal amplification of DNA on a surface. The cluster technology was co-acquired with Lynx Therapeutics of California. Solexa Ltd. later merged with Lynx to form Solexa Inc.
In this method, DNA molecules and primers are first attached on a slide or flow cell and amplified with polymerase so that local clonal DNA colonies, later coined “DNA clusters”, are formed. To determine the sequence, four types of reversible terminator bases (RT-bases) are added and non-incorporated nucleotides are washed away. A camera takes images of the fluorescently labeled nucleotides. Then the dye, along with the terminal 3′ blocker, is chemically removed from the DNA, allowing for the next cycle to begin. Unlike pyrosequencing, the DNA chains are extended one nucleotide at a time and image acquisition can be performed at a delayed moment, allowing for very large arrays of DNA colonies to be captured by sequential images taken from a single camera.
Decoupling the enzymatic reaction and the image capture allows for optimal throughput and theoretically unlimited sequencing capacity. With an optimal configuration, the ultimately reachable instrument throughput is thus dictated solely by the analog-to-digital conversion rate of the camera, multiplied by the number of cameras and divided by the number of pixels per DNA colony required for visualizing them optimally (approximately 10 pixels/colony). As of 2012, with cameras operating at more than 10 MHz A/D conversion rates and the available optics, fluidics and enzymatics, throughput could reach multiples of 1 million nucleotides/second, corresponding roughly to 1 human genome equivalent at 1x coverage per hour per instrument, and 1 human genome re-sequenced (at approx. 30x) per day per instrument (equipped with a single camera).
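The arithmetic behind these figures can be checked with a short calculation. The camera rate (10 MHz) and the 10 pixels per colony come from the text; the genome size of 3.2 billion base pairs is an assumption added here for illustration, and the function name is hypothetical. Each imaged colony contributes one nucleotide per cycle, so colonies imaged per second equals nucleotides read per second.

```python
def nucleotides_per_second(adc_rate_hz, cameras=1, pixels_per_colony=10):
    """Upper bound on throughput: pixels digitized per second / pixels per colony."""
    return adc_rate_hz * cameras / pixels_per_colony


HUMAN_GENOME_BP = 3.2e9  # assumption: approximate haploid human genome size

rate = nucleotides_per_second(10e6)            # one camera at 10 MHz
hours_1x = HUMAN_GENOME_BP / rate / 3600       # time for 1x coverage
hours_30x = 30 * hours_1x                      # time for 30x coverage

print(f"{rate:.0f} nucleotides/second")
print(f"{hours_1x:.2f} hours for 1x, {hours_30x:.1f} hours for 30x")
```

At exactly 10 MHz the estimate is a little under one hour for 1x coverage and a little over one day for 30x, in line with the rough figures quoted above; a faster camera or additional cameras scale the rate proportionally.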
8.Ion Torrent semiconductor sequencing:-
Ion Torrent Systems Inc. (now owned by Life Technologies) developed a system based on using standard sequencing chemistry, but with a novel, semiconductor-based detection system. This method of sequencing is based on the detection of hydrogen ions that are released during the polymerisation of DNA, as opposed to the optical methods used in other sequencing systems. A microwell containing a template DNA strand to be sequenced is flooded with a single type of nucleotide. If the introduced nucleotide is complementary to the leading template nucleotide it is incorporated into the growing complementary strand. This causes the release of a hydrogen ion that triggers a hypersensitive ion sensor, which indicates that a reaction has occurred. If homopolymer repeats are present in the template sequence, multiple nucleotides will be incorporated in a single cycle. This leads to a corresponding number of released hydrogens and a proportionally higher electronic signal.
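The flow-by-flow behaviour described above, including the proportionally higher signal from homopolymer repeats, can be sketched as follows. This is an idealized, noise-free model; the flow order "TACG", the function names, and the example strand are assumptions for illustration, not the instrument's actual settings. The input is the strand being synthesized, i.e. the complement of the template.

```python
from itertools import cycle


def flowgram(growing_strand, flow_order="TACG"):
    """Return (nucleotide, signal) per flow; signal = bases incorporated."""
    unknown = set(growing_strand) - set(flow_order)
    if unknown:
        raise ValueError(f"bases never flowed: {sorted(unknown)}")
    signals = []
    pos = 0
    flows = cycle(flow_order)
    while pos < len(growing_strand):
        nuc = next(flows)
        run = 0
        # A homopolymer run is incorporated in a single flow, releasing
        # one hydrogen ion per base and so a proportionally larger signal.
        while pos < len(growing_strand) and growing_strand[pos] == nuc:
            run += 1
            pos += 1
        signals.append((nuc, run))
    return signals


def call_bases(signals):
    """Rebuild the sequence from idealized integer signals."""
    return "".join(nuc * run for nuc, run in signals)


signals = flowgram("TTACGGGA")
print(signals)
print(call_bases(signals))
```

A flow with a signal of zero simply means the flooded nucleotide was not complementary to the next template base. On a real instrument the signal is analog, so long homopolymer runs are harder to count exactly than this model suggests.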
9.DNA nanoball sequencing:-
DNA nanoball sequencing is a type of high-throughput sequencing technology used to determine the entire genomic sequence of an organism. The company Complete Genomics uses this technology to sequence samples submitted by independent researchers. The method uses rolling circle replication to amplify small fragments of genomic DNA into DNA nanoballs. Unchained sequencing by ligation is then used to determine the nucleotide sequence. This method of DNA sequencing allows large numbers of DNA nanoballs to be sequenced per run and at low reagent costs compared to other high-throughput sequencing platforms. However, only short sequences of DNA are determined from each DNA nanoball, which makes mapping the short reads to a reference genome difficult. This technology has been used for multiple genome sequencing projects and is scheduled to be used for more.
10.Single molecule real time (SMRT) sequencing:-
SMRT sequencing is based on the sequencing-by-synthesis approach. The DNA is synthesized in zero-mode waveguides (ZMWs), small well-like containers with the capturing tools located at the bottom of the well. The sequencing is performed with an unmodified polymerase (attached to the ZMW bottom) and fluorescently labelled nucleotides flowing freely in the solution. The wells are constructed so that only the fluorescence occurring at the bottom of the well is detected. The fluorescent label is detached from the nucleotide upon its incorporation into the DNA strand, leaving an unmodified DNA strand. According to Pacific Biosciences (PacBio), the SMRT technology developer, this methodology allows detection of nucleotide modifications (such as cytosine methylation) through the observation of polymerase kinetics. This approach allows reads of 20,000 nucleotides or more, with average read lengths of 5 kilobases. In 2015, Pacific Biosciences announced the launch of a new sequencing instrument called the Sequel System, with 1 million ZMWs compared to 150,000 ZMWs in the PacBio RS II instrument. SMRT sequencing is referred to as “third-generation” or “long-read” sequencing.