A newly synthesized RNA molecule is called a primary transcript. The most extensive processing of primary transcripts occurs in eukaryotic mRNAs and in tRNAs of both bacteria and eukaryotes. The primary transcript for a eukaryotic mRNA typically contains sequences encompassing one gene, although the sequences encoding the polypeptide may not be contiguous. Noncoding tracts that break up the coding region of the transcript are called introns, and the coding segments are called exons. In a process called splicing, the introns are removed from the primary transcript and the exons are joined to form a continuous sequence that specifies a functional polypeptide. Eukaryotic mRNAs are also modified at each end. A modified residue called a 5′ cap (p. 1008) is added at the 5′ end. The 3′ end is cleaved, and 80 to 250 A residues are added to create a poly(A) “tail.” The sometimes elaborate protein complexes that carry out each of these three mRNA-processing reactions do not operate independently. They appear to be organized in association with each other and with the phosphorylated CTD of Pol II; each complex affects the function of the others.
The 5′ cap (red) is added before synthesis of the primary transcript is complete. A noncoding sequence following the last exon is shown in orange. Splicing can occur either before or after the cleavage and polyadenylation steps. All the processes shown here take place within the nucleus.
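The joining of exons described above can be illustrated with a minimal Python sketch. The function name `splice`, the coordinate convention (half-open index pairs), and the toy sequence are all assumptions made for illustration; they do not come from the text and say nothing about how the cell locates introns.

```python
def splice(primary_transcript, introns):
    """Remove introns and join the remaining exons in order.

    `introns` is a list of (start, end) index pairs, half-open,
    so the intron occupies primary_transcript[start:end].
    """
    exons = []
    position = 0
    for start, end in sorted(introns):
        exons.append(primary_transcript[position:start])  # exon before this intron
        position = end                                    # skip over the intron
    exons.append(primary_transcript[position:])           # final exon
    return "".join(exons)


# Toy transcript (hypothetical): exons in upper case, the intron in lower case.
primary = "AUGGCC" + "guaaguuuuag" + "GAUUAA"
mature = splice(primary, [(6, 17)])
print(mature)
```

The sketch treats the transcript as a string and the introns as known coordinates; in the cell, the splice sites are defined by sequence signals and the splicing machinery discussed in the sections that follow.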
Eukaryotic mRNAs Are Capped at the 5′ End
Most eukaryotic mRNAs have a 5′ cap, a residue of 7-methylguanosine linked to the 5′-terminal residue of the mRNA through an unusual 5′,5′-triphosphate linkage. The 5′ cap helps protect mRNA from ribonucleases. The cap also binds to a specific cap-binding complex of proteins and participates in binding of the mRNA to the ribosome to initiate translation.
The 5′ cap is formed by condensation of a molecule of GTP with the triphosphate at the 5′ end of the transcript. The guanine is subsequently methylated at N-7, and additional methyl groups are often added at the 2′ hydroxyls of the first and second nucleotides adjacent to the cap. The methyl groups are derived from S-adenosylmethionine. All these reactions occur very early in transcription, after the first 20 to 30 nucleotides of the transcript have been added.
All three of the capping enzymes, and through them the 5′ end of the transcript itself, are associated with the RNA polymerase II CTD until the cap is synthesized. The capped 5′ end is then released from the capping enzymes and bound by the cap-binding complex.
RNA Catalyzes the Splicing of Introns
There are four classes of introns. The first two, the group I and group II introns, differ in the details of their splicing mechanisms but share one surprising characteristic: they are self-splicing, with no protein enzymes involved. Group I introns are found in some nuclear, mitochondrial, and chloroplast genes coding for rRNAs, mRNAs, and tRNAs. Group II introns are generally found in the primary transcripts of mitochondrial or chloroplast mRNAs in fungi, algae, and plants. Group I and group II introns are also found among the rarer examples of introns in bacteria. Neither class requires a high-energy cofactor (such as ATP) for splicing. The splicing mechanisms in both groups involve two transesterification reaction steps.
The group I splicing reaction requires a guanine nucleoside or nucleotide cofactor, but the cofactor is not used as a source of energy; instead, the 3′-hydroxyl group of guanosine is used as a nucleophile in the first step of the splicing pathway. The guanosine 3′-hydroxyl group forms a normal 3′,5′-phosphodiester bond with the 5′ end of the intron. The 3′ hydroxyl of the exon that is displaced in this step then acts as a nucleophile in a similar reaction at the 3′ end of the intron. The result is precise excision of the intron and ligation of the exons.
In group II introns the reaction pattern is similar except for the nucleophile in the first step, which in this case is the 2′-hydroxyl group of an A residue within the intron (Fig. 26–15). A branched lariat structure is formed as an intermediate.
Most introns are not self-splicing, and these types are not designated with a group number. The third and largest class of introns includes those found in nuclear mRNA primary transcripts. These are called spliceosomal introns, because their removal occurs within and is catalyzed by a large protein complex called a spliceosome. Within the spliceosome, the introns undergo splicing by the same lariat-forming mechanism as the group II introns. The spliceosome is made up of specialized RNA-protein complexes, small nuclear ribonucleoproteins (snRNPs, often pronounced “snurps”). Each snRNP contains one of a class of eukaryotic RNAs, 100 to 200 nucleotides long, known as small nuclear RNAs (snRNAs). Five snRNAs (U1, U2, U4, U5, and U6) involved in splicing reactions are generally found in abundance in eukaryotic nuclei. The RNAs and proteins in snRNPs are highly conserved in eukaryotes from yeasts to humans. Spliceosomal introns generally have the dinucleotide sequences GU and AG at the 5′ and 3′ ends, respectively, and these sequences mark the sites where splicing occurs. The U1 snRNA contains a sequence complementary to sequences near the 5′ splice site of nuclear mRNA introns, and the U1 snRNP binds to this region in the primary transcript. Addition of the U2, U4, U5, and U6 snRNPs leads to formation of the spliceosome. The snRNPs together contribute five RNAs and about 50 proteins to the spliceosome, a supramolecular assembly nearly as complex as the ribosome. ATP is required for assembly of the spliceosome, but the RNA cleavage-ligation reactions do not seem to require ATP. Some mRNA introns are spliced by a less common type of spliceosome, in which the U1 and U2 snRNPs are replaced by the U11 and U12 snRNPs. Whereas U1- and U2-containing spliceosomes remove introns with (5′)GU and AG(3′) terminal sequences, the U11- and U12-containing spliceosomes remove a rare class of introns that have (5′)AU and AC(3′) terminal sequences to mark the intronic splice sites.
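The terminal-dinucleotide rule described above can be expressed as a short Python sketch. The function name `classify_intron` and the example sequences are hypothetical, and the check uses only the two end dinucleotides mentioned in the text; it is an illustration of the rule, not a splice-site prediction method.

```python
def classify_intron(intron):
    """Classify an intron RNA sequence (written 5' to 3') by its terminal dinucleotides."""
    seq = intron.upper()
    five_prime_end, three_prime_end = seq[:2], seq[-2:]
    if (five_prime_end, three_prime_end) == ("GU", "AG"):
        return "U1/U2-type"    # (5')GU ... AG(3') termini
    if (five_prime_end, three_prime_end) == ("AU", "AC"):
        return "U11/U12-type"  # (5')AU ... AC(3') termini
    return "unclassified"


# Hypothetical intron sequences, chosen only to show each outcome.
print(classify_intron("GUAAGUCCCUUUCAG"))
print(classify_intron("AUAUCCUUUUCCAC"))
print(classify_intron("CCAAGUCCCUUUCGG"))
```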
The spliceosomes used in nuclear RNA splicing may have evolved from more ancient group II introns, with the snRNPs replacing the catalytic domains of their self-splicing ancestors.
The U1 snRNA has a sequence near its 5′ end that is complementary to the splice site at the 5′ end of the intron. Base pairing of U1 to this region of the primary transcript helps define the 5′ splice site during spliceosome assembly. U2 is paired to the intron at a position encompassing the A residue that becomes the nucleophile during the splicing reaction. Base pairing of U2 snRNA causes a bulge that displaces and helps to activate the adenylate, whose 2′ OH will form the lariat structure through a 2′,5′-phosphodiester bond. The U1 and U2 snRNPs bind, then the remaining snRNPs (the U4/U6 complex and U5) bind to form an inactive spliceosome. Internal rearrangements convert this species to an active spliceosome in which U1 and U4 have been expelled and U6 is paired with both the 5′ splice site and U2. This is followed by the catalytic steps, which parallel those of the splicing of group II introns.
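The idea of complementarity between an snRNA and a splice site can be made concrete with a small Python sketch that counts Watson-Crick pairs between two antiparallel RNA strands. The function name `count_base_pairs` and both sequences are assumptions chosen for illustration (the pair was constructed to be fully complementary and is not taken from the text); G-U wobble pairs and modified bases are ignored.

```python
# Watson-Crick pairs only; wobble pairing is deliberately left out of this sketch.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def count_base_pairs(snrna_5to3, target_5to3):
    """Count Watson-Crick pairs when two equal-length RNAs align antiparallel.

    The snRNA is reversed so that its 3' end faces the 5' end of the target.
    """
    return sum((a, b) in PAIRS for a, b in zip(snrna_5to3[::-1], target_5to3))


# Hypothetical sequences, both written 5' to 3'.
snrna_segment = "AUACUUACCUG"
splice_site = "CAGGUAAGUAU"
mutant_site = "CAGGUCAGUAU"  # one base changed relative to splice_site

print(count_base_pairs(snrna_segment, splice_site))
print(count_base_pairs(snrna_segment, mutant_site))
```

A lower pair count for the altered site illustrates why changes near the 5′ splice site can weaken recognition by a complementary snRNA.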
The fourth class of introns, found in certain tRNAs, is distinguished from the group I and II introns in that the splicing reaction requires ATP and an endonuclease. The splicing endonuclease cleaves the phosphodiester bonds at both ends of the intron, and the two exons are joined by a mechanism similar to the DNA ligase reaction.
Mechanism of the DNA ligase reaction. In each of the three steps, one phosphodiester bond is formed at the expense of another. Steps 1 and 2 lead to activation of the phosphate in the nick. An AMP group is transferred first to a Lys residue on the enzyme and then to the phosphate in the nick. In step 3, the 3′-hydroxyl group attacks this phosphate and displaces AMP, producing a phosphodiester bond to seal the nick. In the E. coli DNA ligase reaction, AMP is derived from NAD. The DNA ligases isolated from a number of viral and eukaryotic sources use ATP rather than NAD, and they release pyrophosphate rather than nicotinamide mononucleotide (NMN) in step
Eukaryotic mRNAs Have a Distinctive 3′ End Structure
At their 3′ end, most eukaryotic mRNAs have a string of 80 to 250 A residues, making up the poly(A) tail. This tail serves as a binding site for one or more specific proteins. The poly(A) tail and its associated proteins probably help protect mRNA from enzymatic destruction. Many prokaryotic mRNAs also acquire poly(A) tails, but these tails stimulate decay of mRNA rather than protecting it from degradation.
Pol II synthesizes RNA beyond the segment of the transcript containing the cleavage signal sequences, including the highly conserved upstream sequence (5′)AAUAAA. (1) The cleavage signal sequence is bound by an enzyme complex that includes an endonuclease, a polyadenylate polymerase, and several other multisubunit proteins involved in sequence recognition, stimulation of cleavage, and regulation of the length of the poly(A) tail. (2) The RNA is cleaved by the endonuclease at a point 10 to 30 nucleotides 3′ to (downstream of) the sequence AAUAAA. (3) The polyadenylate polymerase synthesizes a poly(A) tail 80 to 250 nucleotides long, beginning at the cleavage site.
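The cleavage and polyadenylation steps above can be sketched in Python as string operations. The function name `cleave_and_polyadenylate`, its parameters, and the example transcript are assumptions made for illustration. The sketch measures the offset from the 3′ end of the first AAUAAA it finds and takes the cleavage distance and tail length as fixed inputs, whereas in the cell these are set by the protein complex.

```python
def cleave_and_polyadenylate(transcript, offset=20, tail_length=200):
    """Cleave downstream of the first AAUAAA signal and append a poly(A) tail.

    offset:      nucleotides between the end of AAUAAA and the cleavage site (10 to 30)
    tail_length: number of A residues added (80 to 250)
    Returns None if no signal is found or the transcript is too short to cleave.
    """
    signal = "AAUAAA"
    if not 10 <= offset <= 30:
        raise ValueError("offset must be between 10 and 30 nucleotides")
    if not 80 <= tail_length <= 250:
        raise ValueError("tail_length must be between 80 and 250 residues")

    index = transcript.find(signal)
    if index == -1:
        return None  # no cleavage signal in this transcript
    cleavage_site = index + len(signal) + offset
    if cleavage_site > len(transcript):
        return None  # transcript ends before the cleavage point
    # Discard everything 3' of the cleavage site, then add the tail.
    return transcript[:cleavage_site] + "A" * tail_length


# Hypothetical transcript: a short body, the signal, then 40 nt of "extra RNA".
example = "GGC" + "AAUAAA" + "C" * 40
processed = cleave_and_polyadenylate(example, offset=20, tail_length=100)
print(len(processed))
```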
Overview of the processing of a eukaryotic mRNA. The ovalbumin gene, shown here, has introns A to G and exons 1 to 7 and L (L encodes a signal peptide sequence that targets the protein for export from the cell). About three-quarters of the RNA is removed during processing. Pol II extends the primary transcript well beyond the cleavage and polyadenylation site (“extra RNA”) before terminating transcription. Termination signals for Pol II have not yet been defined.
Processing of pre-rRNA transcripts in bacteria
Before cleavage, the 30S RNA precursor is methylated at specific bases. The cleavage liberates precursors of rRNAs and tRNA(s). Cleavage at the points labeled 1, 2, and 3 is carried out by the enzymes RNase III, RNase P, and RNase E, respectively. RNase P is a ribozyme. The final 16S, 23S, and 5S rRNA products result from the action of a variety of specific nucleases. The seven copies of the gene for pre-rRNA in the E. coli chromosome differ in the number, location, and identity of tRNAs included in the primary transcript. Some copies of the gene have additional tRNA gene segments between the 16S and 23S rRNA segments and at the far 3′ end of the primary transcript.
Processing of tRNAs in bacteria and eukaryotes
The yeast tRNAᵀʸʳ (the tRNA specific for tyrosine binding) is used to illustrate the important steps. The nucleotide sequences shown in yellow are removed from the primary transcript. The ends are processed first, the 5′ end before the 3′ end. CCA is then added to the 3′ end, a necessary step in processing eukaryotic tRNAs and those bacterial tRNAs that lack this sequence in the primary transcript. While the ends are being processed, specific bases in the rest of the transcript are modified. For the eukaryotic tRNA shown here, the final step is splicing of the 14-nucleotide intron. Introns are found in some eukaryotic tRNAs but not in bacterial tRNAs.