Codon bias in the β-lactam producer Acremonium chrysogenum
Kerstin Jekosch and Ulrich Kück, Lehrstuhl für Allgemeine und Molekulare Botanik, D-44780 Bochum, Germany
Fungal Genetics Newsletter 46:11-13
*Table 2 in the print version of this article was mis-formatted and the correct version is available here*
In this paper we compile the G/C content, the codon bias, and consensus sequences for the translation initiation and the intron splicing sites from 19 nuclear genes of the major β-lactam antibiotic producer Acremonium chrysogenum. Our data are compared with those from other filamentous fungi, such as Aspergillus nidulans, Neurospora crassa, and Sordaria macrospora.
Genes used for the present analysis are 1) cystathionine beta-synthase (Acc # E08842), 2) homoserine O-acetyltransferase (Acc # E08840), 3) cystathionine gamma-lyase (Acc # E08276), 4) CPC biosynthesis related gamma gene (Acc # E06692), 5) CPC biosynthesis related alpha gene (Acc # E06691), 6) alkaline protease, alp (Acc # D00923), 7) glyceraldehyde-3-phosphate dehydrogenase (Acc # E03375), 8) actin (Acc # E03374), 9) phosphoglycerate kinase (Acc # E03373), 10) beta-isopropylmalate dehydrogenase, leu2 (Acc # E01906), 11) orotidine 5´-phosphate decarboxylase, pyr4 (Acc # X15937), 12) δ-(L-α-aminoadipyl)-L-cysteinyl-D-valine synthetase (Acc # M33522), 13) isopenicillin N synthase (Acc # S39881), 14) desacetoxycephalosporin C synthetase/hydroxylase, pcbAB (Gutiérrez et al. 1991, J. Bacteriol. 173:2354-2365), 15) desacetylcephalosporin C acetyltransferase, cefG (Acc # M91649), 16) esterase C, estC (Japan patent: Matsuda et al. 1992, Hei 4-144688), 17) beta-tubulin, βtub (Acc # X72789), 18) transcriptional repressor CREA, creA (unpublished), 19) transcription factor CPCR1, cpcR1 (Acc # AJ132014). Genes 12 to 16 and 18 to 19 are involved in β-lactam biosynthesis. The modes of action of genes 4 and 5 are not yet known.
The average G/C content is 61.5 ± 2.3%, which is higher than in A. nidulans (about 50%, Lloyd and Sharp 1991, Mol. Gen. Genet. 230:288-294), N. crassa (54.1%, Edelmann and Staben 1994, Exp. Mycol. 18:70-81), and S. macrospora (56.7%, Pöggeler 1997, Fungal Genet. Newsl. 44:41-44).
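As an illustration of how such a figure can be obtained, the following minimal sketch computes the G/C percentage of each coding sequence and then the mean and standard deviation across genes. The sequences shown are hypothetical toy strings, not the genes analysed here, and reading the "±" value as a sample standard deviation is an assumption.

```python
from statistics import mean, stdev


def gc_content(seq: str) -> float:
    """Return the percentage of G and C among the A/C/G/T bases of a sequence."""
    bases = [b for b in seq.upper() if b in "ACGT"]  # ignore N, gaps, whitespace
    if not bases:
        raise ValueError("sequence contains no A/C/G/T bases")
    gc = sum(b in "GC" for b in bases)
    return 100.0 * gc / len(bases)


# Hypothetical toy sequences, used only to show the calculation.
toy_genes = {
    "geneA": "ATGGCCGCCAAGTAA",
    "geneB": "ATGGCTTTCGACTGA",
}

values = [gc_content(s) for s in toy_genes.values()]
# Assumption: the "±" term is the sample standard deviation across genes.
print(f"{mean(values):.1f} ± {stdev(values):.1f} %")
```

Non-A/C/G/T characters are skipped so that ambiguous bases in database entries do not distort the percentage.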
As in other fungi, we found a bias for codons with a C in the third position (Table 1). While the termination codon TAA is preferred in N. crassa and S. macrospora, there is no such preference in A. nidulans and only a slight bias towards this codon in A. chrysogenum. The six least used codons in A. chrysogenum are TTA (Leu), TGT (Cys), AGA (Arg), ATA (Ile), GTA (Val) and AAA (Lys). A. chrysogenum shares its rarely used and its preferred codons with the other three filamentous fungi. The codons for serine are an exception: there is no significant preference for any one of the six possible triplets.
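The percent-usage column of Table 1 and the third-position bias can be illustrated with a short sketch. The function names are hypothetical, and the counts for lysine and for the termination codons are taken from Table 1. Recomputing other rows this way may not reproduce every printed percentage exactly.

```python
from collections import Counter


def split_codons(cds: str) -> list[str]:
    """Split a coding sequence into complete triplets, dropping any remainder."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return [cds[i:i + 3] for i in range(0, usable, 3)]


def third_position_freq(cds: str) -> dict[str, float]:
    """Fraction of codons ending in each base (A, C, G, T)."""
    counts = Counter(codon[2] for codon in split_codons(cds))
    total = sum(counts.values())
    return {base: counts[base] / total for base in "ACGT"}


def percent_usage(counts: dict[str, int]) -> dict[str, int]:
    """Percent usage of each synonymous codon within one amino acid."""
    total = sum(counts.values())
    return {codon: round(100 * n / total) for codon, n in counts.items()}


# Counts (N) from Table 1.
lys = {"AAA": 48, "AAG": 362}
stop = {"TAA": 8, "TAG": 5, "TGA": 6}

print(percent_usage(lys))   # {'AAA': 12, 'AAG': 88}
print(percent_usage(stop))  # {'TAA': 42, 'TAG': 26, 'TGA': 32}
```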
The cephalosporin C biosynthetic genes show different preferences for individual codons, which correlate with their level of expression as concluded from our own data (unpublished). The highly expressed pcbC and cefEF genes contain only 15 and 17% low-usage codons, respectively, whereas genes with a low transcript level, such as pcbAB, cefG and estC, contain 27 to 53% low-usage codons. This correlation between the occurrence of rare codons and expression has already been reported for other organisms, e.g. Saccharomyces cerevisiae, Escherichia coli, Drosophila melanogaster (Zhang et al. 1991, Gene 105:61-67), S. macrospora (Pöggeler 1997) and A. nidulans (Lloyd and Sharp 1991, Mol. Gen. Genet. 230:288-294), and is thought to affect translation rates.
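The share of low-usage codons in a gene can be computed as sketched below. The criterion used to class a codon as "low usage" is not stated in the text, so the set used here (the six least used codons listed above) is only an assumption for illustration, and the example sequence is a hypothetical toy string.

```python
# Assumption: the six least used codons stand in for the "low-usage" set;
# the threshold actually applied in the analysis is not given.
RARE_CODONS = frozenset({"TTA", "TGT", "AGA", "ATA", "GTA", "AAA"})


def rare_codon_percent(cds: str, rare: frozenset = RARE_CODONS) -> float:
    """Percentage of codons in a coding sequence that belong to the rare set."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    triplets = [cds[i:i + 3] for i in range(0, usable, 3)]
    if not triplets:
        raise ValueError("sequence is shorter than one codon")
    return 100.0 * sum(t in rare for t in triplets) / len(triplets)


# Hypothetical toy sequence: ATG AAA GCC TTA TAA, two of five codons are rare.
print(rare_codon_percent("ATGAAAGCCTTATAA"))  # 40.0
```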
The consensus sequence of the translation initiation context in A. chrysogenum does not differ significantly from those of S. macrospora and N. crassa. It should be noted that the open reading frame of the pcbAB gene starts with a GTG. The translation start point of the estC gene has not yet been identified. Following the ATG start codon, all three fungi show a preference for GCN, coding for alanine.
The average intron length in A. chrysogenum is 95 nt, which agrees well with the average intron length in S. macrospora (88 nt) but contrasts with that in N. crassa (63 nt). The distance between the branch site and the 3´-splice site averages 14 nt and ranges from 8 to 30 nt. In S. macrospora and N. crassa this distance varies from 12 to 22 nt and from 14 to 30 nt, respectively. Similarly, the intron donor and acceptor consensus sequences show no significant deviation from those of the other fungi (Table 2). One exception is the cefG gene, the only β-lactam biosynthesis gene containing introns, whose intron regulatory sequences differ strikingly from the consensus (5´ intron donor G^TAGGTA and C^GGTGAG, 3´ intron acceptor AAAGT^ and TGCTA^).
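A position-wise consensus with percentages, in the notation used in Table 2 (e.g. G95), can be derived from aligned splice-site sequences roughly as follows. This is a minimal sketch: the function name and the toy donor sites are hypothetical, and the rule by which a position is reported as N in Table 2 is not stated and is therefore not modelled.

```python
from collections import Counter


def consensus_with_percent(sites: list[str]) -> list[str]:
    """Return the most frequent base per column with its percentage, e.g. 'G95'."""
    if not sites:
        raise ValueError("no sequences given")
    if len({len(s) for s in sites}) != 1:
        raise ValueError("all sequences must have the same length")
    result = []
    for column in zip(*(s.upper() for s in sites)):
        base, n = Counter(column).most_common(1)[0]  # ties: first base seen wins
        result.append(f"{base}{round(100 * n / len(sites))}")
    return result


# Hypothetical aligned 5' donor sites (intron positions +1 to +6).
toy_donors = ["GTAAGT", "GTAAGT", "GTGAGT", "GTACGT"]
print(" ".join(consensus_with_percent(toy_donors)))
# G100 T100 A75 A75 G100 T100
```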
Acknowledgments: This work was supported by a grant from Hoechst Marion Roussel (Frankfurt a.M., Germany).
Table 1: Analysis of codon usage on the basis of 19 nuclear gene sequences. The codon usage values are given as whole numbers (N), as percent usage determined for each amino acid (%), and as relative synonymous codon usage (RSCU).
Codon | Amino acid | N | % | RSCU | Codon | Amino acid | N | % | RSCU
GCT | Ala | 154 | 16 | 1.48 | CTC | Leu | 394 | 41 | 3.78
GCC | Ala | 466 | 52 | 4.47 | CTA | Leu | 58 | 6 | 0.56
GCA | Ala | 101 | 11 | 0.97 | CTG | Leu | 313 | 32 | 3.00
GCG | Ala | 178 | 20 | 1.71 | AAA | Lys | 48 | 12 | 0.46
CGT | Arg | 71 | 12 | 0.68 | AAG | Lys | 362 | 88 | 3.47
CGC | Arg | 248 | 41 | 2.38 | ATG | Met | 187 | 100 | 1.79
CGA | Arg | 58 | 10 | 0.56 | TTT | Phe | 81 | 22 | 0.78
CGG | Arg | 122 | 20 | 1.17 | TTC | Phe | 280 | 78 | 2.69
AGA | Arg | 41 | 7 | 0.39 | CCT | Pro | 103 | 17 | 0.99
AGG | Arg | 109 | 18 | 1.05 | CCC | Pro | 252 | 41 | 2.42
AAT | Asn | 84 | 22 | 0.81 | CCA | Pro | 74 | 12 | 0.71
AAC | Asn | 301 | 78 | 2.89 | CCG | Pro | 186 | 30 | 1.78
GAT | Asp | 163 | 28 | 1.56 | TCT | Ser | 82 | 10 | 0.79
GAC | Asp | 428 | 72 | 4.11 | TCC | Ser | 209 | 25 | 2.01
TGT | Cys | 21 | 18 | 0.20 | TCA | Ser | 86 | 10 | 0.83
TGC | Cys | 99 | 82 | 0.95 | TCG | Ser | 189 | 22 | 1.81
CAA | Gln | 69 | 18 | 0.66 | AGT | Ser | 75 | 9 | 0.72
CAG | Gln | 321 | 82 | 3.08 | AGC | Ser | 210 | 25 | 2.02
GAA | Glu | 99 | 17 | 0.95 | ACT | Thr | 80 | 12 | 0.77
GAG | Glu | 493 | 83 | 4.73 | ACC | Thr | 283 | 44 | 2.72
GGT | Gly | 163 | 21 | 1.56 | ACA | Thr | 94 | 15 | 0.90
GGC | Gly | 414 | 53 | 3.97 | ACG | Thr | 190 | 29 | 1.82
GGA | Gly | 94 | 12 | 0.90 | TGG | Trp | 124 | 100 | 1.19
GGG | Gly | 113 | 14 | 1.08 | TAT | Tyr | 64 | 19 | 0.61
CAT | His | 89 | 28 | 0.85 | TAC | Tyr | 259 | 81 | 2.49
CAC | His | 234 | 72 | 2.25 | GTT | Val | 101 | 14 | 0.97
ATT | Ile | 95 | 19 | 0.91 | GTC | Val | 380 | 52 | 3.65
ATC | Ile | 363 | 72 | 3.48 | GTA | Val | 48 | 7 | 0.46
ATA | Ile | 46 | 9 | 0.44 | GTG | Val | 202 | 28 | 1.94
TTA | Leu | 9 | 1 | 0.09 | TAA | Term. | 8 | 42 | 2.54
TTG | Leu | 82 | 8 | 0.79 | TAG | Term. | 5 | 26 | 1.57
CTT | Leu | 114 | 12 | 1.09 | TGA | Term. | 6 | 32 | 1.94
Table 2: A. chrysogenum (Ac) consensus sequences compared with those of Sordaria macrospora (Sm) and Neurospora crassa (Nc). Abbreviations: org. = organism; the number appended to each nucleotide indicates its percentage of occurrence.
org. translation initiation context
Ac C44 A39 A39 C67 N N C61 A61 A94 T100 G95 G53 C53 A33 C39 C39
Sm A38 C50 G50 C75 C50 N C63 A88 A63 A100 T100 G100 C50 G38 A38 T35
Nc A C N N N C A A A T G G C C
consensus 5´ intron donor
Ac G41 ^ G95 T91 A64 A68 G91 T64
Sm G67 ^ G100 T100 A72 A61 G83 T72
Nc G ^ G T A A G T
consensus branch site
Ac A50 G64 C100 T100 A95 C64 C41 G45
Sm A56 A56 C100 T100 A94 C78 N G44 G33
Nc A A C C T A C G G A
consensus 3´ intron acceptor
Ac A32 G36 C55 A91 G91 T32 A41 T27
Sm G33 A39 C56 A100 G100 A27 T39 T44
Nc A A T A G T T C