In this section, information is gathered on each established nuclear gene and on chromosome features such as centromeres, tips,
and the nucleolus organizer. Entries are arranged alphabetically by symbol. Some categories of mutants are prefaced
by a generic entry that concerns the entire group (e.g., al, arg, het, mat, mus, rec). Synonyms are cross-referenced to the
preferred symbol. Abbreviations are defined in Table 1. Synonymous gene symbols are listed in Table 2.
Within each entry for a gene locus, the name follows the symbol. The linkage group is then indicated
and the arm is specified, if known. This is followed by linkage data relating the gene to other loci. If the gene has
been cloned and sequenced, accession numbers are given for the sequence. The pSV50 cosmid library number or other reference
number of clones containing the sequence may also be given. The phenotype is described in a subsequent paragraph, including such
attributes as enzyme deficiency (including EC number), dominance, interaction with other genes, fertility, and stability. The
described phenotype is that of mutant strains unless stated otherwise. Scoring information, technical applications, and
alternate names may be given.
At the time the previous compendium was prepared, a gene could not be recognized or mapped until a phenotypically recognizable mutant
or variant had been discovered. In contrast, genes can now be recognized and defined solely on the basis of DNA sequence, without
knowledge of the phenotype. Some of the genes discovered in this way have no obvious null phenotype. For many others, the
null phenotype is unknown. The gene name may then be based on the inferred gene product or the known function of orthologues
or homologues in other organisms. Making the name explicit in such a situation would ideally require addition of the
suffix "-like" (for example, ras-like), but for brevity the suffix is usually omitted, just as the suffix "-requirement" is
omitted, although it is implied, when auxotrophic mutants are named. Similarly, qualifying adjectives have often been omitted.
When the function of a gene is inferred from sequence homology, "structural gene" usually implies "putative structural gene."
Likewise, "encodes" is used in place of "inferred to encode." A gene symbol without a superscript is used either to designate
a locus or to symbolize a mutant allele at that locus. Which usage applies in a particular instance should be obvious from
the context.
In the paragraph on linkage, recombination values may be given that originated from more than one cross. Differences between these
values may be due to sampling error (sample size is usually less than 100). The differences may also reflect differences
in trans-acting rec genes, which are known to regulate recombination in specific local regions. Values given as percentages
or as numerical fractions indicate the frequency of reciprocal recombination per meiotic product, unless it is indicated
explicitly that the value is for reciprocal crossing-over events per tetrad. (The number of crossing-over events per
tetrad is double that of crossovers per random meiotic product for the same interval.)
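The relation stated in the parenthetical note can be checked with a short calculation. The following is a minimal sketch, not part of the compendium's own methods. The function names are hypothetical. It assumes that each crossover event recombines two of the four chromatids and that the interval is short enough for multiple exchanges to be ignored.

```python
def events_per_tetrad_to_product_frequency(events_per_tetrad: float) -> float:
    """Convert crossover events per tetrad to recombinant frequency per meiotic product.

    Assumption: each event recombines two of the four chromatids, so only
    half of the products of a tetrad with one exchange are recombinant.
    """
    return events_per_tetrad / 2.0


def product_frequency_to_events_per_tetrad(product_frequency: float) -> float:
    """Inverse conversion: recombinant frequency per product to events per tetrad."""
    return product_frequency * 2.0


# Hypothetical interval with 0.10 crossover events per tetrad.
print(events_per_tetrad_to_product_frequency(0.10))  # 0.05
```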
When gene order is based on duplication coverage or on three-point crosses, the sequence of loci is often known with
confidence even though interval lengths are unknown or imprecisely known. Where linkage information is given
in the format, "Locus c is between a, b and d, e," the word "and" separates markers to the left of c from markers to the right.
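For readers who wish to process such statements mechanically, the format can be split as sketched below. This is an illustration only: the function name and the returned dictionary keys are hypothetical, and the pattern assumes the exact wording "Locus c is between a, b and d, e" with markers separated by commas.

```python
import re


def _split_markers(text: str) -> list[str]:
    """Split a comma-separated marker list and drop empty items."""
    return [item.strip() for item in text.split(",") if item.strip()]


def parse_between(statement: str) -> dict:
    """Split a 'Locus c is between a, b and d, e' statement into its parts.

    The word 'and' separates markers to the left of the locus from
    markers to the right.
    """
    match = re.fullmatch(
        r"\s*Locus (\S+) is between (.+?) and (.+?)[.,]?\s*", statement
    )
    if match is None:
        raise ValueError(f"unrecognized linkage statement: {statement!r}")
    locus, left, right = match.groups()
    return {
        "locus": locus,
        "left": _split_markers(left),
        "right": _split_markers(right),
    }


print(parse_between("Locus c is between a, b and d, e,"))
```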
RFLP data from two widely used crosses are tabulated in ref. (1452) and are reproduced here as Appendix 3. These come
either from 18 tetrads (the first cross) or from 18 random isolates (the second cross). Data from these crosses
have been extremely valuable for assigning cloned sequences to a linkage group and for indicating their approximate
location. The two RFLP-marked crosses were not intended to provide information for precise mapping. Because
of the small numbers, estimates of distance are imprecise and gene order is often uncertain.
Results from the first RFLP-scored cross are presented here as numbers of asci of the three types, PD (parental ditype),
NPD (nonparental ditype), and T (tetratype), using the format: T, NPD/total. (The total number is
PD + NPD + T.) Because each crossover event in an ascus recombines only two of the four chromatids, the frequency of
crossover meiotic products (on which map distance is based) is half the tetratype frequency when there are no NPDs.
NPDs are produced when four-strand double exchanges occur in an interval between markers. If NPDs are present,
the best estimate of crossover frequency in the interval is obtained by multiplying the number of NPDs by 6 and
adding this to the number of T's before dividing by 2. (The multiplication compensates for cryptic multiple crossovers
that result in T or PD asci rather than NPD.) A significant excess of PD asci over NPD asci constitutes evidence
of linkage [see refs. (645) and (1547)].
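The calculation described above can be written out as a short worked example. This is a sketch under stated assumptions: the function names are hypothetical, the counts in the example are invented for illustration, and the final division by the total number of asci (to turn the corrected count into a frequency) is our reading of the text rather than a quoted formula.

```python
def parse_tetrad_entry(entry: str) -> tuple[int, int, int]:
    """Parse an entry in the format 'T, NPD/total' into (T, NPD, total)."""
    counts, total = entry.split("/")
    t, npd = (int(value) for value in counts.split(","))
    return t, npd, int(total)


def crossover_frequency(t: int, npd: int, total: int) -> float:
    """Estimate the frequency of crossover products in an interval.

    Each NPD is multiplied by 6 and added to the number of T asci, the sum
    is divided by 2, and the result is divided by the total number of asci
    (PD + NPD + T). With no NPDs this is half the tetratype frequency.
    """
    if total <= 0 or t < 0 or npd < 0 or t + npd > total:
        raise ValueError("counts are inconsistent with the total")
    return (t + 6 * npd) / (2 * total)


# Invented example: 4 tetratypes and no NPDs among 18 asci.
t, npd, total = parse_tetrad_entry("4, 0/18")
print(f"{100 * crossover_frequency(t, npd, total):.1f}% crossover products")  # 11.1%
```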
Results from the second RFLP cross, which involves random ascospore isolates, are given as a numerical fraction, crossover products/
total.
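A fraction in this format can be converted to a percentage as sketched below. The function name is hypothetical and the example counts are invented. The standard error is the usual binomial approximation, which is an addition for illustration and is not taken from the compendium; it shows how imprecise an estimate from a small number of isolates can be.

```python
import math


def recombination_from_fraction(entry: str) -> tuple[float, float]:
    """Return (frequency, approximate standard error) for 'crossovers/total'.

    The standard error uses the binomial approximation sqrt(p * (1 - p) / n),
    which is an assumption of this sketch.
    """
    recombinants, total = (int(value) for value in entry.split("/"))
    if total <= 0 or not 0 <= recombinants <= total:
        raise ValueError(f"invalid fraction: {entry!r}")
    frequency = recombinants / total
    standard_error = math.sqrt(frequency * (1 - frequency) / total)
    return frequency, standard_error


# Invented example: 3 crossover products among 18 random isolates.
frequency, standard_error = recombination_from_fraction("3/18")
print(f"{100 * frequency:.1f}% (approximate s.e. {100 * standard_error:.1f}%)")
```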
In deriving recombination estimates from the RFLP data of both crosses, we have omitted from the calculation any
NPD tetrads (cross 1) or double-crossover progeny (cross 2) in intervals that contain few or no single crossovers.
In making the omission, we assume that the apparent double crossovers are spurious, having resulted from experimental or
recording errors or from gene conversion, rather than from bona fide reciprocal crossing over. Such errors or conversions
appear to be abundant in the existing RFLP data.
Rearrangement breakpoints define loci that have often been crucial for determining gene order. Rearrangements have not
been given separate entries in the main section, but breakpoints that are well defined are included in lists of
mapped loci and are shown on the genetic maps (Appendix 2). Most of the breakpoints listed are from insertional or
quasiterminal rearrangements, which have been widely used to establish gene order and to map chromosome tips and centromeres.
Rearrangements with breakpoints that are located less precisely are not included in the lists in Appendix 2 but are described
in Appendix 4, which provides full information on all characterized rearrangements, whether mapped precisely or mapped only
to linkage group.
Rearrangement symbols have been abbreviated in the locus entries, lists, and maps. For example, In(IR;IL)OY323 is shortened
to In(OY323). Multiple breakpoints in the same chromosome are distinguished using superscripts, for example, In(OY323)^L
and In(OY323)^R. Rearrangement symbols are written in full in refs. (700) and (1578).
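The abbreviation rule can be illustrated with a small helper. This is a sketch only: the function name is hypothetical, and the pattern assumes a full symbol of the form shown in the example, that is, a rearrangement type, a parenthesized list of chromosome arms, and an isolation number with no embedded spaces. Full symbols of other forms are not handled.

```python
import re

# Rearrangement type, parenthesized arms, then the isolation number.
_FULL_SYMBOL = re.compile(r"^([A-Za-z]+)\([^()]*\)(\S+)$")


def abbreviate_rearrangement(full_symbol: str) -> str:
    """Shorten a full symbol such as 'In(IR;IL)OY323' to 'In(OY323)'."""
    match = _FULL_SYMBOL.match(full_symbol.strip())
    if match is None:
        raise ValueError(f"unrecognized rearrangement symbol: {full_symbol!r}")
    kind, isolation_number = match.groups()
    return f"{kind}({isolation_number})"


print(abbreviate_rearrangement("In(IR;IL)OY323"))  # In(OY323)
```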
With a few exceptions, entries are not provided for sequences of unknown function ("anonymous" loci), even if they
have been RFLP-mapped, nor are these loci listed with the maps in Appendix 2. RFLP linkage data are given in
Appendix 3. Gene loci recognized on the basis of EST sequences are given in Appendix 5; these are named and included
in the entries only if strong inference exists for homology with genes of known function in other organisms.
The symbol Tel designates a locus determined by RFLP mapping of a telomeric DNA sequence (1810). Tip is used to
designate a linkage-group terminus inferred to lie beyond the most distal genetic marker. A tip may be marked by the
breakpoint of a quasiterminal chromosome rearrangement.
In addition to gene loci, rearrangement breakpoints, and chromosome landmarks, entries are included that describe transposable
elements, both active (Tad) and degenerate (e.g., Pogo, Punt, Guest). Entries also are included for genes from other
organisms that have been integrated into the Neurospora chromosomes. Examples are amdS, ble, hph, and tk.
Allele numbers have been given in the entries for mutant genes only when there is some question about allelism, when two
alleles differ significantly in phenotype, or when a mutant has been referred to previously only by allele number.
Allele numbers of mutant strains can be found in FGSC listings or in the cited references. Reference (103) provides a
table relating allele numbers to the locus symbols that were first assigned in 1954. When a particular mutant gene
has not been definitely assigned to a locus, it may be symbolized temporarily by an appropriate base symbol followed by the
allele number in parentheses, as met(T70), mo(D301), or un(74E).
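A temporary symbol of this kind can be separated into its base symbol and allele number as sketched below. The function name is hypothetical, and the pattern is an assumption based only on the three examples given (met(T70), mo(D301), un(74E)).

```python
import re


def parse_temporary_symbol(symbol: str) -> tuple[str, str]:
    """Split a temporary symbol such as 'met(T70)' into (base symbol, allele number)."""
    match = re.fullmatch(r"([A-Za-z][A-Za-z0-9-]*)\(([^()]+)\)", symbol.strip())
    if match is None:
        raise ValueError(f"not a temporary symbol: {symbol!r}")
    return match.group(1), match.group(2)


for example in ("met(T70)", "mo(D301)", "un(74E)"):
    print(parse_temporary_symbol(example))
```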
For map location and linkage analysis, we have cited mainly the most definitive data sources that establish the location of
a gene relative to its immediate neighbors. Earlier references that established linkage originally or that are
less precise have usually been omitted. Although they represent tremendous effort, the linkage data in these
publications now are of mainly historic interest. The older data may be found in the 1982 compendium (1596) or in refs.
(103), (321), (322), (426), (907), (1255), (1369), (1383), (1548), (1585), (1587), (1592), (1593), (1603), (1604), (1928),
(1986), and (2014).
Original sources of RFLP data are not identified by Nelson et al. (1447), nor do we do so. The source usually is one of the
publications referenced elsewhere in the same entry.
Not all of the important references can be cited in the entries, especially for loci that have been studied extensively.
References for the cloning and sequencing of genes are usually not cited because they can be obtained on-line from the
sequence databases. The publications that are cited should lead the reader to other significant literature.
Theses and abstracts have been cited only if they are known to contain pertinent information that has not been documented
adequately in a published reference. Unpublished sources are cited by number and are identified as such in the
References section, where they are listed alphabetically by name of the contributor and are interspersed with the
cited publications.
TABLE 1
Abbreviations
3-AT | 3-Amino-1,2,4-triazole (aminotriazole) |
bp | Base pair(s) |
BT | Beadle and Tatum wild type or origin (1471) |
cAMP | Cyclic adenosine monophosphate (cyclic AMP) |
cDNA | Complementary DNA |
DEAE | Diethylaminoethyl (-cellulose) |
EC | Enzyme Commission number |
EDTA | Ethylenediaminetetraacetic acid |
Em | Emerson wild type (1471) |
EM | Electron microscope |
EMBL | Sequence database at European Bioinformatics Institute (EBI) |
EMBO | European Molecular Biology Organization |
EMS | Ethylmethane sulfonate |
EST | Expressed sequence tag |
FGSC | Fungal Genetics Stock Center |
FPA | p-Fluorophenylalanine |
G6PD | Glucose-6-phosphate dehydrogenase |
GenBank | Sequence database at National Center for Biotechnology Information (NCBI) |
HMG | High-mobility group (domain) |
ITS | Internal transcribed spacer |
kb | Kilobase pairs (duplex DNA), kilobases (single-stranded) |
kD | Kilodalton |
MMS | Methylmethane sulfonate |
MT | 4-Methyltryptophan |
NADH | Nicotinamide adenine dinucleotide (reduced form) |
MNNG | N-Methyl-N'-nitro-N-nitrosoguanidine |
NG | N-Methyl-N'-nitro-N-nitrosoguanidine |
NO | Nucleolus organizer |
NOR | Nucleolus organizer region |
NPD | Nonparental ditype tetrad |
nt | Nucleotide(s) or nucleotide pair(s) |
OR | Oak Ridge wild type (1471) or origin |
ORF | Open reading frame |
PABA | p-Aminobenzoic acid |
PCR | Polymerase chain reaction |
PD | Parental ditype tetrad |
PIR | Protein sequence database at the Protein Information Resource (PIR) |
RAPD | Random amplified polymorphic DNA (2214) |
rDNA | DNA sequence(s) specifying ribosomal RNA |
RFLP | Restriction fragment length polymorphism (1339, 1340) |
RIP | Repeat-induced point mutation (1886) |
RL | Rockefeller-Lindegren wild type (1471) |
rRNA | Ribosomal RNA |
SAM | S-Adenosylmethionine |
SEM | Scanning electron microscope |
SHAM | Salicylhydroxamic acid |
SPB | Spindle pole body |
SSR | Simple sequence repeat |
Swissprot | Protein sequence database at European Bioinformatics Institute (EBI) |
T | Tetratype tetrad |
Tn | Transposon |
UNM | University of New Mexico |
URF | Unassigned reading frame |
YAC | Yeast artificial chromosome |
TABLE 2
Synonymous Gene Symbols