EXPLANATORY FOREWORD

In this section, information is gathered on each established nuclear gene and on chromosome features such as centromeres, tips, and the nucleolus organizer. Entries are arranged alphabetically by symbol. Some categories of mutants are prefaced by a generic entry that concerns the entire group (e.g., al, arg, het, mat, mus, rec). Synonyms are cross-referenced to the preferred symbol. Abbreviations are defined in Table 1. Synonymous gene symbols are listed in Table 2.
    Within each entry for a gene locus, the name follows the symbol. The linkage group is then indicated and the arm is specified, if known. This is followed by linkage data relating the gene to other loci. If the gene has been cloned and sequenced, accession numbers are given for the sequence. The pSV50 cosmid library number or other reference number of clones containing the sequence may also be given. The phenotype is described in a subsequent paragraph, including such attributes as enzyme deficiency (including EC number), dominance, interaction with other genes, fertility, and stability. The described phenotype is that of mutant strains unless stated otherwise. Scoring information, technical applications, and alternate names may be given.
     At the time the previous compendium was prepared, a gene could not be recognized or mapped until a phenotypically recognizable mutant or variant had been discovered. In contrast, genes can now be recognized and defined solely on the basis of DNA sequence, without knowledge of the phenotype. Some of the genes discovered in this way have no obvious null phenotype. For many others, the null phenotype is unknown. The gene name may then be based on the inferred gene product or the known function of orthologues or homologues in other organisms. Making the name explicit in such a situation would ideally require addition of the suffix "-like" (for example, ras-like), but for brevity the suffix is usually omitted, just as the suffix "-requirement" is omitted, although it is implied, when auxotrophic mutants are named. Similarly, qualifying adjectives have often been omitted. When the function of a gene is inferred from sequence homology, "structural gene" usually implies "putative structural gene." Likewise, "encodes" is used in place of "inferred to encode." A gene symbol without a superscript is used either to designate a locus or to symbolize a mutant allele at that locus. Which usage applies in a particular instance should be obvious from the context.
     In the paragraph on linkage, recombination values may be given that originated from more than one cross. Differences between these values may be due to sampling error (sample size is usually less than 100). The differences may also reflect differences in trans-acting rec genes, which are known to regulate recombination in specific local regions. Values given as percentages or as numerical fractions indicate the frequency of reciprocal recombination per meiotic product, unless it is indicated explicitly that the value is for reciprocal crossing-over events per tetrad. (The number of crossing-over events per tetrad is double that of crossovers per random meiotic product for the same interval.)
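     The two points above, the factor of 2 between per-tetrad and per-product values and the size of sampling error in small samples, can be illustrated with a short sketch. The Wilson score interval is used here only as one common way to gauge sampling error in a proportion; it is not a method taken from this compendium, and the function names and the example counts are hypothetical.

```python
import math


def per_tetrad_from_per_product(per_product):
    """Convert a recombination frequency per meiotic product into
    crossing-over events per tetrad for the same interval (double)."""
    return 2.0 * per_product


def wilson_interval(recombinants, total, z=1.96):
    """Approximate 95% Wilson score interval for a recombination fraction.

    recombinants -- number of recombinant progeny scored (hypothetical input)
    total        -- number of progeny scored
    """
    if total <= 0 or not 0 <= recombinants <= total:
        raise ValueError("need 0 <= recombinants <= total and total > 0")
    p = recombinants / total
    z2 = z * z
    denom = 1.0 + z2 / total
    center = (p + z2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z2 / (4 * total * total)) / denom
    return center - half, center + half


# Hypothetical cross: 10 recombinants among 100 progeny.
low, high = wilson_interval(10, 100)
print(f"per product: 0.10, interval {low:.3f} to {high:.3f}")
print(f"per tetrad:  {per_tetrad_from_per_product(0.10):.2f}")
```

     With samples of this size the interval is wide, so two crosses can give noticeably different values for the same interval through sampling error alone.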
     When gene order is based on duplication coverage or on three-point crosses, the sequence of loci is often known with confidence even though interval lengths are unknown or imprecisely known. Where linkage information is given in the format, "Locus c is between a, b and d, e," the word "and" separates markers to the left of c from markers to the right.
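     As a minimal sketch of how a statement in this format can be read, the following splits the markers on the word "and". The function name and the returned dictionary layout are assumptions made for illustration; the code handles only the literal pattern quoted above.

```python
def parse_between(statement):
    """Split 'Locus c is between a, b and d, e' into the locus, the
    markers to its left, and the markers to its right."""
    text = statement.strip().rstrip(".")
    prefix, sep, markers = text.partition(" is between ")
    if not sep or not prefix.startswith("Locus "):
        raise ValueError("statement is not in the expected format")
    locus = prefix[len("Locus "):].strip()
    # The word "and" separates left-hand markers from right-hand markers.
    left_text, sep, right_text = markers.partition(" and ")
    if not sep:
        raise ValueError("no 'and' separating left and right markers")
    left = [m.strip() for m in left_text.split(",") if m.strip()]
    right = [m.strip() for m in right_text.split(",") if m.strip()]
    return {"locus": locus, "left": left, "right": right}


print(parse_between("Locus c is between a, b and d, e"))
```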
     RFLP data from two widely used crosses are tabulated in ref. (1452) and are reproduced here as Appendix 3. These come either from 18 tetrads (the first cross) or from 18 random isolates (the second cross). Data from these crosses have been extremely valuable for assigning cloned sequences to a linkage group and for indicating their approximate location. The two RFLP-marked crosses were not intended to provide information for precise mapping. Because of the small numbers, estimates of distance are imprecise and gene order is often uncertain.
     Results from the first RFLP-scored cross are presented here as numbers of asci of the three types, PD (parental ditype), NPD (nonparental ditype), and T (tetratype), using the format: T, NPD/total. (The total number is PD + NPD + T.) Because each crossover event in an ascus recombines only two of the four chromatids, the frequency of crossover meiotic products (on which map distance is based) is half the tetratype frequency when there are no NPDs. NPDs are produced when four-strand double exchanges occur in an interval between markers. If NPDs are present, the best estimate of crossover frequency in the interval is obtained by multiplying the number of NPDs by 6, adding this to the number of T's, dividing by 2, and expressing the result as a fraction of the total number of asci. (The multiplication compensates for cryptic multiple crossovers that result in T or PD asci rather than NPD.) A significant excess of PD asci over NPD asci constitutes proof of linkage [see refs. (645) and (1547)].
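     The arithmetic described above can be sketched as follows. The function name is an assumption, and the ascus counts in the example are hypothetical rather than taken from Appendix 3. The estimate is returned as a fraction of the total number of asci scored.

```python
from fractions import Fraction


def crossover_frequency(t, npd, total):
    """Estimate the frequency of crossover meiotic products in an interval.

    t     -- number of tetratype (T) asci
    npd   -- number of nonparental ditype (NPD) asci
    total -- PD + NPD + T
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if t < 0 or npd < 0 or t + npd > total:
        raise ValueError("T and NPD must be non-negative and sum to at most total")
    # (T + 6 * NPD) / 2, taken as a fraction of all asci scored.
    # With npd == 0 this reduces to half the tetratype frequency.
    return Fraction(t + 6 * npd, 2 * total)


# Hypothetical records in the "T, NPD/total" format: "4, 0/18" and "4, 1/18".
print(crossover_frequency(4, 0, 18))
print(crossover_frequency(4, 1, 18))
```

     Exact fractions are used so that small counts such as these are not obscured by rounding; multiply by 100 to express the value as a percentage.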
     Results from the second RFLP cross, which involves random ascospore isolates, are given as a numerical fraction, crossover products/total.
     In deriving recombination estimates from the RFLP data of both crosses, we have omitted from the calculation any NPD tetrads (cross 1) or double-crossover progeny (cross 2) in intervals that contain few or no single crossovers. In making the omission, we assume that the apparent double crossovers are spurious, having resulted from experimental or recording errors or from gene conversion, rather than from bona fide reciprocal crossing over. Such errors or conversions appear to be abundant in the existing RFLP data.
     Rearrangement breakpoints define loci that have often been crucial for determining gene order. Rearrangements have not been given separate entries in the main section, but breakpoints that are well defined are included in lists of mapped loci and are shown on the genetic maps (Appendix 2). Most of the breakpoints listed are from insertional or quasiterminal rearrangements, which have been widely used to establish gene order and to map chromosome tips and centromeres. Rearrangements with breakpoints that are located less precisely are not included in the lists in Appendix 2 but are described in Appendix 4, which provides full information on all characterized rearrangements, whether mapped precisely or mapped only to linkage group.
     Rearrangement symbols have been abbreviated in the locus entries, lists, and maps. For example, In(IR;IL)OY323 is shortened to In(OY323). Multiple breakpoints in the same chromosome are distinguished using superscripts, for example, In(OY323)^L and In(OY323)^R. Rearrangement symbols are written in full in refs. (700) and (1578).
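     The abbreviation rule can be sketched with a regular expression that drops the parenthesized linkage-group description and keeps the type prefix and the identifying number. The function name and the pattern are assumptions based only on the single example given above; full symbols with other layouts may need a different pattern.

```python
import re

# Type prefix, parenthesized arm description, then the isolation number.
_FULL_SYMBOL = re.compile(r"^([A-Za-z]+)\(([^()]*)\)(\S+)$")


def abbreviate_rearrangement(symbol):
    """Shorten a full rearrangement symbol, e.g. In(IR;IL)OY323 -> In(OY323)."""
    match = _FULL_SYMBOL.match(symbol.strip())
    if match is None:
        raise ValueError(f"unrecognized rearrangement symbol: {symbol!r}")
    kind, _arms, number = match.groups()
    return f"{kind}({number})"


print(abbreviate_rearrangement("In(IR;IL)OY323"))
```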
     With a few exceptions, entries are not provided for sequences of unknown function ("anonymous" loci), even if they have been RFLP-mapped, nor are these loci listed with the maps in Appendix 2. RFLP linkage data are given in Appendix 3. Gene loci recognized on the basis of EST sequences are given in Appendix 5; these are named and included in the entries only if strong inference exists for homology with genes of known function in other organisms.
     The symbol Tel designates a locus determined by RFLP mapping of a telomeric DNA sequence (1810). Tip is used to designate a linkage-group terminus inferred to lie beyond the most distal genetic marker. A tip may be marked by the breakpoint of a quasiterminal chromosome rearrangement.
     In addition to gene loci, rearrangement breakpoints, and chromosome landmarks, entries are included that describe transposable elements, both active (Tad) and degenerate (e.g., Pogo, Punt, Guest). Entries also are included for genes from other organisms that have been integrated into the Neurospora chromosomes. Examples are amdS, ble, hph, and tk.
     Allele numbers have been given in the entries for mutant genes only when there is some question about allelism, when two alleles differ significantly in phenotype, or when a mutant has been referred to previously only by allele number. Allele numbers of mutant strains can be found in FGSC listings or in the cited references. Reference (103) provides a table relating allele numbers to the locus symbols that were first assigned in 1954. When a particular mutant gene has not been definitely assigned to a locus, it may be symbolized temporarily by an appropriate base symbol followed by the allele number in parentheses, as met(T70), mo(D301), or un(74E).
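     A temporary symbol of this kind can be taken apart or assembled mechanically, as in the sketch below. The function names and the regular expression are assumptions for illustration; the pattern covers only the simple base(allele) form shown in the examples above.

```python
import re

# Base symbol, then the allele number in parentheses, e.g. met(T70).
_TEMPORARY = re.compile(r"^([A-Za-z][A-Za-z0-9-]*)\(([^()]+)\)$")


def split_temporary_symbol(symbol):
    """Return (base symbol, allele number) for a symbol such as un(74E)."""
    match = _TEMPORARY.match(symbol.strip())
    if match is None:
        raise ValueError(f"not a base(allele) symbol: {symbol!r}")
    return match.group(1), match.group(2)


def make_temporary_symbol(base, allele):
    """Build a temporary symbol from a base symbol and an allele number."""
    return f"{base}({allele})"


for example in ("met(T70)", "mo(D301)", "un(74E)"):
    print(split_temporary_symbol(example))
```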
     For map location and linkage analysis, we have cited mainly the most definitive data sources that establish the location of a gene relative to its immediate neighbors. Earlier references that established linkage originally or that are less precise have usually been omitted. Although they represent tremendous effort, the linkage data in these publications are now mainly of historical interest. The older data may be found in the 1982 compendium (1596) or in refs. (103), (321), (322), (426), (907), (1255), (1369), (1383), (1548), (1585), (1587), (1592), (1593), (1603), (1604), (1928), (1986), and (2014). Original sources of RFLP data are not identified by Nelson et al. (1447), nor do we do so. The source usually is one of the publications referenced elsewhere in the same entry.
     Not all of the important references can be cited in the entries, especially for loci that have been studied extensively. References for the cloning and sequencing of genes are usually not cited because they can be obtained on-line from the sequence databases. The publications that are cited should lead the reader to other significant literature. Theses and abstracts have been cited only if they are known to contain pertinent information that has not been documented adequately in a published reference. Unpublished sources are cited by number and are identified as such in the References section, where they are listed alphabetically by name of the contributor and are interspersed with the cited publications.

TABLE 1
Abbreviations^a
3-AT 3-Amino-1,2,4-triazole (aminotriazole)
bp Base pair(s)
BT Beadle and Tatum wild type or origin (1471)
cAMP Cyclic adenosine monophosphate (cyclic AMP)
cDNA Complementary DNA
DEAE Diethylaminoethyl (-cellulose)
EC Enzyme classification number
EDTA Ethylenediaminetetraacetic acid
Em Emerson wild type (1471)
EM Electron microscope
EMBL Sequence database at European Bioinformatics Institute (EBI)
EMBO European Molecular Biology Organization
EMS Ethylmethane sulfonate
EST Expressed sequence tag
FGSC Fungal Genetics Stock Center
FPA p-Fluorophenylalanine
G6PD Glucose-6-phosphate dehydrogenase
GenBank Sequence database at National Center for Biotechnology Information (NCBI)
HMG High-mobility group (domain)
ITS Internal transcribed spacer
kb Kilobase pairs (duplex DNA), kilobases (single-stranded)
kD Kilodalton
MMS Methylmethane sulfonate
MT 4-Methyltryptophan
NADH Nicotinamide adenine dinucleotide, reduced form
MNNG N-Methyl-N'-nitro-N-nitrosoguanidine
NG N-Methyl-N'-nitro-N-nitrosoguanidine
NO Nucleolus organizer
NOR Nucleolus organizer region
NPD Nonparental ditype tetrad
nt Nucleotide(s) or nucleotide pair(s)
OR Oak Ridge wild type (1471) or origin
ORF Open reading frame
PABA p-Aminobenzoic acid
PCR Polymerase chain reaction
PD Parental ditype tetrad
PIR Protein sequence database at the Protein Information Resource (PIR)
RAPD Random amplified polymorphic DNA (2214)
rDNA DNA sequence(s) specifying ribosomal RNA
RFLP Restriction fragment length polymorphism (1339, 1340)
RIP Repeat-induced point mutation (1886)
RL Rockefeller-Lindegren wild type (1471)
rRNA Ribosomal RNA
SAM S-Adenosylmethionine
SEM Scanning electron microscope
SHAM Salicylhydroxamic acid
SPB Spindle pole body
SSR Simple sequence repeat
Swissprot Protein sequence database at European Bioinformatics Institute (EBI)
T Tetratype tetrad
Tn Transposon
UNM University of New Mexico
URF Unassigned reading frame
YAC Yeast artificial chromosome
^a Additional abbreviations are given in the legends of Figs. 5, 11, 18, and 40 and in a footnote to Table 3.

TABLE 2
Synonymous Gene Symbols^a
Replaced symbol Symbol now in use
ac ace
ace-6 suc
acon-1 fl
acp aac
acpi acu-16
act cyh
adg arg-11
age-1-3 so
age-3 al-1
alx-1 b azs; has
amr nit-2
ami sor-4
amy(SF26) exo-1
ANT-1 b azs; has
arg-7 arg-4
arg-8 pro-3
arg-9 pro-4
arg(CD-15) cpc-1
arg(CD-55) cpc-1
arg(RU1) am
arg(RU3) arg-13
argR pmb (?)
arom aro
arp1 ro-4
asc(DL95) mei-1 (?)
asc(DL243) mei-1 (?)
asco lys-5
asp asn
aspt asp
atp-9 oli
aur al-1
bas pmb (?)
bat pmb
Ben Bml
bis pk
bli-7 eas
bm-1 pmb
c het-c
c col-3
c cot-4
c cy
can cnr
ccg-2 eas
ccg-7 gpd-1
ccg-12 cmt
cla-1 frq7
col-3 bn
col-7 rg-1
col-13 vel (?)
col-14 sc
col-15 vel (?)
col-le le-1
col(C-L2b) mel-1
cox-5 cya-4
crp-1 cyh-2
cys(oxD1) cys-15
cyt-3 cyt-4
cyt-U-10 cyt-4
cyt-12 cyc-1
cyt-U-14 cyt-4
cyt-U-19 cyt-19
cyt-20c un-3(55701)
cyt-U-28 cyt-25
cyt(289-56)c un-3(55701)
cyt(297-24) cyt-21
d het-d
dgr-3 sor-4
dgr-4 sor-3
e het-e
en-am en(am)-2
er rg-1
f su([mi-1])-1
fas cel
fdu-1 ud-1
flm-1 os-1
flm-2 os-4
frq-5 prd-1
G gul
gla sor-4
glm gln-1
glp-3 ff-1
gluc-2 gluc-1
gly glp
gpi-2 gpi-1
grg-1 ccg-1
ham so
hist his
hs hom
hsp83 hsp80
hspe hsp80
hsps-1, hsps hsp70
hsps-2 grp78
i en(am)-1
i het-i
inos inl
iv ilv
kex scp
lni-1 cpc-1
lni-2 lacc
lox lao
lysR su(mtr)-1 (?)
m pe
mac met-6
matd rug
mbic Bml
mc-1 nuo19.3
me met
mel-2 bal
mep-3 mep-1
mep(3) mep-1
met-4 cys-10
meth met
mfA-1 ccg-4
mig tre
mik-2 nrc-1
mms mus
mo-3 sk
mo(M111) at
mo(KH160) shg
mo(NM213t) bld
mo(P1163) dr
mo(P2402t) un-20
mo(R2441) cwl-1
mod-5 cpc-1 (?)
moe-1 sk (?)
mom tom
morph mo
mpp-2 pep
MS5 nmr
mt mat
mt mtr
mts cpc-1
mus(SA60) mei-2
nac cr-1
NC-ras ras-1
neu mtr
nik-1 os-1
nit-5 nit-4
nuh-4 uvs-3
nuh-4 mus-9
nuo9.6 acp
orn-1 arg-5
orn-2 arg-6
orn-3 arg-4
ovc cut (?)
pab-3 pab-2
pcon nuc-2
pdc-1 cfp
pdx-2 pdx-1
Ph-mod-D sor-4
phe-3 phe-2
phen phe
pmn mtr
Pm-N mtr
pph-2 cna-1
ppz-1 pzl-1
prl-1 oli
prol pro
prt pts
ps15-1 ndk-1
psp ndk-1
put-1 spe-1
pyr-5e pyr-1 pyr-3
Q, q nic-1
ras-2 smco-7
rco-3 dgr-3 (?)
rec-4 rec-2
rec-5 rec-2
rec-w rec-2
rec-x rec-3
rec-z rec-1
ro-5 ro-4
ro-8 ro-4
ro-9 da (?)
ros al-3
Rsp R
s arg-12S
t scot (?)
sdv-10 asd-1
sdv-15 asd-3
smco-2 sc
snz pdx-1
sor-A sor-1
sor-B sor-2
sorr-14 sor-5
sorr-15 sor-4 (?)
sorr-17 sor-3
sorr-19 sor-5
sor(DS) sor-4
sor(T9) sor-4
spco-1 col-4
spco-2 wa
spco-3 spco-7
spco-13 spco-7 (?)
spco-1 3 moe-2 (?)
su-2, etc. su(trp-3td2)-2, etc.
sum su(pe)
su-B su(bal)
su-C su(col-2)
su(pro-3) arg-6
sup(...) su(...)
sup-1, -3, etc. su([mi-1])-1, -3, etc.
sw per-1
t scot
t(289-4) cyt-22
td trp-3
thi-lo thi-4
thr-1 ile-1
tru uc-5
try trp
tryp trp
tub-2 Bml
tyr-3 tyr-1
tyr-s tys
tyr(NM160) phe-1
un(STL6) fls
un(b39) un-5
un(44409) un-1
un(46006) un-2
un(55701) un-3
un(66204) un-4
un(83106) un-6
uve-1 phr
vac-5 htl
van pho-4
ylo-3 fl
ylo-4 al-1
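
Table 2 lends itself to a simple lookup from a replaced symbol to the symbol or symbols now in use. The sketch below is a minimal illustration that uses a handful of rows copied from the table; the function name and the data layout are assumptions. Symbols are treated as case-sensitive, a replaced symbol may map to more than one current symbol (as with c and mt), and a trailing "(?)" is recorded as an uncertain assignment.

```python
from collections import defaultdict

# A few rows copied from Table 2: replaced symbol, then symbol now in use.
SAMPLE_ROWS = """\
ac ace
argR pmb (?)
c het-c
c col-3
c cot-4
c cy
mt mat
mt mtr
try trp
"""


def build_synonym_table(text):
    """Map each replaced symbol to a list of (current symbol, uncertain)."""
    table = defaultdict(list)
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or incomplete rows
        uncertain = parts[-1] == "(?)"
        if uncertain:
            parts = parts[:-1]
        replaced, current = parts[0], " ".join(parts[1:])
        table[replaced].append((current, uncertain))
    return dict(table)


synonyms = build_synonym_table(SAMPLE_ROWS)
print(synonyms["c"])
print(synonyms["argR"])
```

Keeping a list for every key, rather than a single value, preserves the ambiguity of old symbols that were applied to more than one locus.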
   

