Tetrahymena thermophila is a microbial eukaryote and a model system of proven
utility. Each cell contains two genomes. The transcriptionally active
macronuclear genome consists of ~250 macronuclear chromosomes, generated
by site-specific, genetically determined fragmentation of the 5 germline
chromosomes of the transcriptionally inert micronucleus. These fragments
represent a natural, complete, high-copy-number, YAC-like library of the
expressed genome. Efforts are already underway to RAPD-map the macronuclear
pieces and to order them into a complete physical map, exploiting the unique
15-base-pair sequence that is necessary and sufficient to cleave the
micronuclear chromosomes into macronuclear pieces. We believe that the
macronuclear chromosomes offer unique advantages for sequencing the entire
genome by direct shotgun sequencing. We request funds for a pilot project
to assess the feasibility of this approach and to develop important
complementary technologies for functional analyses, as well as database
and community training resources.
Because ORFs are readily delineated (average genomic GC content is 25%) and
introns are relatively small and rare, we think that direct genome sequencing
could make it unnecessary to obtain a set of fully sequenced cDNAs. The pilot
sequencing project will test this belief, so we are not requesting cDNA
sequencing at present.
of resources needed:
1. To explore direct
shotgun sequencing of the entire genome by sequencing macronuclear chromosomes
(genome size = 200 Mb; total number of macronuclear chromosomes ~250;
average size ~700 kb; range 300 kb - 3.3 Mb). A pilot project will
sequence 10-15% of the genome. The initial work will focus on those
macronuclear chromosomes known to contain genes for proteins that are
likely to be tightly co-regulated with other proteins, and thus possibly
on the same macronuclear chromosome (e.g. genes for regulated secretion
of stored protein products, genes for ciliary dyneins, etc.). ORFs would
be identified and knocked out to determine the fraction of the expressed
genome and the fraction of essential genes. The genes chosen are a subset
of those found in Tetrahymena that are involved in processes of significance
for human biology and disease that cannot be studied in yeast. The work
will use a combination of available technologies for separating macronuclear
chromosomes, shotgun cloning, and sequencing, and can be started now.
Estimated duration: one to two years. Amount requested: $3 million total.
2. Improve genomic
technology in Tetrahymena. This would include:
of higher resolution maps of macronuclear and germline chromosomes.
of highly engineered strains:
maintenance of free plasmids
Estimated duration: 3 years. Amount requested: $2 million total.
3. Database resources
to make sequences and results readily available not only to the Tetrahymena
community but also to the entire scientific community. For the time being
it would seem most cost-effective to affiliate with an existing genome
database, adopt its format, and augment its resources to support
Duration: ongoing. Amount requested: $200,000 per year.
4. Yearly course to
train interested biologists on Tetrahymena-specific genetic and molecular
Duration: ongoing for a 5-year trial period. Amount requested: $30,000 per year.
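
The scale of the pilot and of the overall request can be checked with a little arithmetic. The sketch below uses only the figures quoted above (200 Mb genome, ~700 kb average macronuclear chromosome, 10-15% pilot, and the four amounts requested). The function names are illustrative, and the 5-year horizon applied to the ongoing database cost is an assumption made for the sake of a single total; the proposal itself sets a 5-year period only for the training course.

```python
# Figures quoted in the proposal (approximate values).
GENOME_MB = 200            # macronuclear genome size, Mb
AVG_CHROMOSOME_MB = 0.7    # average macronuclear chromosome, ~700 kb
PILOT_PERCENT = (10, 15)   # pilot covers 10-15% of the genome

SEQUENCING_PILOT = 3_000_000     # item 1, total
TECHNOLOGY = 2_000_000           # item 2, total
DATABASE_PER_YEAR = 200_000      # item 3, ongoing
COURSE_PER_YEAR = 30_000         # item 4, 5-year trial


def pilot_scale(genome_mb, avg_chrom_mb, percent):
    """Return (megabases to sequence, equivalent number of average-size chromosomes)."""
    megabases = genome_mb * percent / 100
    return megabases, megabases / avg_chrom_mb


def total_request(horizon_years=5):
    """Total dollars over a horizon.

    ASSUMPTION: the ongoing database cost is counted for the same
    number of years as the course trial period.
    """
    one_time = SEQUENCING_PILOT + TECHNOLOGY
    recurring = (DATABASE_PER_YEAR + COURSE_PER_YEAR) * horizon_years
    return one_time + recurring


if __name__ == "__main__":
    for pct in PILOT_PERCENT:
        mb, n = pilot_scale(GENOME_MB, AVG_CHROMOSOME_MB, pct)
        print(f"{pct}% pilot: {mb:.0f} Mb, about {n:.0f} average-size chromosomes")
    print(f"Total over 5 years: ${total_request(5):,}")
```

Under these assumptions the pilot corresponds to 20-30 Mb of sequence, or roughly 29 to 43 chromosomes of average size, and the combined request over five years comes to about $6.15 million.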