Drosophila Breakout Group



Drosophila remains a key model organism for the development and application of genomics and proteomics. Its unique advantages include the exquisite genetic approaches and resources that have accumulated over almost a century, which are continually being augmented and improved as genetic technology advances. Drosophila is host to a large and diverse community of researchers representing virtually every area of investigation of biological function. It can be studied at relatively low cost and has biological complexity comparable to that of a mammal. Many organ systems in mammals have well-conserved homologues in Drosophila, and Drosophila research has already led the way in providing new insights into cancer, neurodegenerative diseases, behavior, aging, complex multifactorial inheritance, and development. In addition, the past years of investment in Drosophila research and the anticipated near-term completion of the genomic sequence will catalyze still more outstanding research and insights into normal and disease mechanisms.

The Drosophila community is in broad agreement that the opportunities and challenges in Drosophila genomics and genetics can be met successfully only with the targeted development of shared genetic resources including the complete genomic sequence, a set of full-length unigene cDNA sequences, libraries of transposon mutants in all genes, adequate databases, stock centers, complete genomic expression analyses, polymorphism databases, and related goals. We believe that the successful development of these resources will benefit all biomedical researchers and catalyze a rapid wave of discovery with significant applications to human biology and disease.

In anticipation of this meeting, the Drosophila board met to discuss the needs of the community and drew up a document containing its thoughts on this topic. This "white paper" was then distributed to the community through FlyBase and the Drosophila electronic bulletin board, and comments were solicited. The resulting list of priorities was then discussed at the non-mammalian model organisms workshop and amended. The resource needs of the community have been divided into two major sets. Those in the first set are so firmly interdependent that it was difficult, if not impossible, to assign individual priorities to each item; as a group, however, they should have the highest priority. The second set consists of needs that were perceived as important but of somewhat lower standing. The following list represents the needs that would have the greatest impact on the community and would be the most valuable and cost-effective for the research enterprise.


Finish the Drosophila genome sequence by the end of 2000. The sequence should also be well annotated. Achieving this crucial goal is consistent with the Five Year Plan for the Genome Project and will require full funding of the already approved grant for sequencing the Drosophila genome ($44,000,000 for the three-year period from 12/1/98 through 11/30/01). Finishing sooner would greatly accelerate important biomedical research progress and avoid opportunity costs.

Completion of high-quality sequences of full-length cDNA clones corresponding to all genes in the genomic sequence (and their major alternative splice forms) and the assembly of a complete "unigene set" of all major expressed transcripts. The cDNAs should be made available in appropriate vectors in anticipation of their use in proteomic analyses. This "rosetta stone" will be crucial to fully comprehend the range of proteins encoded in the Drosophila genome. This goal can likely be accomplished for $8,000,000 within 2 years.

Genetic resources. This top priority comprises three interrelated goals; the expansion of the stock centers and of FlyBase is absolutely indispensable to the entire Drosophila genome project.

•• Generation of a collection of P-element or other transposable element insertions in all genes. We estimate that 50,000 different lines will initially be accumulated; these could be generated using P-elements or other transposable elements that can also be used to generate controlled misexpression. The final collection would number 20,000 lines and would be deposited in the stock center. This library of mutants in all genes will be an indispensable resource to all workers in the community and can be generated for $2,000,000. In addition to the mutant collection, we propose the generation of a set of well-characterized transgenic lines expressing GAL4 in a large variety of cell types, tissues, and developmental stages. These lines will be crucial to a great many labs for targeted misexpression, or limited restoration of expression, in experiments with appropriate transposable element mutants or other UAS-type constructs. It is imperative that these lines have well-documented expression patterns and that these patterns and the stocks be generally available to the research community. The cost of generating and characterizing these lines would be $150,000/yr for 5 years.

•• The collection of these new lines as well as the accumulation of novel transgenic stocks from other sources will require a significant expansion of the capacity of the stock centers. This goal will require expansion of the physical space and personnel to care for and send out the many genetic strains to the community. We envision that a national capacity in the range of 30,000 different stocks is a necessary minimum to accommodate the anticipated development of mutants in all genes. This goal will cost approximately $1,000,000 per year beyond current expenditures.

•• The database capacities available to the community must be significantly expanded. The current torrent of sequences requires annotation, linkage to the genetic maps and phenotypes, and links to databases of diversity. Currently FlyBase is serving its community very well. However, better accessibility to phenotypic information must be achieved. We also need to forge better links to the databases of the other communities and ensure that the data housed in FlyBase are available to researchers in other systems. This objective can be accomplished for $3,000,000 per year, rising to $3,500,000 per year over a 5-year period. It must be stressed, however, that continued support beyond that period is absolutely necessary. These funds would support the continuation of FlyBase but not its expansion into new areas, such as the incorporation of new data types and research into new computational methods. Such expansion would require an additional $500,000 per year.


In addition to the absolute top priorities listed above, the Drosophila community has reached general agreement on a second set of desiderata to exploit genomic and proteomic approaches to further our understanding of complex biological processes and the genetic basis of human disease. These are so nearly equal in importance that it would be arbitrary and inappropriate to assign ranks or priorities to them. The specific structural and functional resources comprising this second set of goals are as follows.


Gene product expression. This is a dual goal consisting of the:

•• Determination of expression patterns of all genes and coding sequences. These patterns would be determined at different developmental stages, in different tissues, and under different environmental conditions. This work will be facilitated by the availability of the full-length uniset of cDNAs cloned into standard expression vectors, setting the stage for Drosophila proteomics, as well as by the application of antibodies, epitope tags, or other appropriate reagents for intracellular protein localization. This work could be done for approximately $2,500,000 over a period of several years.

•• Development and application of high-resolution, high-sensitivity measures of protein expression and covalent modification in normal and mutant organisms. The availability of a complete Drosophila sequence, coupled with rapid improvements in mass spectrometry-based protein sequencing and separation technologies, will allow unparalleled insights into the cellular and biochemical changes that occur in various mutants. This emerging technology is poised to be applied to a model genome prior to attempting such efforts on vertebrates. Depending upon rates of technology development, this goal could be achieved for $1,000,000 to $2,000,000.

Determination of the sequence of Drosophila virilis. The sequence of this related species will be crucial for interpreting the Drosophila melanogaster sequence, for helping to infer function, and for identifying those characteristics of the Drosophila genome that are conserved and therefore likely to be important for function. Determining this sequence once the D. melanogaster sequence is nearly complete can be done relatively economically, since the D. melanogaster sequence can help identify high-priority regions of the D. virilis genome for sequence determination and interpretation. This goal can be achieved by starting with an investment of $2,000,000 per year to sequence regions of greatest interest in BAC clones, rising to $4,000,000 per year as the effort picks up speed. Eventually all of the D. virilis genome should be sequenced when it is cost-effective to do so, but the Drosophila community is convinced that there is tremendous insight to be gained by assigning high priority to the immediate sequencing of a few carefully chosen, highly targeted regions.

Creation of a standard set of cell culture models derived from various Drosophila tissues and developmental stages. High-quality cell culture models in which to conduct biochemical and cell biological analysis of mutants will be a crucial resource for elucidating the function of the various genes. Developing cell lines suitable for homologous recombination (for example, analogs of embryonic stem cells for targeted mutagenesis) and reintroduction into the organism is also an important component of this goal. These goals will require both the establishment of new permanent cell lines and the development of methods for readily preparing primary cultures from Drosophila organs and tissues. These goals could be achieved with an initial investment of $500,000 to $1,000,000 per year.


Drosophila research would be greatly enhanced by the successful achievement of several other goals, two of which are generally applicable to all genome projects and two of which are specific to Drosophila. The two general goals are:

The development of sequencing technology costing an order of magnitude less than that currently practiced. This would make many highly meritorious genomic sequencing projects feasible, such as the complete sequencing of D. virilis and D. simulans and perhaps other species. It would also enable a quantum jump in the application of genomic approaches to evolutionary and population genetics.

The creation of appropriate source(s) of molecular materials for acquisition by researchers at a reasonable cost without restriction on their use.

The two Drosophila-specific goals are:

The development of a system of gene replacement by homologous recombination, which would rival the importance of P-element germline transformation in the genetic manipulation of Drosophila.

The development of an efficient, cost-effective, high-throughput system for cryopreservation of any stage(s) of development or cell type(s) from which the organisms can be resuscitated, for the purpose of long-term storage of genetic resources and relief of the constantly increasing demands on the stock centers.
