2nd Workshop on Genomic and Genetic Tools for the Zebrafish
April 1-2, 2002
On April 1-2, 2002, in Rockville, MD, the Trans-National Institutes
of Health (NIH) Zebrafish Coordinating Committee, working with
the External Zebrafish Advisory Panel, sponsored a workshop to
evaluate priorities for zebrafish research. The specific purpose of
the meeting was to discuss the various resources needed by the
community to maximize the benefit of the genome sequence. The
group weighed the relative importance of each resource, as well
as the feasibility of establishing it. There were three sessions,
Resources, Mutagenesis and Full-length cDNA, which were chaired by
David Grunwald, Alex Schier and Gerry Rubin, respectively. The session
chairs summarized the presentations and discussions to form this report.
Report from the 2nd Workshop on Genomic and Genetic Tools for the Zebrafish
External Zebrafish Advisory Committee:
Aravinda Chakravarti, Geoffrey Duyk, David Grunwald, Nancy Hopkins, Alexander Schier, Eric Weinberg
Resources - David Grunwald, chair
The agenda of the "Resources" session of the meeting was devoted to discussion of three types of resources that might be used to augment current analyses of gene function in zebrafish: microarrays, antibodies, and wildtype strains. In addition to these topics, community discussion emphasized the critical importance of expanding or developing other resources.
1. Microarrays - Microarray analysis is generally used to compare patterns of gene expression in two tissue samples. John Ngai (UC Berkeley) and Ken Cho (UC Irvine) summarized new results from their laboratories showing that gene arrays could be used very effectively as an embryological tool, for example for the discovery of new genes associated with the organizer or other specific tissues. Dr. Ngai's work relied on identification of genes expressed in wildtype but not mutant embryos at somitogenesis stages, whereas Dr. Cho's work utilized dissection methods to identify genes expressed exclusively in the organizer tissue.
Application of microarray analysis to zebrafish can have a fundamental impact on our ability to associate biological functions with the ESTs and genes identified in the genome projects. As Drs. Cho and Ngai showed, the ability to generate very large numbers of synchronously developing zebrafish embryos is by itself enough to allow comparisons that will be highly informative about processes in early development. As Dr. Ngai showed, the availability of mutants with deficiencies in particular tissue types, known signaling pathways, or regulatory transcription factors allows one to use comparative analysis of gene expression patterns to quickly associate genes with potential functions. In situ hybridization and morpholino antisense methods can be used to rapidly test hypotheses generated from the microarray experiments.
Implementation of microarray analysis in the zebrafish is hampered by the lack of resources available to investigators. Shawn Burgess (NIH) described the emergence of a commercial tool for performing microarray analyses. Two companies are currently producing libraries of 70-mer oligonucleotides that represent zebrafish gene sequences and that could be arrayed by individual investigators. Because the technology is new and the community of zebrafish researchers interested in this resource is small, the cost of generating or acquiring libraries of target gene sequences for microarray analysis is prohibitive. The first-generation library of 16,700 sequences is estimated to cost approximately 60-70 thousand dollars.
A recommendation was made that NIH provide a funding mechanism, perhaps in the form of supplemental grant allocations, that would support collaborative ventures undertaken by several investigators whose common interest is focused on the exploitation of microarrays.
2. Wildtype Strains - David Grunwald (U of Utah) presented two arguments for directed investment by the community in the maintenance of defined wild type strains of diverse genetic backgrounds. First, polymorphisms present in wild type strains are necessary for mapping mutations. In the absence of a directed effort to maintain strain polymorphisms, each individual laboratory must maintain polymorphic strains. In the absence of strict quality control methods, a degree of inbreeding will likely take place and inevitably polymorphisms will be lost. Second, as demonstrated by work in human populations and in the mouse, polymorphisms within wildtype populations can be effectively used to study multifactor traits. To date, the focus of zebrafish research has been on single gene functions, particularly the functions of individual genes that are essential for viability. An emerging focus of genomic research is to understand the constellation of gene activities that contribute to phenotype. The current status of naturally occurring polymorphisms available within laboratory zebrafish strains and accessible to researchers presents severe limitations on future research in this area.
Monte Westerfield (U of Oregon; Director, Zebrafish Stock Center) presented a breeding strategy designed to maintain high levels of polymorphism within strains. Dr. Westerfield noted that the strategy was expensive, required quality control, and was not funded as part of the responsibilities of the Stock Center. Instead, wildtype strains are maintained through informal agreements with individual investigators who then distribute animals to the Stock Center.
A recommendation was made that scientific evaluation be made of the best breeding strategy for maintaining strain polymorphisms. In addition, methods should be implemented to assess the extent of inbreeding within stocks. The community was vocal in its request that the Stock Center be responsible for preserving lines with high degrees of polymorphism. For this to be accomplished, new funding directed specifically at supporting this goal is required.
3. Antibodies - In contrast to the determined effort to recover each gene expressed during zebrafish development and annotate it in terms of its sequence and temporal and spatial pattern of expression, there is no centralized effort to generate and characterize antibodies. In discussion of this issue, members of the community felt that development of an antibody resource should be initiated by individual investigators, perhaps working in concert with NCRR.
4. Bioinformatics Support - A big concern of the community is that investment in generating gene and genome sequence information is not balanced with investment into tools that support analysis of these sequences. At least three different kinds of analysis were identified as needing to be expanded:
- Analysis of the primary genome sequence being generated by the Sanger Center.
- Comparative analysis of the zebrafish genome with the other genomes, notably the human, mouse, and pufferfish. Efforts need to be made to use information from other species to help: i) identify gene coding sequences; ii) identify regulatory sequences; and iii) identify extent of synteny across species.
- Annotation of expression. One project underway involves posting on ZFIN in depth characterizations of the developmental expression of individual genes. The community supported the idea of expanding efforts to generate a database linking genes to functions and expression patterns.
Systematic gene disruption projects - Alexander Schier, chair
1. Introduction - Alexander Schier first presented an overview of current approaches to systematically study gene function in vivo, comparing approaches used in zebrafish to the ones employed in other model organisms, particularly in Drosophila. Four main strategies have been used to analyze gene function in vivo: forward genetics (going from a mutant phenotype to the affected gene), reverse genetics (going from a molecularly defined gene to isolate mutations in the gene), misexpression and overexpression approaches, and the use of interfering agents that block the activity of particular gene products. Forward genetics in zebrafish has involved large-scale screens using chemical mutagens, especially ENU, and insertional mutagens, such as those based on retroviruses.
These screens have generated large collections of mutants that affect embryogenesis and larval development. Estimates vary, but at least 2,500 loci seem to be essential for early development, 1/3 of which are involved in specific processes. Smaller-scale ENU screens are currently underway and are in part funded by NIH. Few modifier and no clonal screens have been performed.
Reverse genetic approaches are still in their infancy, but preliminary studies indicate that such strategies are feasible in zebrafish using ENU and retroviruses. However, in stark contrast to the mouse, no homologous recombination system using ES cells, primordial germ cells or somatic cells (for cloning) has been developed. Misexpression screens have not yet been performed in zebrafish, but should become more feasible upon generation of a comprehensive cDNA set. In stark contrast to Drosophila, no EP system has been developed, whereby genes can be activated in a given tissue by combining an enhancer-GAL4 transgene with UAS inserts randomly distributed in the genome, allowing activation of a nearby gene.
Finally, interfering strategies have dramatically enhanced the field in the last 1.5 years. In particular, the use of morpholino anti-sense oligonucleotides has led to the rapid, albeit not completely reliable, testing of gene function by blocking translation (or splicing) of specific genes. The use of small molecule inhibitors also promises to change the field. In strong contrast to Drosophila, C. elegans and mammalian cells, no successful application of (s)iRNA has been reported.
2. Point mutagenesis - Mary Mullins discussed the status and future of ENU screens. She highlighted the high rate of mutagenesis achieved with ENU (1 in 1000 mutagenized genomes carries a loss-of-function mutation in a given gene) and estimated that the large-scale Boston and Tuebingen screens (6000 genomes screened) identified ~80% of loci that lead to an obvious embryonic phenotype. She emphasized the need for further, more sophisticated screens to study specific processes. According to her estimates, a screen of 2000-3000 genomes might isolate mutations in 50% of the loci that might lead to a phenotype affecting a particular structure or process, requiring ~$1 million. The lesson from Drosophila is that screens will be performed as long as scientists study novel processes amenable to genetic analysis. A major drawback of current ENU-induced mutations is the difficulty of isolating the affected gene. Dr. Mullins emphasized the need for a well-annotated genome sequence and a unigene set to make this step more efficient.
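The relationship between screen size and coverage can be illustrated with an idealized calculation. The sketch below assumes, as a simplification that is not part of the presentation, that every locus is hit independently at the quoted rate of 1 in 1000 genomes, so the number of alleles per locus follows a Poisson distribution. The function name is hypothetical. This ideal model gives higher coverage than the empirical estimates quoted above, possibly because real loci differ in mutability and some phenotypes are missed during screening.

```python
import math

def fraction_of_loci_hit(genomes_screened, rate_per_genome=1 / 1000):
    """Expected fraction of loci with at least one mutant allele.

    Assumption: a uniform, independent per-locus mutation rate
    (Poisson model); real screens will recover fewer loci than this.
    """
    expected_alleles_per_locus = genomes_screened * rate_per_genome
    return 1.0 - math.exp(-expected_alleles_per_locus)

# Screen sizes mentioned in the report.
for genomes in (2000, 3000, 6000):
    print(genomes, round(fraction_of_loci_hit(genomes), 3))
```

The gap between this upper bound and the reported estimates is itself informative: it gives a rough sense of how far a practical screen falls short of ideal recovery.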
3. Resequencing - Len Zon presented studies in Ron Plasterk's lab that aim to develop reverse genetics using ENU to isolate mutations in a gene of interest. Using rag1 as a target gene, Dr. Plasterk's group resequenced 2,600 F1 fish derived from mutagenized fathers and found several point mutations in rag1. One of the mutations introduced a stop codon and could be recovered in the F2 generation. Rag1 mutant fish were defective in V(D)J recombination. These studies demonstrate that mutations in most if not all zebrafish genes could be isolated using this methodology, but costs are very high at $10,000-20,000/gene. Dr. Zon also presented similar studies using p53 as a target. The overall frequency of mutations appears to be ~1/500kb.
4. Insertional mutagenesis - Nancy Hopkins provided an update on the large-scale screen using retroviruses as an insertional mutagen. Current virus titers allow 30-40 integrations per founder fish. Using F1 fish with 5-10 inserts to generate F2 families, Dr. Hopkins' group has screened 4300 F2 families (53,000 inserts) and recovered 630 mutants in ~500 loci. Dr. Hopkins estimated that this might correspond to 1/5 of loci leading to a lethal phenotype during early development. To date, her group has isolated ~220 different genes disrupted by insertions after cloning the genes disrupted in about 275 mutants. Dr. Hopkins also estimated that it might be feasible to isolate disruptive insertions in 5000 genes by screening 12,000 F1 fish (60,000 insertions) via junction sequence isolation. She mentioned that this project would be highly feasible with an annotated genome. The isolation of a retroviral insertion in a specific gene seems more daunting, since one million inserts would have to be screened, but clever pooling strategies might simplify the task.
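The ratios implied by these figures can be worked out directly. The sketch below uses only the numbers quoted above; the function and variable names are hypothetical, and the results are back-of-the-envelope ratios rather than estimates made by Dr. Hopkins.

```python
def inserts_per_event(total_inserts, events):
    """Average number of insertions screened per recovered event."""
    return total_inserts / events

# Phenotypic screen: 53,000 inserts yielded 630 mutants.
per_mutant = inserts_per_event(53_000, 630)

# Proposed junction-sequencing project: 60,000 insertions
# expected to yield disruptive insertions in 5000 genes.
per_disrupted_gene = inserts_per_event(60_000, 5_000)

# Average insertions carried by each of the 12,000 F1 fish.
inserts_per_f1_fish = 60_000 / 12_000

print(f"about 1 mutant per {per_mutant:.0f} inserts in the phenotypic screen")
print(f"about 1 disrupted gene per {per_disrupted_gene:.0f} inserts by junction sequencing")
print(f"{inserts_per_f1_fish:.0f} inserts per F1 fish")
```

The two ratios are not directly comparable: the first counts only insertions that produce a visible phenotype, while the second counts insertions identified by sequence regardless of phenotype.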
5. Knock-down strategies - Steve Ekker summarized the progress on morpholino (MO)-mediated knock-down screens. He argued that MOs are quite specific, since 20/21 MOs tested were able to phenocopy previously isolated mutants. Mistargeting is relatively rare (4/21 cases). He presented the first results of a screen designed to knock down 200 randomly chosen genes using two independent morpholinos. Of 22 genes, 5 gave a specific phenotype, indicating that this method might be quite powerful in defining genes involved in early development. He estimated that a screen of 5000 genes could be completed in 2.5 years, requiring ~$3.5 million, much of which is needed for morpholino purchase. He emphasized that identification of targets for MOs requires comprehensive information on gene sequences, particularly 5'UTRs.
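To put the pilot numbers in perspective, the sketch below computes the implied cost per gene and the uncertainty on the pilot hit rate. The choice of a Wilson score interval is an assumption made here for illustration and was not part of the presentation; the function names are hypothetical. With only 22 genes tested, the interval is wide, so any projection to 5000 genes should be read loosely.

```python
import math

def cost_per_gene(total_cost, genes):
    """Average cost of knocking down one gene."""
    return total_cost / genes

def wilson_interval(successes, trials, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    p = successes / trials
    denom = 1 + z**2 / trials
    center = (p + z**2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return center - half, center + half

# Figures quoted in the report.
print(f"cost per gene: ${cost_per_gene(3_500_000, 5_000):.0f}")

hit_rate = 5 / 22
low, high = wilson_interval(5, 22)
print(f"pilot hit rate: {hit_rate:.1%} (approx. 95% interval {low:.1%} to {high:.1%})")
```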
6. Knock-out technologies - Shuo Lin presented recent results demonstrating that fertile zebrafish can be cloned by transfer of nuclei from cultured fibroblasts. Although technically difficult, this approach might be combined with homologous recombination to generate knock-out fish. He also presented a high-throughput strategy to modify BACs (in collaboration with Nat Heintz at Rockefeller University).
7. Summary - At a summary session, Dr. Schier synthesized the presentations to propose specific initiatives for the future. He noted that forward screens using ENU or retroviruses will continue to be important to genetically dissect biological processes in zebrafish. He argued that reverse approaches are feasible and will dramatically change the field. To isolate a mutation in a specific gene, detecting point mutations in F1 fish derived from mutagenized males will be the method of choice for the foreseeable future. Since resequencing is quite costly, other approaches to detect nucleotide changes should be tested (e.g. using Cel1 a la Henikoff in Arabidopsis).
To disrupt 5,000 "random" genes, retroviruses appear to be the reagents of choice. The pilot screen using morpholinos against 200 genes will determine if this approach should be expanded to 5000 or more genes. In the interim, it might be wise to test other antisense reagents (e.g. other oligo chemistry, siRNA) to cut costs due to the high price of morpholinos.
For technology development, Dr. Schier emphasized the need to further improve cloning techniques and primordial germ cell cultures, develop homologous recombination, and develop insertional agents (transposons, retrovirus, lentivirus) that make genome manipulation more efficient. He suggested that large and targeted funds are mainly needed for good cDNA sets and bioinformatics.
Screen grants still have a hard time in "hypothesis-driven" study sections and should continue to be funded by PAs and helped through the system by members of the trans-NIH zebrafish committee. He expressed his hope that grants for technology development and resource development can be funded by NCRR and R21 funding mechanisms, and supplements might support special opportunities such as isolation of ENU-induced knock outs, microarrays, or transgenics.
The discussion among participants centered mainly on the continued need for grant support of screens and technology development. It was emphasized that screens for behavioral or adult phenotypes with disease relevance are particularly important to broaden the appeal of the zebrafish system. Members of the NIH stressed the need for members of the zebrafish community to serve on study sections and suggested that small business grants might fund initiatives such as resequencing of F1 fish to isolate knock outs in specific genes. Several participants argued that the best way to fund screens and technology development is through special initiatives, such as RFAs.
cDNA and EST resources - Gerald Rubin, chair
1. Significance of full-length cDNA collection - In the summary session, David Grunwald noted that the greatest issue of concern raised by the community is the challenge of translating the information generated by genomics initiatives into tools and resources that can be used for experimental analyses that identify and test gene functions. It was felt that the single most useful tool would be creation of a unigene collection of full length cDNAs that would be mapped onto the genomic sequence. The combination of library, sequence, and mapping information would support:
- Identification of genes, gene structure, and regulatory regions;
- Identification of 5' ends of transcripts for morpholino antisense design;
- Design and utility of microarrays;
- Enhanced retrieval of cDNAs for many purposes, including supporting the cataloguing of gene expression.
Similarly, Dr. Schier's summary of the mutagenesis session put the highest priority on a comprehensive cDNA set and an annotated genome, because these resources are the basis for the rapid identification of mutated genes or targets for reverse approaches.
2. Presentations - The session on cDNA and EST resources included three presentations about projects to collect full-length sequenced cDNAs. Elise Feingold described the NIH's Mammalian Gene Collection aimed at human and mouse cDNAs. Steve Johnson reported on the zebrafish EST project, and estimated that his collection may include several thousand full-length clones. S. Mathavan introduced a new zebrafish full-length cDNA project being launched at the Genome Institute of Singapore.
3. Summary - A consensus emerged from the discussions in the first two sessions that a collection of full-length sequenced cDNAs - a Zebrafish Gene Collection (ZGC) - will be of widespread use to the research community and is a high priority. However, at current costs (based on estimates derived from the MGC project) this resource would cost between $15 million and $20 million to generate. Thus it is important to consider approaches that could provide some of this utility in the short term while providing materials that will eventually be required to produce a complete cDNA collection. There was less consensus on the emphasis that should be placed on generating a set of open reading frames cloned in a versatile vector system such as the GateWay vectors.
a. Modified plan - One suggested plan, based on a total budget of $5 million, would be to spend $2.5 million on generating 5'-ESTs from new libraries (see below). This should allow the generation of between 750,000 and 1 million ESTs. These could be collapsed into a putative unigene collection of about 50,000 clones, which could be rearrayed, followed by sequence verification (by 5' EST sequencing) and 3'EST sequencing. The cost of this second stage would be about $500,000 (informatics, rearraying and 2 ESTs per clone). These sequences would be of great utility in annotating the genomic sequence and would provide the community with a very useful clone collection. Based on the experience with the Drosophila Gene Collection and MGC, it is anticipated that this resource would contain clones representing at least one splice product for about 60-70% of genes.
The remaining $2 million could be devoted to full-length sequencing of a selected set of clones. The estimated cost of this sequencing would be between $200 and $300 per clone, allowing 6,500 to 10,000 clones to be sequenced. These could be selected based on community interest in particular classes of proteins, similarity to human disease genes, etc. More sophisticated strategies for library normalization will be needed to get beyond this level of gene coverage in a cost-effective manner.
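The arithmetic behind the modified plan can be checked with a short calculation. The sketch below uses only the dollar figures and counts quoted in the plan; the variable and function names are hypothetical.

```python
TOTAL_BUDGET = 5_000_000
EST_STAGE = 2_500_000           # 5'-ESTs from new libraries
VERIFICATION_STAGE = 500_000    # informatics, rearraying, 2 ESTs per clone

# What is left for full-length sequencing after the first two stages.
remaining = TOTAL_BUDGET - EST_STAGE - VERIFICATION_STAGE

def clones_affordable(budget, cost_per_clone):
    """Whole number of clones that can be sequenced within the budget."""
    return budget // cost_per_clone

def cost_per_est(budget, est_count):
    """Implied average cost of one EST read."""
    return budget / est_count

print(f"remaining for full-length sequencing: ${remaining:,}")
print(f"clones at $300 each: {clones_affordable(remaining, 300):,}")
print(f"clones at $200 each: {clones_affordable(remaining, 200):,}")
print(f"cost per EST at 1,000,000 ESTs: ${cost_per_est(EST_STAGE, 1_000_000):.2f}")
print(f"cost per EST at 750,000 ESTs: ${cost_per_est(EST_STAGE, 750_000):.2f}")
```

The clone counts agree with the 6,500 to 10,000 range quoted in the plan once the lower bound is rounded.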
b. Libraries - A number of high-quality libraries have been or are being generated. The most useful of these for generating the Zebrafish Gene Collection (ZGC) are likely to be libraries made using a cap-trapping protocol such as that devised by the Riken group. (This conclusion is based on extensive experience in the DGC and MGC, and is consistent with the more limited data presented on zebrafish libraries.) It would make most sense to generate new ESTs from these libraries. This would have an additional advantage in providing the ZGC in a limited number of cloning vectors, which would simplify use of the resource in functional genomics studies.
c. Project coordination - There are at least two successful models for coordinating large-scale full-length cDNA sequencing projects. The MGC mechanism utilizes a central informatics group at the NCBI and several distributed library construction and sequencing centers. Taking advantage of this already established infrastructure, with its quality assurance mechanisms, would seem to be a very attractive and low risk option. Alternatively, it is possible to centralize these functions in a single genome center as has been done for the DGC.
d. Clone distribution - Concern was expressed for the difficulties encountered establishing reliable mechanisms for clone distribution relying solely on commercial distributors. This is a problem being faced by all EST and cDNA sequencing projects and it can be anticipated that a solution will be worked out by the MGC. However, it may be worth considering a community based molecular stock center as well.