PRIORITY SETTING FOR MOUSE GENOMICS AND GENETICS RESOURCES
William Dove and David Cox, Co-Chairs
(Revised 6/11/98(1) )
The NIH Director, Harold Varmus, convened a distinguished group of national and international scientists to define and establish priorities for the production of mouse genomics and genetics resources. Approximately 60 scientists met for three days in Bethesda, MD in March 1998 to discuss two major areas: structural genomics (mapping and sequencing resources) and functional genomics (specifically mutagenesis). The goal was to enable and facilitate research by the entire community of investigators who use the laboratory mouse as a tool for understanding mammalian biology. The recommendations for funding include four components: (1) structural analysis of the mouse genome; (2) functional analysis of mouse biology; (3) resources; and (4) training. The recommendations are given as estimated direct costs for the first year and length of effort in years. Below is a summary of these recommendations.
I. STRUCTURAL ANALYSIS OF THE MOUSE GENOME:
TOTAL DIRECT COST FOR FIRST YEAR: $22.2 M (19.2 M new)
IA. PHYSICAL MAPPING RESOURCES
Three types of physical maps were identified that would be crucial for scientists interested in cloning genes. High resolution maps, those with thousands of mapped "sequence tagged site" (STS) markers, were considered the most useful for the rapid isolation of genes. Approximately 100,000 markers will be needed. The combined U.S. and European efforts have generated approximately 40,000 markers; therefore, approximately 60,000 additional markers will be needed. These new markers should be a combination of random and gene-based markers and can be generated from ESTs, BAC end sequences, YACs, and plasmids. The cost of generating the additional markers has been included in the cost of generating the BAC and RH maps.
i. Bacterial Artificial Chromosome (BAC) Libraries/Map.
Two BAC libraries of 10X coverage should be constructed using different restriction enzymes; the strains chosen should be those that have been designated as reference strains. Five to ten additional libraries of lower coverage should also be constructed from a variety of commonly used mouse strains.
A fingerprinted BAC map of 10X depth anchored with 20,000 markers
of which half are gene/EST-based and the other half polymorphic
and random markers should be generated.
DIRECT COST FOR FIRST YEAR: $3.0 M---DURATION: 3 YEARS
ii. High Resolution Radiation Hybrid (RH) Map
Radiation Hybrid Panel. A radiation hybrid panel of approximately 100 hybrid cell lines should be developed using a background other than hamster in order to minimize cross-hybridization.
The radiation hybrid map generated should have a minimum resolution of approximately 100 kilobases.
DIRECT COST FOR FIRST YEAR: $5.0 M---DURATION: 3 YEARS
iii. Yeast Artificial Chromosome (YAC) Resource. A low resolution
YAC map of the mouse genome currently exists. This resource can
be used to develop additional markers for RH and BAC mapping and
to fill gaps in BAC maps and genomic sequences.
DIRECT COST FOR FIRST YEAR: $0.2 M---DURATION: 5 YEARS
IB. GENETIC MAPPING RESOURCES
There are three resources that would further define genetic variation in the mouse so as to facilitate the study of complex or quantitative traits: (a) genotyping additional mouse strains; (b) a low resolution (5 cM) single nucleotide polymorphism (SNP) map; and (c) consomic mouse strains.
i. Genotyping Additional Mouse Strains. A genetic map using simple
sequence length polymorphisms (SSLP) as markers exists and is
the basis for the current reference mouse genetic map. There are
many existing inbred mouse strains that could provide a rich resource
for the analysis of complex traits, but many are underutilized
because they are not well characterized genetically. The value
of these strains in understanding diseases would be increased tremendously
if approximately 6,000 SSLP markers were typed on 50 additional strains.
DIRECT COST FOR FIRST YEAR: $1.0 M---DURATION: 1 YEAR
ii. Low Resolution (5 cM) Single Nucleotide Polymorphism (SNP)
Map. A relatively new resource, SNPs, is becoming available to
researchers interested in common diseases of humans. Whereas such
a map has been shown to be of great value to researchers studying
human diseases, its value in mouse biology has not been demonstrated.
A low resolution mouse map with 1,500 to 2,000 SNP markers would
be adequate to determine whether this would be a useful resource
for mouse research. Before a SNP map can be generated, there are
two resources or technologies which must be in place: discovery
and detection of SNPs. Improving the technology to identify and
score SNPs and at the same time developing a low resolution SNP
map should be given a high priority.
DIRECT COST FOR FIRST YEAR: $1.0 M---DURATION: 2 YEARS
iii. Matched sets of mouse strains each carrying a different pair
of chromosomes with defined patterns of variation from another
mouse strain (Consomic Strains). One set of consomic strains is
currently being developed. These strains have the potential to
advance quantitative trait loci mapping and to readily map genes
responsible for the many physiological, biochemical, immunological,
neurological, developmental, and behavioral differences which have
already been documented between two strains.
DIRECT COST FOR FIRST YEAR: $1.0M--DURATION: 2 YEARS
IC. POSITIONAL CLONING RESOURCES
There are several reasons to support the sequencing of mouse cDNAs
at this time, particularly in light of the realization that the
mouse genome will probably not be sequenced within the next five
years. Sequencing of full-length cDNAs would enhance research about
structure/function, facilitate gene mapping, and identify a significant
number of the genes in the genome. Presently, the technology for
generating full-length cDNA libraries does not exist. However,
a large number of mouse cDNA libraries (less than full-length)
are available. In addition, unlike the human cDNA libraries, mouse
cDNA libraries can be generated for the various developmental stages
of the mouse embryo. The use of these libraries for generating
ESTs (gene tags) for mapping is a valuable resource for researchers
interested in mouse and human biology.
i. 3'-end Sequencing of ESTs for Mapping. Over 100,000 mouse ESTs
have been sequenced, but most of the information is from the 5' end.
The value of ESTs lies in knowing their mapped positions on the genome;
this requires information from the 3' end. Thus, sequencing of the
3' ends of mouse ESTs should be a high priority.
DIRECT COST FOR FIRST YEAR: $1.5M--DURATION: 3 YEARS
ii. Mapping of ESTs/cDNAs. The clones in the cDNA libraries gain enormous value if placed on a map. ESTs can be used as markers on BAC and radiation hybrid maps and are strong candidates for each gene that gains functional prominence due to an ENU-induced point mutation.
There is a small effort to develop a gene map of the mouse by
mapping a limited number of ESTs onto radiation hybrids. Whereas
this will be a useful initial resource, to be maximally useful
to scientists interested in gene discovery, additional ESTs will
need to be mapped.
DIRECT COST FOR FIRST YEAR: $1.0M--DURATION: 3 YEARS
iii. Specialized Libraries for Identifying Missing Genes. The
generation of any new cDNA libraries should focus on obtaining
most, if not all, transcripts. Investment in new technologies for
normalization and/or obtaining low abundant transcripts should
DIRECT COST FOR FIRST YEAR: $0.5M--DURATION: 3 YEARS
iv. Full-Length cDNA Libraries. The technology for generating
full-length cDNA clones is in the developmental stages and continues
to represent a challenge to the scientific community. Research
projects to significantly improve existing or generate new technology
for isolating full-length cDNAs should be supported.
DIRECT COST FOR FIRST YEAR: $2.0M--DURATION: 3 YEARS
In order to get a complete understanding of mouse biology, a comprehensive
study of the genome is required. Although sequencing priority should
be given to the human genome at least initially, it is important
that mouse genomic sequence be given relatively high priority.
Given the many uses and users of mouse genomic sequence, it was
decided that there should be increased capacity to generate mouse
genomic sequence in a high-throughput, highly accurate, and efficient
manner. It was recommended that the support for increasing the
sequencing capacity for the mouse begin soon, that 12 Mb be sequenced
in the first year of that process, ramping up to 400 Mb within 5 years,
and that a reference mouse genomic sequence be completed by 2008.
DIRECT COST FOR FIRST YEAR: $6.0 M; DURATION: 10 YEARS
II. FUNCTIONAL ANALYSIS OF MOUSE BIOLOGY: TOTAL DIRECT COST FOR THE FIRST YEAR: $17.1 M
IIA. GENOME-WIDE MUTAGENESIS AND PHENOTYPING
As the Human Genome Project progresses and the sequence of more human and mouse genes is determined, the function of a large number of these genes will not be predictable by sequence and expression alone. Phenotype-driven mutagenesis screens provide an important approach to understanding the function of genes. In the mouse, ENU produces mutation rates sufficient to perform genome-wide mutagenesis screens. These mutagenesis screens are highly efficient and effective in isolating mutants systematically and comprehensively for available phenotypes. What is missing, however, are standardized protocols for mutagenesis and improved tools and a variety of assays for characterizing phenotypes. Therefore, the following is recommended:
i. Standardization of Mutagenesis Protocols. For the various mutagenesis
screens currently in use, there is significant variation in the
mutation rates, strains used, dosages, and breeding protocols employed.
Ideally, to understand these variations, an assessment at the
molecular level would be appropriate. However, because of the expense
involved, it was recommended that screening protocols for ENU be
standardized for different strains of mice (used for different
phenotypes), that the procedures for ENU mutagenesis be disseminated
widely, and that training in ENU mutagenesis procedures be established.
TOTAL DIRECT COST FOR FIRST YEAR: $0.5M DURATION: 1 YEAR
ii. Centers for ENU Mutagenesis and Phenotyping. One way to ensure the systematic and comprehensive analysis of phenotypes is to establish ENU Mutagenesis and Phenotyping Centers. Such Centers might include a mutagenesis core; a phenotypic screening core (with technology development to improve the procedure); a mapping core; a database core for dissemination of information; a sperm cryopreservation and DNA bank of G1 mice; and a mutant/sperm distribution core. These Centers might handle a steady-state annual population of 20,000 progeny of ENU-mutagenized mice, of which 2,000 would be first-generation progeny permitting the detection of strong dominant alleles. The Centers would also provide access for investigators outside the focal emphases of the Center to screen for mutant lines.
DIRECT COST FOR FIRST YEAR: $9.0M DURATION: 3 YEARS
IIB. DEVELOPMENT OF PHENOTYPING PROTOCOLS
i. Technology Development. Complete characterization of the phenotype of mutagenized mice is a major challenge in all mutagenesis studies. There is an urgent need for improved technologies for mouse physiology, pathology, antibody markers, reporter mice, expression arrays, behavior analysis, etc. These technologies should be developed within ENU Centers and by individual investigators.
DIRECT COST FOR FIRST YEAR: $6.0M DURATION: 5 YEARS
ii. Technology Transfer. Researchers working with the rat have
developed phenotypic screens for rats that could be adapted for
the mouse. Efforts should be made to transfer that technology to
the mouse system, taking into consideration the smaller size of the mouse.
DIRECT COST FOR FIRST YEAR: $1.0M DURATION: 5 YEARS
IIC. TARGETED MUTAGENESIS
Targeted mutagenesis is a well-established technology and widely used in the community. Standard approaches for generating null mutations are now being complemented by generation of allelic series of mutations and tissue-specific and induced mutations. Issues of standardization and resource availability, enhancement of emerging technologies, and new tools for in-depth phenotypic analysis need to be addressed to fully exploit the mutations being generated. Improvements were suggested in the following two areas.
i. ES lines from different mouse strains need to be validated
for specialized uses. A better understanding of the biology of
ES cells could assist in such derivation.
DIRECT COST FOR FIRST YEAR: $0.1M DURATION: 5 YEARS
ii. There is a need to study mutations on different strain backgrounds,
requiring backcrossing from mouse strain 129 to other strains.
Molecular genotyping in conjunction with the construction of congenic
strains (speed congenics) will assist this approach. The development
of such a resource will be more to deliver a useful resource to
DIRECT COST FOR FIRST YEAR: $0.5M DURATION: 3 YEARS
III. RESOURCES: TOTAL DIRECT COST FOR FIRST YEAR: $7.8 M
There are a variety of resources and needs that if addressed on a global scale would facilitate research and result in significant savings. Several such resources and needs were addressed and include the following:
There are many more mouse mutants that exist than can be maintained as live stock because of the cost and the demand or lack thereof. Cryopreservation of sperm and ovary would facilitate future studies of these mutants and reduce the costs of maintaining live animals. There are two areas that need support.
i. Technology development projects to investigate the conditions
that would make cryopreservation and recovery of sperm and ovary
following cryopreservation very effective and efficient should
be supported. An assessment of pathogen transfer should also be
included in these studies.
DIRECT COST FOR FIRST YEAR: $2.0 M; DURATION: 2 YEARS
ii. Develop a storage facility for frozen gametes. The facility
should have a visitor's laboratory to allow access to cryopreservation
technology and a recovery resource for scientists who do not have
access to the technology.
DIRECT COST FOR FIRST YEAR: $1.5 M; DURATION: 5 YEARS
IIIB. REPOSITORY OF LIVE MOUSE STRAINS
The maintenance of live mouse strains is very expensive, especially for small laboratories. A repository that would maintain the most commonly used strains would facilitate research. A repository capable of accommodating 250 new strains per year would be a valuable resource. Because of the varied needs of the community, it would be essential to have a sub-committee of scientists evaluate which strains to maintain as live stock and which to preserve as frozen germplasm.
DIRECT COST FOR FIRST YEAR: $3.0 M; DURATION: 5 YEARS
IIIC. TRANSFER OF TARGETED ALLELES ONTO STANDARD BACKGROUND
Gene targeting experiments have been conducted on a variety of
cell lines, thus limiting the uses of these resources by many scientists.
It is recommended that the existing set of targeted alleles be
transferred onto a chosen standard background, 129/Sv. The 129/SvEvTac
may be the appropriate substrain, but this must be confirmed by
the community of users. A complementary background strain (C57BL/6)
should be employed in parallel, enabling the production of a uniform
F1 hybrid background for each targeted allele, on which phenotyping
is particularly robust.
DIRECT COST FOR FIRST YEAR: $0.5 M; DURATION: 2 YEARS
IIID. DATABASE RESOURCES
The availability of genomic and genetic resources to facilitate research will result in a large amount of data being generated. A variety of databases exist for collecting these data. The usefulness of the data to others will depend upon the ease with which the generators of large data sets can submit their data to public databases and the ability of databases to communicate with each other. There are two areas needing attention:
i. Establish criteria to ensure that (a) large data sets can be easily deposited
in a centralized public repository and (b) the centralized public
repository captures information from, or develops interfaces to, a
variety of databases and presents it to users in a manageable form.
DIRECT COST FOR FIRST YEAR: $0.3 M; DURATION: 1 YEAR
ii. Expand the animal models database to include disease states.
DIRECT COST FOR FIRST YEAR: $0.5 M; DURATION: 4 YEARS
IV. TRAINING: TOTAL DIRECT COST FOR FIRST YEAR: $2.2 M
i. Cryopreservation has the potential to save many dollars. The
technology is still developing, but additional individuals need
to be trained in this new technology. It is recommended that three
training sessions be held each year to accommodate the need.
DIRECT COST FOR FIRST YEAR: $0.2 M; DURATION: 3 YEARS
ii. There are very few animal pathologists who are involved in
mouse research. A training program to support veterinary fellows
in a two-year fellowship in mouse pathology should be implemented.
DIRECT COST FOR FIRST YEAR: $2.0 M; DURATION: 5 YEARS
V. OTHER ISSUES
During the course of the meeting, several issues were brought up that were not fully discussed. Below is a list of these topics that need further exploration and discussion.
1. Oversight Sub-Committees. A format was suggested for several types of oversight sub-committees that would: (a) identify reference mouse strains for genomic and cell line reagents; (b) determine which mouse genomic DNA regions should be sequenced first; and (c) discuss the best ways to establish and coordinate centralized databases.
2. Mouse Husbandry Costs. The issue of costs in maintaining mice was discussed. Some participants thought that the high cost of maintaining these animals was due to federal regulations. Participants were encouraged to send their comments to the Director, NIH, who is looking into this matter.
I. STRUCTURAL ANALYSIS OF THE MOUSE GENOME: $22.2 M
II. FUNCTIONAL ANALYSIS OF MOUSE BIOLOGY: $17.1 M
III. RESOURCES: $7.8 M
IV. TRAINING: $2.2 M
TOTAL FOR MOUSE RESOURCES (FOR YEAR 01): $49.3 M
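The first-year totals quoted for sections I through IV can be cross-checked against the itemized direct costs listed under each heading. The sketch below is only an illustrative check: the dictionary names and the grouping comments are assumptions made here, the dollar figures are copied from the itemized lines of this summary, and the grand total is simply the sum of the four stated section totals.

```python
from decimal import Decimal

# First-year direct costs in $ millions, copied from the itemized
# recommendations in sections I through IV of this summary.
# The grouping and names below are illustrative, not part of the report.
ITEMIZED = {
    "I. Structural analysis": [
        "3.0", "5.0", "0.2",         # physical mapping: BAC, RH, YAC
        "1.0", "1.0", "1.0",         # genetic mapping: genotyping, SNP map, consomics
        "1.5", "1.0", "0.5", "2.0",  # cDNA/EST items
        "6.0",                       # genomic sequencing
    ],
    "II. Functional analysis": ["0.5", "9.0", "6.0", "1.0", "0.1", "0.5"],
    "III. Resources": ["2.0", "1.5", "3.0", "0.5", "0.3", "0.5"],
    "IV. Training": ["0.2", "2.0"],
}

# Section totals as stated in the section headings.
STATED_TOTALS = {
    "I. Structural analysis": "22.2",
    "II. Functional analysis": "17.1",
    "III. Resources": "7.8",
    "IV. Training": "2.2",
}


def section_totals(itemized):
    """Sum the itemized costs per section using exact decimal arithmetic."""
    return {
        name: sum((Decimal(v) for v in values), Decimal("0"))
        for name, values in itemized.items()
    }


totals = section_totals(ITEMIZED)
grand_total = sum(totals.values(), Decimal("0"))

for name, total in totals.items():
    stated = Decimal(STATED_TOTALS[name])
    status = "matches" if total == stated else "differs from"
    print(f"{name}: {total} M ({status} stated {stated} M)")
print(f"Sum of the four sections: {grand_total} M")
```

Decimal strings are used rather than floats so that tenths of a million add up exactly.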
BREAKOUT GROUP A: Physical Mapping
· Generate fingerprinted BAC map of 10X depth, anchored with 20,000 markers (approximately 80% coverage). Half the markers should be gene/EST-based, half should be a combination of polymorphic and random markers. Besides its immediate use for positional cloning, this resource could make a starting point for a future sequence-ready map. The resource should be developed using the 129 SVEV Tac strain.
· Generate additional BAC libraries to facilitate construction of map. Two libraries (using different restriction enzymes) of 10X coverage, in vectors that provide for ease of quality control, maximum adaptability for future purposes and multiple application. Five to ten additional libraries of 3X coverage constructed from different widely used mouse strains (males).
· Software development for fingerprint analysis, and for making the resource accessible to the community, especially smaller labs.
Cost estimate: $9 M Duration: 3 years
Radiation Hybrid (RH) Panel/Map
· Develop a new RH panel of 100 kb resolution using a background other than hamster to minimize cross-hybridization. This may require some lead time to construct, so should be a high priority.
· Generate ~60,000 new STS markers (from a mix of BAC end sequences, YACs, and plasmids) to add to the ~45,000 unigene markers now being developed. These markers can also be used for the fingerprinted BAC map.
· Place all ~100,000 markers on the new RH panel.
Cost estimate: $15 M Duration: 3 years
· Make existing YAC resources more accessible for screening, for example by providing well-defined pools.
Cost estimate: $1 M Duration: 5 years
BREAKOUT GROUP B: cDNAs/ESTs
1. Sequencing Additional 3' ESTs to Facilitate Mapping and Gene Identification
A. New Library Generation
· Use Normalization/Subtraction
· Cost: $0.5 Million Duration: 2 Years
B. EST Sequencing
· 200,000 3' EST Sequences
· For Unique Clones - 5' EST Sequences
· Cost: $1.5 Million Duration: 3 Years
2. Full-Length cDNA Sequencing - Major Priority
A. Technology Development and Generation of Representative Full-Length cDNA Libraries
· Cost: $5 Million Duration: 3 Years
B. cDNA Sequencing
· 3% of Genome (estimated coding sequences)
· 90 Mb
· Cost: $45 Million @ $0.50/bp Duration: 3-5 Years
3. Prioritization of Strains/Tissues for Library Generation and Quality Control
· Community Steering Committee
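The cDNA sequencing estimate in this outline follows from three quoted figures: 3% of the genome, 90 Mb, and $0.50 per base pair. The sketch below reproduces that arithmetic. The function names are hypothetical, and the genome size is not stated by the group; it is only the value implied by the 3% and 90 Mb figures.

```python
# Figures quoted in the Breakout Group B outline.
CODING_PERCENT = 3     # "3% of Genome (estimated coding sequences)"
CODING_MB = 90         # "90 Mb"
COST_PER_BP = 0.50     # "$0.50/bp"


def implied_genome_mb(coding_mb, coding_percent):
    """Genome size (Mb) implied by the coding fraction; an inference, not a quoted figure."""
    return coding_mb * 100 / coding_percent


def sequencing_cost_usd(megabases, cost_per_bp):
    """Total cost in dollars for sequencing the given number of megabases."""
    return megabases * 1_000_000 * cost_per_bp


genome_mb = implied_genome_mb(CODING_MB, CODING_PERCENT)
cdna_cost = sequencing_cost_usd(CODING_MB, COST_PER_BP)

print(f"Implied genome size: {genome_mb:,.0f} Mb")
print(f"cDNA sequencing cost: ${cdna_cost / 1e6:,.0f} Million")
```

The computed cost agrees with the $45 Million quoted in the outline.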
BREAKOUT GROUP C: Genetic Mapping
The breakout group concluded that the highest priority for the future of genetic mapping in the mouse was a SNP map. This would greatly simplify genotyping for several different purposes, e.g. positional cloning, identifying QTLs, analysis of LOH. It was also considered to be of very high priority for the SNPs to be in the public domain as quickly as possible, so that the academic community could have access to them. The group distinguished two aspects of the SNP question, SNP discovery and SNP deployment (scoring).
A specific recommendation was made with respect to SNP discovery. Initially, a relatively small number of SNPs, on the order of 1500-2000, should be generated. This would make it likely that in any particular pair of strains used in a cross, at least 300-400 SNPs would be informative, allowing a genome scan at about 5 cM resolution. In the first instance, SNPs should be chosen to give a uniform distribution across the genome, and to be useful for about 10 different diverse strains. It was estimated that a project of this magnitude would cost about $1-2M and it was recommended that two such projects be supported, in order to provide healthy competition for the SNP developers. It was thought that this could be done in two years, at which time the value of the SNP approach could be determined and, if warranted, a more dense collection could be developed. It was considered important for the initial set of SNPs to be integrated with existing maps, but to leave the mechanism for doing so to be determined by competition.
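The link between the number of informative SNPs and the scan resolution can be illustrated with a short calculation. This is a rough sketch only: the genetic map length of about 1,500 cM is an assumption introduced here (the group did not state one), evenly spaced markers are assumed, and the function names are hypothetical.

```python
# ASSUMPTION: approximate total length of the mouse genetic map in cM.
# This value is not given in the report and is used only for illustration.
ASSUMED_MAP_LENGTH_CM = 1500


def average_spacing_cm(informative_markers, map_length_cm=ASSUMED_MAP_LENGTH_CM):
    """Average distance between evenly spaced informative markers."""
    return map_length_cm / informative_markers


def informative_fraction(informative_markers, total_markers):
    """Share of the SNP set expected to be informative in a given strain pair."""
    return informative_markers / total_markers


# Figures quoted by the group: 1500-2000 SNPs, of which 300-400 informative.
for total, informative in [(1500, 300), (2000, 400)]:
    spacing = average_spacing_cm(informative)
    fraction = informative_fraction(informative, total)
    print(f"{total} SNPs, {informative} informative "
          f"({fraction:.0%}): about {spacing:.2f} cM between markers")
```

Under the assumed map length, 300 informative markers correspond to the 5 cM resolution quoted by the group, and both ends of the quoted ranges imply that roughly one SNP in five is informative for a given pair of strains.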
With respect to deployment, it was felt that the development of the initial set of SNPs did not imply any specific technology for further use. Rather, the group recommended the support, chosen through a competition, of a variety of new approaches to scoring, that would lead to widespread use by a large number of small laboratories, as well as by the major users. It was also noted that technologies for scoring SNPs would also be applicable to the detection of induced point mutations.
As for the choice of strains, the group thought that it would be most appropriate for recommendations about specific strains to be left to a small committee of experts that would include representatives from the NIH, from the Jackson Lab, and from Europe. It was felt that a list developed by this group should then be circulated to the wider community for a period of comment and validation. Several factors were suggested for consideration in selecting the strains: genetic diversity, usefulness in studying a variety of phenotypes, and usefulness in mutagenesis and knockout screens.
A second priority for genetic mapping resource was the development of additional sets of consomic strains. Currently, one such set is being developed, a set of A/J chromosomes onto a B6 background. The group thought that the recombinant consomic technology had the potential for major advances in QTL mapping, and that its usefulness would extend well beyond the traditional mouse genetics community in the sense that it would allow physiologists, developmental biologists, neurobiologists, etc. to rapidly and simply map genes that determine or modify phenotypes of interest. In this case, the recommendation was that initially NIH support the development of about 3-5 sets of particularly useful strains. Although the ultimate choice of strains should be made in a competition, it was felt a matched set of 129 and B6 consomic strains would be extremely useful. It was estimated that each set would cost about $500,000.
A third recommendation was to characterize more inbred strains by typing them with a large number of SSLP markers. It was thought that existing inbred strains provide a rich resource for the analysis of complex traits, but many are underutilized because they are not well genetically characterized. It was estimated that a one-time project typing about 6000 markers on 50 strains would cost about $1M and take about a year, and had the potential for opening up many more strains for the study of mutations, QTLs and modifiers.
Finally, a radiation hybrid panel that would allow markers to be ordered at higher resolution than the existing RH panel would be of value. Again, generating this resource would be a one-time effort, and was estimated to cost about $500,000.
BREAKOUT GROUP D: Genomic Sequencing
How will the mouse sequence be used?
This must be understood to address the questions below. Several different areas were identified:
1. positional cloning -- 1-2 BACs for 25-50 projects a year
2. region specific mutagenesis - 3-5 Mb
3. comparative sequences to define other elements, breakpoints etc.
4. reverse genetics -- small targeted regions
5. reference tool to interpret human sequence
For small projects directed at getting specific genes, the group concluded that means should be devised to enable small groups to obtain the needed sequence. This might, for example, involve skimming, followed by more accurate sequence for the gene of interest. This scale of sequencing should be brought within reach of most labs. The other projects however require more systematic sequencing on a larger scale and would provide a long term resource to the community. Comparative analysis of mouse and human sequence was the most compelling use in the eyes of the group.
What quality for the sequence?
This question pits coverage against completeness. There was considerable
discussion of alternatives, including options such as 2x shotgun
of BACs across the genome, or leaving a limited number of gaps
to reduce costs. However, there was greater enthusiasm for a commitment
to significant amounts of mouse sequence in the next few years
and to get the whole mouse genome done in a reasonable time (vide
infra). The group concluded that the temporary advantages of quicker,
broader coverage were outweighed by the added costs incurred in
achieving the ultimate goal of the complete reference mouse sequence.
Accordingly, there was broad agreement that the mouse sequence
standards should be the same as for human.
What is the priority of completing the mouse relative to the human?
The greatest value would result from parallel efforts on human and mouse. However, practical limits of resources and capacity and the commitment to complete the human genome by 2005 will necessitate that priority be given to human sequencing. Nonetheless, the greatly added value that the mouse sequence will give to the human makes it imperative that mouse sequencing not lag far behind human. It is important to begin a significant effort on mouse sequencing now, and expand that as rapidly as resources, capacity and the commitment to sequence the human allow. For example, the group felt that it was more important to do more mouse sequence than to finish the human early.
What priorities should be followed in the choice of mouse genomic regions to be sequenced?
If individual labs are enabled to do sufficient sequencing to support positional cloning efforts, they will establish their own priorities. For larger systematic efforts, it would make most sense to focus on regions of the mouse genome syntenic to those already sequenced in human. Larger regions of particular biological interest should also be considered. Selection of specific regions that match these broad guidelines is best left to individual labs in dialog with larger scale sequencing groups. A tool similar to the Human Genome Sequencing Index maintained by the NCBI should be constructed for the mouse genome and the syntenic regions of the human and mouse map should be linked. Such a tool will be useful to mouse researchers in determining what groups are mapping and sequencing the human genome in regions syntenic to their regions of interest and it will help to track and coordinate the mapping and sequencing of the mouse.
Is strain a consideration for generating a reference map/sequence?
Overall, the group felt there was little scientific reason for favoring one strain over the other. The one mandate was that the sequence be derived from a single well characterized strain, and not be a mosaic of different strains. The one practical consideration is that the BAC clones from 129 are already heavily used. Whatever strain is chosen, the selection should be made known, and the clones and other resources must be readily available.
What should be the priorities for resource generation in the sequencing area?
Highest priority should be given to the development of a sequence ready map. The cost of building a BAC overlap clone map was estimated to be approximately $7M. Initial fingerprinting, contig building and chromosomal positioning could be done over three years. Efforts at closure would extend over two more years.
The next priority is to begin systematic sequencing of the mouse genome. Every effort should be made to increase available sequencing capacity to allow the amount of mouse sequence completed annually to grow from the 10-12 Mb in the next year to 400 Mb in 2005. The goal should be to obtain 1 Gb of mouse genomic sequence in this period. The remainder of the mouse genome should then be completed in the next three years. Thus by 2008 a complete reference mouse sequence should be available.
Finally, means should be found to facilitate the sequencing of small regions (100-200 kb) by R01 funded labs. Equipment seems less of a problem than access to methods and informatics support to handle the relatively large data sets. Perhaps a resource similar to that in the UK could be established.
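The sequencing ramp described above (about 12 Mb in the first year, 400 Mb in 2005, 1 Gb over the period, and the remainder in the following three years) can be explored with a simple growth model. The sketch below rests on several assumptions that are not in the group's report: output grows by a constant factor each year, the ramp spans seven yearly steps (taken here as 1999 through 2005), and the genome is about 3,000 Mb, the size implied by the cDNA outline's figures of 3% and 90 Mb. The function name is hypothetical.

```python
def geometric_ramp(start_mb, end_mb, n_years):
    """Yearly output growing by a constant factor from start_mb to end_mb."""
    factor = (end_mb / start_mb) ** (1 / (n_years - 1))
    return [start_mb * factor ** i for i in range(n_years)]


# ASSUMPTIONS (illustrative only): seven yearly steps and a 3,000 Mb genome.
N_YEARS = 7
ASSUMED_GENOME_MB = 3000
GOAL_BY_2005_MB = 1000        # "1 Gb of mouse genomic sequence in this period"
YEARS_TO_FINISH = 3           # "completed in the next three years"

ramp = geometric_ramp(12, 400, N_YEARS)
cumulative_mb = sum(ramp)

# Rate needed after 2005 if the 1 Gb goal is met exactly.
required_rate = (ASSUMED_GENOME_MB - GOAL_BY_2005_MB) / YEARS_TO_FINISH

for year, mb in zip(range(1999, 1999 + N_YEARS), ramp):
    print(f"{year}: {mb:6.1f} Mb")
print(f"Cumulative through 2005: {cumulative_mb:.0f} Mb")
print(f"Required rate for 2006-2008: {required_rate:.0f} Mb per year")
```

Under these assumptions the cumulative output through 2005 falls somewhat short of 1 Gb, and finishing by 2008 would need an annual rate well above the 400 Mb reached in 2005. A different growth curve or start year would change both numbers.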
BREAKOUT GROUP E: Targeted Mutagenesis
Targeted mutagenesis is a well-established technology and widespread in the community. Its value for analyzing gene function is undisputed. Standard approaches for generating null mutations are now being complemented by generation of allelic series of mutations and tissue-specific and induced mutations, which further enhance the utility of this approach. The group focussed on issues of standardization and resource availability across the community, enhancement of emerging technologies and the importance of new tools for in depth phenotypic analysis to fully exploit the mutations being generated.
A. Standardization of resources for gene targeting.
There was a strong consensus for use of a standard genetic background for gene targeting. Current use suggests that 129SvEvtac is the line of choice, for stability of lines and ease of maintaining inbred lines. Validated lines exist on this background.
- Support a mechanism for affordable, reliable distribution of validated 129SvEvtac (Hprt-)ES lines to the community.
Cost: $100k/yr, ongoing
It was also recognized that there is a need for validated ES lines from different strains for specialized uses. A better understanding of the biology of ES cells could assist in such derivation.
- Support establishment of validated ES cell lines from other strains and stable female cell lines.
Cost: $500k/yr for 3 years
There is a need to study mutations on different strain backgrounds, requiring backcrossing from 129 to other strains. Speed congenics assist in this approach, but are not easy for the average lab.
- Develop set of SSLPs for easy generation of speed congenics onto standard backgrounds (some may be available already)
Cost: $100k/yr for 2 years.
Issues of targeting technology were considered. The rate-limiting step in targeting is making the vector. The availability of BAC libraries in the 129SvEvtac strain is very important in this regard. One study suggests that it may be possible to use the BACs directly to make targeting vectors, which would then have very long regions of homology to enhance targeting efficiency.
- Test the feasibility of direct generation of targeting constructs in BACs and the possibility of targeting by zygote injection.
Cost: $300k/yr for 3 years
Possible strategies for generating genome-wide directed mutations were considered, such as sequence-tagged gene trap libraries or libraries of ENU-mutagenized sperm. The group did not support a new public-domain genome-wide gene trap effort and felt that the marketplace would determine the usefulness of Lexicon's current library. Any large-scale ENU screening center should be encouraged to archive sperm and DNA for possible sequence-based screening.
B. Conditional targeting strategies
The power of tissue-specific and inducible mutagenesis for dissection of the full roles of a gene is enormous. There was a strong feeling that the Cre-Lox system was working and that the main impediment to further use is the limited availability of fully characterized lines expressing Cre in a defined spatial and temporal manner. It was felt that many of the problems of mosaic Cre expression could be overcome by insertion of Cre into the genome so that it is controlled by endogenous regulatory elements. In addition, good ubiquitous lox-stop-lox reporter lines to validate Cre expression are needed. For inducible targeting, the nuclear receptor-Cre fusions show considerable promise and are being worked on extensively.
- Support effort to make versatile ubiquitous excision reporter lines and lines of mice expressing Cre in defined spatial and temporal patterns by knock-ins or enhancer trap approaches
Cost: $2M/yr for 4 years
There is a clear case for a centralized facility to distribute the Cre lines and to create a database of the Cre lines. This should be costed under the resource facility for mutant distribution. There is still an outstanding patent issue with DuPont, which could inhibit distribution of this resource. The group considered the site-specific recombinase technology to be so important that, if this issue is not resolved soon, there must be an immediate switch to an alternate enzyme system to avoid wasting effort.
C. Phenotypic analysis
Phenotypic analysis is a major challenge in all mutagenesis efforts. There is an urgent need for improved technologies for mouse physiology, pathology, antibody markers, reporter mice, expression arrays, behavior analysis, etc. It was felt that individual Institutes will need to invest in this area to obtain the full value of the mutant resource for understanding mouse biology as it bears on human disease.
- The NIH should convene a meeting this year to discuss enabling technologies for mouse phenotyping in different organ and tissue systems.
D. Additional issues
1. Need for continued support for facilities to store and distribute mouse mutants.
2. Importance of cryopreservation for mutant stocks.
3. Need for mutation database, fully annotated with phenotypic information.
4. Need to address animal husbandry and whether current costly procedures are necessary for mouse welfare.
Breakout Group F: ENU Mutagenesis
Rationale: Phenotype-driven mutagenesis screens provide an important approach to understanding the function of genes. As the genome project progresses and the sequences of more human and mouse genes are determined, the function of a large number of genes will not be predictable from sequence and expression alone. Mutations have provided a powerful approach to this problem. In the mouse, the supermutagen ENU produces forward mutation rates (~1/650 per locus per gamete) sufficient to perform genome-wide mutagenesis screens at significant coverage so that, on average, mutations can be obtained in almost any gene. Phenotype-driven ENU mutagenesis screens provide a complementary approach to gene targeting methods because both gain-of-function and loss-of-function alleles can be obtained and, importantly, no pre-existing knowledge of the target genes is required. This forward genetic approach has been widely used and proven in model genetic organisms; in the mouse, however, large-scale screens need to be performed in order to determine whether ENU mutagenesis can be applied broadly to understand gene function. There was clear consensus from experience within the Breakout Group and elsewhere that ENU screens are highly efficient and effective in isolating mutants for particular phenotypes. In order to apply this approach in a more systematic manner, the group recommends that a number of new initiatives be launched, as described below.
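The coverage argument in the rationale can be made concrete with a short calculation. The sketch below is illustrative only: it assumes that every gamete is an independent trial and that the quoted forward rate of about 1/650 per locus per gamete applies uniformly to every locus, neither of which the report states. The function names are hypothetical.

```python
import math

# Forward mutation rate quoted in the rationale (per locus per gamete).
ENU_RATE = 1 / 650


def prob_at_least_one_hit(rate: float, gametes: int) -> float:
    """Chance that a given locus is mutated in at least one of `gametes`
    screened gametes, assuming independent gametes and a uniform rate."""
    return 1.0 - (1.0 - rate) ** gametes


def gametes_for_coverage(rate: float, target: float) -> int:
    """Smallest number of gametes giving at least `target` probability
    of one or more hits at a given locus (same assumptions)."""
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - rate))


if __name__ == "__main__":
    for n in (650, 2000, 5000):
        p = prob_at_least_one_hit(ENU_RATE, n)
        print(f"{n} gametes: P(at least one hit) = {p:.3f}")
    print("gametes for 95% coverage:", gametes_for_coverage(ENU_RATE, 0.95))
```

Under these assumptions, screening a number of gametes equal to the reciprocal of the rate leaves roughly a one-in-three chance of missing any particular locus, which is why screens of several thousand animals are discussed.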
Ongoing large-scale ENU mutagenesis screens
Currently there are two large-scale ENU screens that have been initiated in Europe. Rudi Balling at the GSF is conducting an integrated screen of 40,000-50,000 mice for a number of dysmorphic and blood-assayable biochemical phenotypes. Steve Brown at Harwell is conducting a similar-size screen of neurobehavioral phenotypes. In the first 6-12 months of the screens, significant numbers of mutants have already been obtained (about 150 mutants from 14,000 mice in a one-generation dominant screen at GSF, and about 60 mutants out of 3000 mice screened at Harwell). Thus, large-scale "Center-based" integrated ENU screens appear feasible, efficient and productive.
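The early yields quoted above can be expressed as mutants recovered per mouse screened. The sketch below simply divides the approximate counts given in the text; the dictionary layout and function name are hypothetical, and the two screens assay different phenotypes, so the rates are not directly comparable.

```python
# Approximate early counts quoted in the text: (mutants, mice screened).
SCREENS = {
    "GSF (dominant, dysmorphic/biochemical)": (150, 14_000),
    "Harwell (neurobehavioral)": (60, 3_000),
}


def recovery_rate(mutants: int, mice_screened: int) -> float:
    """Fraction of screened mice scored as mutant."""
    return mutants / mice_screened


if __name__ == "__main__":
    for name, (mutants, mice) in SCREENS.items():
        rate = recovery_rate(mutants, mice)
        print(f"{name}: {rate:.2%} of screened mice scored as mutant")
```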
Questions to the Breakout Group.
1. How can the efficiency of mutagenesis be monitored and improved?
In discussing the various screens, the group concluded that there is significant variation in the mutation rates observed, strains used and breeding protocols employed. Experience at Oak Ridge (Monica Justice) and Madison, WI (Bill Dove and Alexandra Shedlovsky) suggests that strict quantitation of ENU doses is essential for achieving high mutagenesis rates (rates of 1/200 per locus per gamete). There is a need for protocol validation with different strains of mice (used for different phenotypes) and a need for dissemination of, and training in, ENU mutagenesis procedures to make this approach more widely useful to the scientific community.
The group felt that adequate methods for assessing mutagenesis rates at the molecular level are not cost-effective at this time. It would require scanning megabase-pair regions throughout the genome to sample ENU-induced mutations adequately (the estimated molecular rate for ENU would be about 1 per 100,000 bp).
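The scale of the scanning problem follows from the quoted molecular rate. The sketch below is a rough illustration that assumes the figure of about 1 mutation per 100,000 bp applies per mutagenized genome and that mutations are Poisson-distributed along the sequence; both are assumptions, and the function names are hypothetical.

```python
import math

# Estimated molecular rate quoted in the text: about 1 mutation per 100,000 bp.
BP_PER_MUTATION = 100_000


def expected_mutations(scanned_bp: int) -> float:
    """Expected number of ENU-induced mutations in a scanned region."""
    return scanned_bp / BP_PER_MUTATION


def prob_no_mutation(scanned_bp: int) -> float:
    """Chance of finding no mutation at all in the scanned region,
    assuming a Poisson distribution of mutations."""
    return math.exp(-expected_mutations(scanned_bp))


if __name__ == "__main__":
    for bp in (10_000, 100_000, 1_000_000):
        print(f"{bp} bp scanned: expected {expected_mutations(bp):.1f} "
              f"mutations, P(none) = {prob_no_mutation(bp):.3f}")
```

On these assumptions, a scan of a few kilobases per animal would usually find nothing, whereas a megabase would be expected to contain on the order of ten mutations, consistent with the group's statement that megabase-scale scanning would be required.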
2. How can the cloning of point mutations be facilitated?
The group strongly endorsed the recommendation of the structural genomics groups that a complete physical map of the mouse genome in BACs be made and that a high priority should be placed on full length cDNA cloning and sequencing (to high quality sequence standards).
3. What approaches are needed to improve the ability to identify recessive loss-of-function alleles?
The group endorsed ENU screens using targeted deletions in regions of the mouse genome that will have high priority for sequencing because of synteny with human, gene-rich regions, or regions of particular interest. In addition, large-scale recessive screens would be encouraged using the Centers of Excellence recommended below.
4. How can screening for phenotypes and validation of mutations be improved?
See recommendations below.
5. What are the priorities for chemical mutagenesis and what is the cost vs. the benefit?
I. The NIH should support systematic ENU screens in mice. Both Centers and individual investigators should be supported. A significant investment is required to test the feasibility of ENU screens as an approach to study the function of genes and pathways for broadly based phenotypes.
A. Establish Centers of Excellence for ENU screening that have a biological focus (with self-selected groups of investigators and consortia). These Centers would have:
- Mutagenesis core
- Phenotypic screening core (and technology development)
- Database core for dissemination of information
- Mapping core
- Sperm cryopreservation and DNA bank of G1 mice
- Mutant/sperm distribution provision
Establish at least 3 Centers of Excellence at $1-2M per year for 5 years.
B. Initiate an RFA to support individual investigators for other ENU screens that are not optimal for the Center format.
- Specialized phenotypes
- These R01s would have access to core facilities at Centers.
II. Initiate an RFA to establish baseline characterization of a set of standard (consensus) phenotypes (behavioral, neurological, motor, endocrine, biochemical, etc.) on a set of inbred strains of mice. This would provide a foundation of information on physiological parameters for strains of mice. The phenotypic characterization would be coordinated with the Centers and information would be disseminated by the database core facilities. In addition, the RFA should include development of technology for phenotypic screens in mice, for example, the transfer of assays currently used in the rat to the mouse. $1-3M per year for 5 years.
A systematic validation of ENU protocols on a standard set of mouse strains should be performed. ENU protocols and information should be disseminated.
$0.5M for 1 year.
BREAKOUT GROUP G:
Chromosomal Strategies, Insertions and Transgenes
1. What is needed to further the development of insertional mutagenesis?
An ideal insertional mutation system in the mouse should provide sequence-tagged insertions in the germline on a genome-wide scale. Such a system would facilitate functional genomics by providing a mutation at every locus, whose phenotype could be evaluated in the whole animal. Although efforts have been made to generate a P-element-type, transposon-based in vivo insertional mutagen for the mouse, no such system is presently available. It is therefore recommended that high priority be assigned to obtaining an ES cell-based resource that would provide a sequence tag for each locus in the mouse genome. Such a resource would make it possible to scan the sequence-tag database for any gene of interest and to order the corresponding targeted ES cell line. This resource would permit the larger community of investigators to utilize genomic resources efficiently, and would be much more cost-effective than the current effort to generate targeted knockouts in individual laboratories, at an estimated cost of approximately $50,000 per knockout. Of course, this resource would only provide one null allele per locus, and would not obviate the need for ongoing targeted mutation of loci that are studied in depth. It would, however, be of enormous value to a large community of researchers. Making such a system available is of highest priority, and would be worth an investment of $10 M per year for several years. (Nearly this much is probably going into generating a few hundred knockout mice per year in small labs.)
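The comparison between the proposed investment and current spending on individual knockouts is simple arithmetic on the two figures quoted above. The sketch below assumes both figures are direct costs in the same year's dollars; the function name is hypothetical.

```python
# Figures quoted in the text.
COST_PER_KNOCKOUT = 50_000       # approximate cost per lab-made knockout, USD
PROPOSED_INVESTMENT = 10_000_000  # proposed annual investment, USD


def knockouts_for_budget(budget: int, cost_each: int) -> int:
    """Number of individually made knockouts the same money would buy."""
    return budget // cost_each


if __name__ == "__main__":
    n = knockouts_for_budget(PROPOSED_INVESTMENT, COST_PER_KNOCKOUT)
    print(f"${PROPOSED_INVESTMENT:,} per year buys about {n} knockouts "
          f"at ${COST_PER_KNOCKOUT:,} each")
```

At these figures the proposed annual investment equals the cost of a couple of hundred individually made knockouts, which matches the parenthetical estimate of current small-lab spending.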
2. What are the prospects for developing a collection of segmental deletions across the mouse genome?
Two ES cell-based resources capable of generating deletions at marked sites throughout the genome are currently under development. The Merck Genome Initiative is supporting work in John Schimenti's lab to generate 500 ES cell lines with randomly inserted selectable TK inserts. The positions of the inserts will be determined by plasmid rescue of flanking DNA that can be mapped on a backcross panel. These ES cell lines will be described in a database and made freely available on request. Deletions from fixed insertion sites are also being generated by Allan Bradley at Baylor. Experience with these resources during the coming years will indicate whether additional resources will be needed in the future.
3. What is the value of analyzing polymorphic modifiers and QTLs, and how can their study be advanced?
Analysis of loci contributing to complex traits in the mouse was seen as an important future activity with broad biological applications. Many of the physical mapping resources discussed here will be important for QTL analysis. These include the generation of BAC libraries and maps, the development of SNPs, the generation of consomic lines of mice, the targeted ES cell library, and the development of expression arrays. All of these initiatives were strongly supported for their applications to QTL analysis.
4. What improvements in phenotyping are needed to facilitate functional genomics in the mouse?
A great deal of basic biological characterization of standard strains of mice will be required as background for assessment of phenotypic alterations in mutant mice. There is a serious shortage of expertise and manpower for sophisticated evaluation of mutant phenotypes. The following initiatives are proposed to address this problem:
a) A training program to support veterinary fellows in a 2-year fellowship in mouse pathology. With stipends in the range of $60,000 per year, it is recommended that $0.5 M per year be invested for the next 5 years.
b) Development of a Mouse Biology Database to provide access to phenotypic information. Data would be entered from the large published literature and from current large-scale programs. The goal would be a database organized on the model of the Mouse Genome Database; it was felt that a large effort would be required to initiate this new database, and initial funding at a level of $4 M per year was recommended.
c) Standardized phenotyping methods will be developed at large-scale mutagenesis screening centers. It is recommended that supplemental funds be provided to such centers to enable external investigators to access these facilities for analysis of their mutant mice. A supplement of 5 to 10% of a center budget was recommended for this purpose.
d) Development and distribution of validated arrays for studying gene expression. Gene expression arrays will provide an important new source of phenotypic information for characterizing new mutants. Significant support should be provided towards making these reagents available to the mouse genetics community.
BREAKOUT GROUP H: Mouse Resources
Move toward the broad-scale use of cryopreservation for storage and dissemination of strains: Technology development. Investigate pathogen transmission through sperm as well as the recovery of mice from frozen sperm following artificial insemination (AI), in vitro fertilization (IVF), and intracytoplasmic sperm injection (ICSI). Investigate the effects of strain variation and the effects of mutation(s) [ENU, transgenes, KOs] on recovery. Also, investigate the feasibility of ovary freezing for cryopreservation.
Cost estimate: $1M/lab for two years; two projects
Move toward the broad-scale use of cryopreservation for storage and dissemination of strains: Outreach. Enable cryopreservation technology to be widely disseminated to the general scientific community. Develop a hands-on training course for cryopreservation technology. Also, develop a video/manual describing the details.
Cost estimate: $200K/year; 3 training sessions per year, 10 students per session
Resource: cryopreserved strains
Develop a storage facility for frozen gametes. Genotype (most/each) strains via gamete analysis. Do extensive quality control on a subset of strains. Develop a cryopreservation laboratory for use by scientists who do not have access to cryopreservation technology. Finally, develop a recovery resource(s) for scientists who do not have access to the technology.
Cost estimate: $1.5M/year for 5 years
Resource: Live Strains
Develop a repository(ies) that can accommodate 250 new strains per year. Establish a committee of scientists to evaluate strains for importation. Costs for operating the facility should be paid for from the sale of mice.
Cost estimate: $3M/year for 5 years
Database: Disease Models and Phenotypes
Develop a low-pass mouse disease models and phenotypes database. Facilitate the expansion of an animal models database that includes not only cancer (already underway) but other disease states and phenotypes. Efforts to make databases interoperable are imperative (i.e. disease models, molecules, sequences). The initial phase (phase I) should merely provide a "roadmap" or index to other databases containing information about spontaneous models, induced models, etc.
Cost estimate: $300K/year for 3 years
Commercial Technology Interactions
For important enabling technologies NIH should explore the feasibility (early on) of establishing a mechanism acceptable to all parties for disseminating the technology(ies) to the general scientific community.
PRIORITY SETTING MEETING FOR MOUSE GENOMICS
AND GENETICS RESOURCES
National Institutes of Health
March 19-21, 1998
David COX, M.D., Ph.D., Co-Chair
Departments of Genetics and Pediatrics
Stanford University School of Medicine
300 Pasteur Drive, M-336
Stanford, CA 94305
TEL: (650) 725-8042
FAX: (650) 725-8058
William DOVE, Ph.D., Co-Chair
McArdle Laboratory for Cancer Research
Madison Medical School
University of Wisconsin
1400 University Avenue
Madison, WI 53706-1599
TEL: (608) 262-4977
FAX: (608) 262-2824
Mark ADAMS, Ph.D.
Department of Eukaryotic Genomics
The Institute for Genomic Research
9712 Medical Center Drive
Rockville, MD 20850
TEL: (301) 838-3507
FAX: (301) 838-0208
Kathryn ANDERSON, Ph.D.
Philip AVNER, Ph.D.
Rudi BALLING, Ph.D.
Allan BALMAIN, Ph.D.
Gregory S. BARSH, M.D., Ph.D.
Allan BRADLEY, Ph.D.
Stephen D.M. BROWN, Ph.D.
Maja BUCAN, Ph.D.
Geoff DUYK, M.D., Ph.D.
Chris GOODNOW, BVSc., Ph.D.
Monica JUSTICE, Ph.D.
Life Sciences Division
Oak Ridge National Laboratory
Y-12 Bear Creek Road, Building 9211
Oak Ridge, TN 37831-8080
TEL: (615) 574-0700
FAX: (615) 574-1274
NIH ADVISORY COMMITTEE REPRESENTATIVES
NCI Pre-Clinical Models Group
(See Address Above)
(See Address Above)
(See Address Above)
(See Address Above)
Terry VAN DYKE
(See Address Above)
(See Address Above)
NHGRI Program Planning Subcommittee
Aravinda CHAKRAVARTI, Ph.D., Chair
Co-Chair, Session B: cDNA/ESTs
Department of Genetics
Case Western Reserve University
10900 Euclid Avenue, Room BRB 721
Cleveland, OH 44106
TEL: (216) 368-5847
FAX: (216) 368-5857
Charles H. LANGLEY, Ph.D.
Section of Evolution and Ecology
Center for Population Biology
University of California, Davis
Davis, CA 95616
TEL: (916) 752-4085
FAX: (916) 752-1449
Alan R. WILLIAMSON, Ph.D.
760 Lawrence Avenue
Westfield, NJ 07090
TEL: (732) 232-7728
Barbara WOLD, Ph.D.
(See Address Above)
NICHD Advisory Council
Brigid HOGAN, Ph.D.
(See Address Above)
NIGMS Advisory Council
David A. CLAYTON, Ph.D.
Senior Scientific Officer
Howard Hughes Medical Institute
4000 Jones Bridge Road
Chevy Chase, MD 20815
TEL: (301) 215-807
FAX: (301) 215-8828
Howard Hughes Medical Institute
W. Maxwell COWAN, M.D., Ph.D.
Vice President and Chief Scientific Officer
Howard Hughes Medical Institute
4000 Jones Bridge Road
Chevy Chase, MD 20815
TEL: (301) 215-8803
FAX: (301) 215-8828
David A. CLAYTON, Ph.D.
(See Address Above)
Department of Energy
Ari PATRINOS, Ph.D.
Office of Biological and
US Department of Energy
19901 Germantown Road, ER-70
Germantown, MD 20874-1290
TEL: (301) 903-3251
FAX: (301) 903-5051
Marvin FRAZIER, Ph.D.
Health Effects and Life Sciences
US Department of Energy
19901 Germantown Road, ER-72
Germantown, MD 20874-1290
TEL: (301) 903-5468
FAX: (301) 903-8521
NIH ORGANIZING COMMITTEE
Office of the Director
Harold VARMUS, M.D.
Vida BEAVEN, Ph.D.
National Cancer Institute
Richard KLAUSNER, M.D.
Grace SHEN, Ph.D.
National Human Genome Research Institute
Francis COLLINS, M.D., Ph.D.
Bettie GRAHAM, Ph.D.
Mark GUYER, Ph.D.
Elke JORDAN, Ph.D.
Jane PETERSON, Ph.D.
National Institute of General Medical Sciences
Marvin CASSMAN, Ph.D.
Judith GREENBERG, Ph.D.
Paul WOLFE, Ph.D.
National Institute of Child Health and Human Development
Duane ALEXANDER, M.D.
A. Tyl HEWITT, Ph.D.
Steven KLEIN, Ph.D.
Richard TASCA, Ph.D.
Michael WHALIN, Ph.D.
NIH STAFF ASSISTANTS
Ms. Anita Allen
National Human Genome Research Institute
Ms. Stephanie Walker
National Human Genome Research Institute
REPORT FINALIZED JUNE 11, 1998
1 The original cost calculations for the structural analysis did not include the $6.0 million for sequencing.
2 The original cost calculations did not take into consideration $6.0 million for sequencing. On 6/22 the table was revised to make it consistent with the meeting summary.