Community Briefing on Mouse Genome Sequencing
On February 1, 1999, the staff of the National Human Genome
Research Institute (NHGRI) held a briefing for prospective applicants
who were interested in responding to RFA HG-99-001, Mouse Genome
Sequencing Network. This report summarizes the issues discussed
at that briefing.
NHGRI Briefing on RFA HG-99-001
Network for Large-Scale Sequencing of the Mouse Genome
On February 1, 1999, the National Human Genome Research Institute
(NHGRI) held a briefing on the initiation by NIH of a program to
sequence the mouse genome (this briefing had been announced in RFA
HG-99-001). Dr. Francis Collins, Director of NHGRI, opened the briefing
with some background information; Dr. Jane Peterson reviewed the
RFA; and Dr. Bettie Graham discussed a plan that NHGRI is developing
to prioritize biologically interesting regions of the mouse genome
for sequencing. The floor was then opened for questions and discussion.
The following issues were discussed:
- The production of finished sequence was not adequately discussed
in the RFA. NHGRI will publish an update shortly to clarify two points:
- Sequencers will be responsible for eventually finishing any
clone they start.
- The long-term objective of this program is to generate a
highly accurate, complete, finished sequence of mouse DNA.
The purpose of the cooperative agreements that will be funded
will be to support the establishment of sequencing facilities
capable of making a significant contribution to this objective.
During the third year of the project, there will be a stringent
competitive review that will evaluate whether the group has
established a process that will be sufficiently robust to
be scaled for completing the sequence of the entire mouse
genome; toward the end of year 2, there will be a review that
will assess the group's progress. Therefore, the applicant
must propose a plan that will demonstrate the center's capability
to finish data in an efficient and cost-effective way. In
the first year of support, a center should finish at least
two BAC clones (the minimum number required for quality assessment).
- In addition to producing a finished sequence of
the mouse genome by 2005, a nearer-term goal of the program is to produce
enough sequence data by 2003 to generate complete coverage of
the genome in sequence of "working draft" quality. Working
draft sequence is a concept that has been discussed widely in
the genomics community in the past several months; however, NHGRI
recognizes that the definition of working draft has never been
crisp and is still confusing to some investigators. The NHGRI
Five-Year Plan defines working draft as "a product that
covers most of the region of interest but may still contain gaps
and ambiguities." Thus, working draft is, at a minimum, that
level of shotgun coverage of the genome at which contigs begin
to coalesce. The amount of shotgun data needed to achieve this
benchmark will differ depending upon the particular sequencing
strategy chosen. Furthermore, different strategies may call for
pausing between working draft coverage and finished sequence
at different levels of shotgun coverage. Therefore, NHGRI does
not specify a numerical shotgun depth that should be proposed
for a center's working draft. Rather, the application should address
both goals of the RFA (coverage in working draft by 2003 and finishing
the genome by 2005), taking into account the issues involved in
later finishing a clone that was initially sequenced to low redundancy.
- The C57BL/6J BAC library mentioned in the RFA (which
is derived from a female) is available from Dr. Pieter de Jong
at Roswell Park Cancer Institute (http://bacpac.med.buffalo.edu/).
It is currently an 11-fold genomic equivalent library, and an additional
library is being constructed from a male of the same strain that
will add a further 10-fold worth of clones, bringing the
overall depth of available clone coverage to about 20 genomic
equivalents. If additional libraries are needed (for example, a
male library), NHGRI will fund the characterization of that library
through a competitive supplemental mechanism. Requests for funds
to generate and/or characterize additional libraries beyond the
one currently available should not be included in applications
submitted in response to RFA HG-99-001.
- The RFA defines an existing sequencing center to
be one that has deposited 2 Mb or more of sequence in a public
database. This amount of sequence need not be in a single contig.
If the group's sequence product is not genomic sequence, the investigator
is urged to speak with NHGRI staff to discuss whether to apply
as an existing center or as a new center. NHGRI did reaffirm its
strong intention to use the funds available for supporting new
sequencing groups as well as expanding existing efforts.
- The January 8, 1999 meeting at Princeton was not
organized by NHGRI, but by members of the mouse community. The
report of the meeting should, therefore, not be considered to
be an NHGRI policy document or a guideline for this RFA. Rather,
it is a summary of the most recent discussion of genomic sequencing
issues by the mouse community. For example, there are several
specific recommendations contained in the report that, while perhaps
reasonable, do not define the scope for this RFA.
- NHGRI recognizes that during the initial year of
funding, the fingerprint and BAC end sequence data will not be
available to assist in choosing clones for sequencing. During
this period, sequencers pursuing the "sequence first, map second"
strategy will have to choose clones at random, not knowing whether
they overlap with other clones being sequenced.
- The previous NHGRI policy of allowing human DNA
large-scale sequencing labs to devote up to 10% of their effort
to sequencing mouse DNA was articulated prior to this initiative.
Now that there is a specific program for mouse genomic sequencing,
all NHGRI support for sequencing the mouse genome will be provided
through it and its peer review process.
- Dr. Greg Schuler of the National Center for Biotechnology
Information described plans for constructing the central server
that will be used for mouse genomic sequencing. Specific recommendations
as to data items that will be needed on the server may be found
in the report from the Princeton meeting. Dr. Schuler said that
he expects the central server to be available by the end of the