Meeting of the Malaria Genome Consortium
For subsequent meeting reports please refer to MFI's Malaria Genome Database page at www.malaria.org/genome.html
Chantilly, Virginia, January 29, 1999
Martha Peck reminded the group that this project is now 32 months old, with sequencing having started a year after the project began. In a very limited time, the project has reached a good critical mass, with healthy competition and good collegiality among the groups involved. BWF has invested approximately $5.6 million to date. Since the sequencing component is making steady progress, the current challenge for the community and funders is to consider "functional genomics" and what is needed to foster development of the next stage of the project. NIAID currently has more than two dozen genome projects in the works and is reviewing the kinds of support it provides, including what resources the community needs in order to do functional genomics. The malaria genome project itself is looking toward the future, and has funded three workshops in the last few months to begin looking at "next steps" for making the genome useful to the malaria community.
The Department of Defense is now in the second year of a five-year funding cycle. It is spending about $1M per year for sequencing at TIGR (with NMRC) and is also investing a smaller amount in projects related to sequencing at NMRC. The Wellcome Trust has committed £7M to the project, including investments in some equipment at Sanger that is also being used for other genome projects. The Trust is organizing a small workshop on malaria postgenomics for the European and Australian malaria research communities on February 12, 1999.
Stanford has completed the shotgun sequencing of chromosome 12 and has established YAC bins. Their strategy is to do low-coverage sequencing of YACs in the chromosome's tiling path. Sequencing of an M13 phage library was done to greater than 9x coverage, and a pUC plasmid library has been sequenced to 2x coverage, so there is about 11x coverage altogether now. They expect that the true coverage is a bit less, maybe 9x, since 20% contamination from other chromosomes is expected. 69% of the sequence from the M13 library has been put in bins and 57% has been assembled into "good" contigs. Good contigs are contigs that have at least two traces from YACs and at least two from chromosome 12. They have recently done a big assembly of all the chromosome 12 traces and have come up with 728 contigs of at least 1500 bases, for a total estimated size of 3.6 Mb. A more rigorous assembly, using only contigs with at least two YAC traces to eliminate some contamination, yields 185 contigs and a size of 2.1 Mb. Stanford expects that there will be plenty of gap closure and finishing still to be done, but that chromosome 12 will be put out for community use by this summer.
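The coverage estimate above can be reproduced with a short calculation. The following is only an illustrative sketch: the function name is hypothetical, the M13 figure is taken at its stated lower bound of 9x, and contamination is assumed to reduce coverage in simple proportion, which is a simplification rather than Stanford's stated method.

```python
def effective_coverage(library_coverages, contamination_fraction):
    """Return (nominal, adjusted) fold coverage.

    Assumption: contaminating reads from other chromosomes are spread
    evenly, so they reduce coverage in simple proportion.
    """
    if not 0.0 <= contamination_fraction < 1.0:
        raise ValueError("contamination_fraction must be in [0, 1)")
    nominal = sum(library_coverages)
    adjusted = nominal * (1.0 - contamination_fraction)
    return nominal, adjusted


# M13 library (at least 9x) plus pUC library (2x), 20% contamination.
nominal, adjusted = effective_coverage([9.0, 2.0], 0.20)
print(f"nominal {nominal:.1f}x, adjusted {adjusted:.1f}x")
```

Under these assumptions the adjusted figure comes out just under 9x, in line with the estimate quoted at the meeting.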
Stanford is using GlimmerM, TIGR's malaria-trained annotation program, and ADAPT, an annotation program developed at Stanford for the Arabidopsis genome, to make a first pass at annotation. Their plan is to put the sequence and annotation on their web site, and to invite the community to add their own annotation to the emerging annotated map.
Sanger's chromosome 3 still has 2 gaps. The largest is 8.5 kb. They are trying to get sequenceable material that fills the gap and have recently made a library from a partial restriction digest that they hope will do the trick. On chromosome 4, Sanger now has 123 contigs of over 1 kilobase, not including the telomeres. There are still gaps that need to be closed, but finishers are now getting geared up to start PCRing across these. The YAC map for chromosome 1 has several gaps and the shotgun sequencing is still in progress. Efforts are now underway to build up the chromosome 1 contigs before going to combinatorial PCR. Sanger is also tackling "the blob", chromosomes 5, 6, 7, and 8. Jenny Thompson has recently produced a good YAC tiling path for chromosome 5. The libraries for 6, 7 and 8 are now being tested. Chromosomes 7 and 8 are still not being well separated by pulsed-field gel electrophoresis. Work on chromosome 9 has begun. D. Holt has provided a YAC tiling path, but there is not yet a good chromosomal library. Holt has also provided a set of YACs from chromosome 13. The chromosome 13 YACs haven't yet been put together well enough to provide a good tiling path, so work continues toward getting this chromosome more fully underway.
Data release from Stanford includes FASTA files available by FTP, BLAST-searchable data on the web, and submission of sequences to the public databases.
Leda Cummings has started work on sequencing chromosomes 11 and 10. A library for 11 is being made with the well-separated chromosomes provided by Dan Carucci. Chromosome 10 is still not separating well from chromosome 9. Chromosomes 10 and 11 will not be built on YACs; instead, David Schwartz's optical maps will be used to authenticate the DNA assembly.
Chromosome 2 is complete and has been published in Science (Science 1998 November 6; 282: 1126-1132). Malcolm Gardner is currently focusing on chromosome 14. The random sequencing phase has been completed, and an assembly done in December 1998 has yielded 1750 contigs of up to 99 kb in length. The average contig length is 7.4 kb. There were 394 sequence gaps and 62 physical gaps in this first assembly, but it is expected that 40-50% of the gaps will disappear after editing for unreliable sequence ends and other problems. There is a new BLAST server for chromosome 14 that will search for queried sequences and return the requested bases as well as 1 kilobase on either side.
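The "requested bases plus 1 kilobase on either side" behavior of the chromosome 14 server can be illustrated with a small sketch. This is not TIGR's implementation: the function name, the 0-based half-open coordinates, and the clamping at contig ends are all assumptions made for the example, and the sequence is a toy string rather than real data.

```python
def window_with_flanks(sequence_length, hit_start, hit_end, flank=1000):
    """Return a 0-based, half-open (start, end) window covering a hit
    plus `flank` bases on each side, clamped to the sequence bounds.
    """
    if not 0 <= hit_start < hit_end <= sequence_length:
        raise ValueError("hit coordinates must lie within the sequence")
    start = max(0, hit_start - flank)
    end = min(sequence_length, hit_end + flank)
    return start, end


# Toy 4000-base contig standing in for real sequence data.
contig = "ACGT" * 1000

# A 300-base hit in the middle: the full 1 kb is available on both sides.
start, end = window_with_flanks(len(contig), 1500, 1800)
region = contig[start:end]
print(start, end, len(region))  # 500 2800 2300

# A hit near the left end: the left flank is truncated at position 0.
print(window_with_flanks(len(contig), 200, 500))  # (0, 1500)
```

Clamping is one reasonable choice for hits near a contig end; a real server might instead report the shortfall or pad the output.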
Chromosome Expected progress by July 1999/Expected finish
Three meetings focusing on expression technologies, database development, and development of genetic tools were held in late 1998 and early 1999 as an initial step toward considering how the malaria community can best make use of the genome data. Steve Hoffman will plan a similar meeting on vaccine development in 1999. Summaries of this series of meetings have been distributed on the malaria genome mailing list.
Data use/data release
The data release policy was discussed again. The community requested, and TIGR's President/Director Claire Fraser agreed, that in the future no TIGR licensing agreement will be attached to the Plasmodium data generated there. (Licensing agreements are currently in place at other TIGR pathogen sequencing sites.) It was decided that the current policy for data release would remain in effect, and that a greater effort should be made to communicate to the malaria community the benefits of collaboration with the sequencing centers and the need for acknowledgement of the sequencers' efforts.
The next meeting of the genome consortium was held July 21-23, 1999 at Hinxton Hall in the U.K., hosted by the Wellcome Trust.