Report on the Fifth Malaria Genome Sequencing Meeting,
June 30th-July 1st 1998, Hinxton, Cambridge, UK

For subsequent meeting reports please refer to MFI's Malaria Genome Database page at

Background and Aims of meeting

The Plasmodium falciparum genome project is a coordinated effort by funding agencies, sequencing centres and malariologists to achieve the complete sequencing of the P. falciparum genome (clone 3D7) and to promote its use in developing new strategies to control malaria, including diagnostics, vaccines, and drugs.

The Hinxton workshop-style meeting aimed to






Table: status of chromosome sequencing projects, JULY 1998 (chromosome and centre labels missing). Status entries, in order: shotgun complete; in closure; gap closing/finished; finished (published); gap closing; shotgun complete; in closure; in shotgun; libraries under construction; in shotgun; libraries under construction; in shotgun; shotgun completed; shotgun started; shotgun completed; in closure.


SESSION I Updates from funders (see table for allocation of chromosomes)

Michael Gottlieb (NIAID) reported that the NIH was expanding its genomic efforts generally, including malaria, with a focus on postgenomics and functional genomics. NIH funds sequencing of P. falciparum chromosomes 2, 10 and 11 at TIGR, and supports John Dame in obtaining ESTs and GSTs from P. vivax and P. berghei.

Martha Peck (Burroughs Wellcome Fund) reported the Fund's increasing emphasis on use of the genomic information. The BWF funds the sequencing of chromosome 12 (via Stanford) and of chromosome 14 (via TIGR). In addition, the BWF funds optical mapping of the complete P. falciparum genome.

Dan Carucci (NMRI) reported that US Dept of Defense funds allocated to Plasmodium vivax will be refocussed into the P. falciparum effort, including bioinformatics. The Dept of Defense has allocated funds for the sequencing efforts at TIGR, alongside support from NIAID and the Burroughs Wellcome Fund.

Cathy Fletcher (Wellcome Trust) reported that the Trust funds sequencing of up to half the P. falciparum genome (chromosomes 1, 3, 4, 5-9 and 13) and currently coordinates the Multilateral Initiative on Malaria, which gives high priority to ensuring that knowledge from sequencing the Plasmodium genome is applied to the discovery of new drugs and vaccines. The Trust recently established Beowulf Genomics, which is funding the sequencing of various pathogen genomes and pilot projects on Trypanosoma brucei and Leishmania major.

Rob Ridley (WHO) reported that although WHO does not fund the P. falciparum genome project, its TDR programme is interested in promoting the use of genomic information for strategic research for drug and vaccine development. WHO-TDR does provide catalytic funding for sequencing of other parasites relevant to its mission, and promotes links between agencies.

Updates from chromosome-specific projects

1. TIGR/NIMR - Chromosome 2

Malcolm Gardner said that since early 1998 TIGR had been collaborating extensively with NCBI on annotation of chromosome 2 data.

Herve Tettelin described the progress on chromosome 2. Efforts since the Orlando meeting had been directed at closing the A-T rich sequence gaps. The sequence is now completely annotated and edited, and a manuscript was submitted to Science four days before the meeting. Two small sequence gaps remain in the chromosome, one 10 kb from each end.

The 945 kb chromosome contains a total of 209 genes, of which 36 percent are genes for the cell envelope and 27 percent are for hypothetical proteins. There was a 4.3 percent difference between the fragments expected from the sequence and the optical map constructed by David Schwartz.

Note: a need for more full-length ESTs to validate gene predictions was identified.

Herve Tettelin presented his analysis of the predicted ORFs on chromosome 2 and compared the distribution of some gene families (e.g. those for secreted proteins and integral membrane proteins) with those on yeast chromosome 3. He described some prominent features of the chromosome 2 genes. The success of the chromosome 2 effort was attributed to improved methodology for high-throughput sequencing, development of optimal chemistry, modifications to the TIGR assembler software, use of optical mapping, and development of the Glimmer M software (a program for gene prediction).

Eugene Koonin (NCBI) presented his analysis which suggests that many of the predicted proteins encoded by chromosome 2 contain large nonglobular domains, to a much greater degree than in yeast. He speculated they may have accumulated due to positive selection by the immune system.

2. Sanger - Chromosome 3

Sharen Bowman reported that gap closure on chromosome 3 was still underway. Forty-two small gaps remain, typically 1-2 kb. Three gaps are covered by pUC clones and 33 by combinatorial PCR, leaving 6 unfilled gaps. The total contig size was 961 kb.

Transposon libraries have been the most successful method for filling problem gaps. The Sanger team is confident the remaining gaps can soon be filled.

Chromosome 4 - Shotgun phase is now finished and gap-filling is at an early stage, with about 200 contigs.

Chromosome 1 - Shotgun now finished; 7 YAC clones received. The Newbold lab is generating new YAC clones.

Chromosomes 5, 6, 9 - Hopefully these will separate out from the blob; chromosomes 7 and 8 don’t separate.

Chromosome 13 - Shotgun started; sequence available for 7 YAC clones.

Dan Lawson reported that on chromosome 3 the predicted total gene number is 196, a gene density of one per 5 kb. He estimated the gene complement for the whole P. falciparum genome at 5,500 to 6,000, and emphasized that completely different genes can be located very close together. The largest number of introns predicted in a single gene was about 7, although most genes appear to have 1 or 2. He considered there were 4 families of subtelomeric genes and estimated that about 40-50 var genes are present in the whole genome.
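The gene-density figures quoted above can be checked with simple arithmetic. The sketch below uses only numbers given in this report (945 kb and 209 genes for chromosome 2; the 961 kb contig total and 196 genes for chromosome 3). The report does not say how the whole-genome estimate was derived, so the uniform one-gene-per-5-kb spacing used in `implied_genome_size_mb` is an assumption, and both function names are hypothetical.

```python
# Figures quoted in this report: size in kb and predicted gene count.
CHROMOSOMES = {
    "chromosome 2": {"size_kb": 945, "genes": 209},
    "chromosome 3": {"size_kb": 961, "genes": 196},
}


def kb_per_gene(size_kb, genes):
    """Average spacing between predicted genes, in kb per gene."""
    return size_kb / genes


def implied_genome_size_mb(total_genes, spacing_kb):
    """Genome size in Mb implied by a gene count and a uniform gene spacing.

    Uniform spacing across the genome is an assumption made for illustration.
    """
    return total_genes * spacing_kb / 1000


density = {name: kb_per_gene(**c) for name, c in CHROMOSOMES.items()}

for name, value in density.items():
    print(f"{name}: one gene per {value:.1f} kb")

# Genome size range implied by 5,500-6,000 genes at one gene per 5 kb.
low, high = (implied_genome_size_mb(n, 5) for n in (5500, 6000))
print(f"implied genome size: {low:.1f}-{high:.1f} Mb")
```

On these figures chromosome 3 comes out close to the quoted one gene per 5 kb, and chromosome 2 is slightly denser.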

Dan described development of the Sanger Centre website and the MALPEP database of predicted malarial proteins. The P.falcip Gene search and P.falcip RegEx server are being developed but are not yet online. He identified future tasks as:


Note: Glimmer M is available from TIGR for gene prediction.

3. Stanford - Chromosome 12

Richard Hyman summarized progress, which was detailed in an accompanying handout. Stanford will release data from the Bins soon. It would be useful if all speakers in future provided handouts BEFORE OR AT THE MEETING.

Richard proposed an experiment in annotation at Stanford in which anyone would be invited to contribute annotation, with himself as editor. The name and affiliation of the annotator would be submitted to GenBank.

SESSION II Updates on technology and mapping projects

Dyann Wirth (Harvard) described progress at Harvard in identifying E. coli strains defective in DNA repair that may be suitable for stable cloning of P. falciparum DNA. By screening about 200 strains they have identified the SRB strain as suitable. This is commercially available from Stratagene and is very similar to the SURE strain, which is claimed to differ only in lack of the recB gene. Its transfection frequency is quite low.

David Kemp (Darwin) reported that his lab has mapped about three-quarters of the total genome. They have located a gene coding for cytoadherence at the right-hand end of chromosome 9, and designated it Clag (Cytoadherence linked asexual gene). Clag is expressed in blood stages, probably as a membrane protein. Transfection of 3D7 P. falciparum with an antisense construct gave increased cytoadherence to melanoma cells. The sequence of Clag has homologues on chromosomes 2, 3 and 4. Kemp hypothesizes that the Clag gene family is essential for cytoadherence, and possibly represents a drug target.

Note: David Kemp will send sequence of Clag to TIGR and Sanger to identify

David Schwartz (NY University) described the optical mapping technique in which fluorescence intensity of DNA molecules cut with restriction enzymes is related to base composition. The optical map of P. falciparum chromosome 2 has given quite a good match with the sequence, and helped reduce finishing time. A whole-genome map of P. falciparum is currently under construction. Using BamHI, twelve whole-chromosome contigs out of 14 have been assembled; using NheI, thirteen whole-chromosome contigs are currently assembled.

Note: Dyann Wirth suggested possibly comparing maps of different P. falciparum isolates.
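Comparing a finished sequence with an optical map comes down to comparing two ordered lists of restriction fragment sizes. The sketch below is a minimal illustration rather than the method used by TIGR or the Schwartz lab: it digests a sequence in silico at the BamHI (G^GATCC) and NheI (G^CTAGC) sites and computes a mean relative size difference between paired fragments. The function names, the toy sequence and the difference metric are all assumptions; the report does not say how its 4.3 percent figure for chromosome 2 was calculated.

```python
import re

# Recognition site and top-strand cut offset within the site.
# Both sites are palindromic, so scanning one strand is sufficient.
ENZYMES = {
    "BamHI": ("GGATCC", 1),  # G^GATCC
    "NheI": ("GCTAGC", 1),   # G^CTAGC
}


def digest(sequence, enzyme):
    """Return ordered fragment lengths (bp) from a complete linear digest."""
    site, offset = ENZYMES[enzyme]
    seq = sequence.upper()
    # Lookahead so that overlapping site occurrences are not skipped.
    cuts = [m.start() + offset for m in re.finditer(f"(?={site})", seq)]
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


def percent_difference(predicted, measured):
    """Mean absolute relative difference (percent) between paired fragments.

    Hypothetical metric chosen for illustration only.
    """
    if len(predicted) != len(measured):
        raise ValueError("fragment lists must be the same length")
    total = sum(abs(p - m) / p for p, m in zip(predicted, measured))
    return 100.0 * total / len(predicted)


# Toy 24 bp sequence with two BamHI sites (not real P. falciparum sequence).
toy = "AAAAGGATCCTTTTTTGGATCCAA"
fragments = digest(toy, "BamHI")
print(fragments)
```

In practice the measured list would come from the optical map, and fragments would first have to be aligned, since missed or false cuts change the number of fragments.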

Michael Ferdig (NIH) described the construction of a genetic linkage map of P. falciparum using an HB3 x Dd2 cross. Genetic linkage analysis can identify genes involved in transmission, drug resistance, pathogenesis, immunity, development, etc. He emphasized that the genetic map benefits from the genome project but is an entirely separate route to gene identification. Currently about 600 markers are available.

Note: More crosses and markers are needed

Wednesday 1 July SESSION III Lessons from other genome projects

Steve Oliver (UMIST) described the EUROFAN Saccharomyces cerevisiae sequencing project funded by the European Commission, and pointed out similarities to P. falciparum in terms of real or apparent high redundancy. In phase I of EUROFAN (4 years) about 200 labs were involved. Phase II of the programme involves an international consortium which has divided up the yeast genome between labs worldwide to undertake a systematic functional analysis. The first step is making tagged deletions; the mutants will then be characterised phenotypically.

Application of Microarrays

Michael Campbell (Stanford) described the microarray technology used with yeast to investigate the differential expression of individual genes during the cell cycle.

John Quackenbush (TIGR) described the use of microarray technology in analysis of tumor-specific gene expression.

Dan Carucci described some problems specific to Plasmodium in applying microarray technology. These included the complex life cycle and the difficulty of obtaining sufficient parasite material for some stages (e.g. sporozoites). He described a pilot project at NMRI/TIGR using highly synchronized blood-stage P. falciparum. Cy-3 labelled schizont cDNA was hybridized to a 1536-element array from chromosome 2. Questions facing the use of microarray technology for Plasmodium include sensitivity, which stages to use, and the best method for producing DNA slides.

SESSION V Discussion on making the data useful to the malaria community Lead Discussants: Chris Newbold/Dan Carucci/Alan Fairlamb

The following points came out of the discussion:



  1. A single site should be established to include
    BLAST data
    EST/GST data
    shotgun data
    known genes with databank
  2. Site should be mirrored for others, with email servers
  3. Site should have links to other Apicomplexan databases. Develop searching/comparative methods
  4. Long-term curatorship is needed. Initially one person is needed, expert in bioinformatics. Possibly more people needed later.
  5. Training in data use is needed; an online tutorial could be designed
  6. Nomenclature of genes needs to be standardized

SESSION IV Database Development

Eugene Koonin described experience with other microbial genomes, including data on protein fold recognition patterns, and development of systems that allow functional characterization.

Martin Aslett described his work at EBI curating the Brugia malayi, Schistosome, and Trypanosoma cruzi genome projects. He identified needs to maintain analysis once the sequencing centres had finished, to liaise with major bioinformatics centres, and to put a dedicated person in place early on in the project.

Victoria McGovern described plans to hold a computer tutorial in accessing the malaria genome data at the ASTMH meeting in Puerto Rico on 18-22 October 1998.

Note: Names of people able to help out with the demonstration (e.g. postdocs, graduate students) should be forwarded to Victoria ASAP.


Many aspects had been covered in the previous two sessions, but it was pointed out that it would be useful to be able to track publications relevant to the malaria genome in NCBI tools such as Medline. Journal editors should require authors to quote accession numbers.


Where do we go from here?

Using genome information for development of malaria vaccines

Dan Carucci (in lieu of Steve Hoffman) described NMRI’s approach to using the genome data to develop a set of DNA vaccines to elicit protective immunity in humans. Funding of a pilot project is currently being arranged.

Malcolm Gardner briefly made the case for doing a second-generation shotgun on P. vivax, assuming that cost reductions and higher throughput may become available as a result of the new generation of sequencing equipment.

Chromosome-specific goals for the next meeting: See table at beginning of report


Date of next meeting

The proposal to change the format for future meetings of the consortium was discussed. Funders proposed that the next meeting should be smaller, for sequencers and funders, to be held at Chantilly near Washington DC on 29 January 1999. A larger meeting of all people involved in the consortium and related research could be held in the UK in summer 1999. In addition, a series of 4 workshops was proposed, focusing on four aspects of the malaria genome project:

      1. database development
      2. microarray technology
      3. development of molecular tools
      4. use of genomic information for drug discovery and vaccines

The meeting reconvened after lunch to discuss the workshop proposal and it was generally felt that workshops or focus groups on the first three topics would be useful. A workshop on drug discovery and vaccine development was felt to be premature.

Note: Funders will draft letters of invitation to individuals identified by them (in a follow-up conference call) as likely members of the focus groups. BWF to initiate arrangements for the next meeting, to be held at Hilton Head. Wellcome Trust to book Hinxton Conference Centre for July 1999.