Harbor, Venter met Collins, between flights, at the Red Carpet Lounge at Dulles Airport. Celera was about to launch an unprecedented push to sequence the human genome using shotgun sequencing, Venter announced matter-of-factly. It had bought two hundred of the most sophisticated sequencing machines and was prepared to run them into the ground to finish the sequence in record time. Venter agreed to make much of the information available as a public resource, but with a menacing clause: Celera would seek to patent the three hundred most important genes that might act as targets for drugs for diseases such as breast cancer, schizophrenia, and diabetes. He laid out an ambitious timeline. Celera hoped to have the whole human genome assembled by 2001, beating the projected deadline for the publicly funded Human Genome Project by four years. He got up abruptly and caught the next flight to California.
Stung into action, Collins and Lander rapidly reorganized the public effort. They threw open the sluices of federal funding, sending $60 million in sequencing grants to seven American centers. Maynard Olson, a yeast geneticist from Berkeley, and Robert Waterston, a former worm biologist and now a gene-sequencing expert from Washington University, provided key strategic advice. Losing the genome to a private company would be a monumental embarrassment for the Genome Project. As knowledge of the looming public-private rivalry spread, newspapers were awash with speculation. On May 12, 1998, the Washington Post announced, “Private Firm Aims to Beat Government to Gene Map.”
In December 1998, the Worm Genome Project scored a decisive victory. From the gene-sequencing facility in Hinxton, near Cambridge, England, John Sulston brought news that the worm (C. elegans) genome had been completely sequenced using the clone-by-clone approach favored by proponents of the Human Genome Project.
If the Haemophilus genome had nearly brought geneticists to their knees with amazement and wonder in 1995, then the worm genome—the first complete sequence of a multicellular organism—demanded a full-fledged genuflection. Worms are vastly more complex than Haemophilus—and vastly more similar to humans. They have mouths, guts, muscles, a nervous system—and even a rudimentary brain. They touch; they feel; they move. They turn their heads away from noxious stimuli. They socialize. Perhaps they register something akin to worm anxiety when their food runs out. Perhaps they feel a fleeting pulse of joy when they mate.
C. elegans was found to have 18,891 genes. Thirty-six percent of the encoded proteins were similar to proteins found in humans. The rest (about 10,000) had no known similarity to human genes; these 10,000 genes were either unique to worms, or, much more likely, a potent reminder of how little humans knew of human genes (many of these genes would, indeed, later be found to have human counterparts). Notably, only 10 percent of the genes were similar to genes found in bacteria. Ninety percent of the nematode genome was dedicated to the unique complexities of organism building, demonstrating, yet again, the fierce starburst of evolutionary innovation that had forged multicellular creatures out of single-celled ancestors hundreds of millions of years ago.
As was the case with human genes, a single worm gene could have multiple functions. A gene called ceh-13, for instance, organizes the location of cells in the developing nervous system, allows the cells to migrate to the anterior parts of the worm’s anatomy, and ensures that the vulva of the worm is appropriately created. And conversely, a single “function” might be specified by multiple genes: the creation of a mouth in worms requires the coordinated function of multiple genes.
The discovery of ten thousand new proteins, with more than ten thousand new functions, would have amply justified the novelty of the project. Yet the most surprising feature of the worm genome was not its protein-encoding genes but the number of genes that made RNA messages and no protein. These genes, called “noncoding” because they do not encode proteins, were scattered through the genome, but they clustered on certain chromosomes. There were hundreds of them, perhaps thousands. Some noncoding genes were of known function: the ribosome, the giant intracellular machine that makes proteins, contains specialized RNA molecules that assist in the manufacture of proteins. Other noncoding genes were eventually found to encode small RNAs, called micro-RNAs, which regulate genes with incredible specificity. But many of these genes were mysterious and ill defined. They were not dark matter, but shadow matter, of the genome: visible to geneticists, yet unknown in function or significance.
What is a gene, then? When Mendel discovered the “gene” in 1865, he knew it only as an abstract phenomenon: a discrete determinant, transmitted intact across