For those who only look at the first screenful, a place to go for fairly current details on gene sequencing is: Hillis, David M. and Moritz, Craig, eds. 1990. _Molecular Systematics_ Sinauer: Sunderland, MA. The most convenient way of keeping DNA is dried. That, as I understand it, is what the military are trying to do. The idea isn't, yet at least, actually to sequence it. You don't need a sequence for unambiguous identification. The gimmick is RFLP: restriction fragment length polymorphism. You take a DNA sample (in solution) from the unknown: say skeletal remains that might be those of some MIA. You expose that to enzymes that cut DNA in specific locations depending on the DNA base-pair sequence of the strands. These enzymes are called restriction endonucleases -- hence the name of the technique. Depending entirely on the DNA sequence, the sample will get cut in a bunch of places giving a bunch of DNA scraps of various different lengths. You can get chunks of different sizes to separate out by speed of movement through a gel under an electric field. According to preference, you can then use either a stain or radioactive markers to tell where in the gel the DNA fragments are. If the pattern of fragment migration is the same between the known and unknown, you can now fit a name to the bones. But, if the patterns aren't the same, the DNA sequences the restriction enzymes looked for weren't in the same places in the two samples. That means they couldn't have come from the same person. This is a bit of an oversimplification. A lot of human DNA has its restriction sites in the same places you'd find in apes, never mind other humans. Total DNA similarity between humans and chimps is better than 90% overall. Specific zones, called hypervariable sequences, are the only ones really useful for individual ID by DNA. It also works very well for parentage analysis. So you might be able to identify an unknown sample without a previous reference from that person if you could still get samples from that individual's parents. On Mon, 13 Jun 1994, Mats Bergstrom wrote:
countries. These samples are usually frozen and saved for decades (for the purpose of comparison if the individual should fall ill; and for research if something might get interesting) at most laboratries. DNA- analysis efter thawing is no big deal with modern techniques. So if one
The point I got a chuckle out of was the notion of freezing blood samples as a routine thing. To get much use at a molecular level (either DNA or protein structure) out of frozen samples over the long term (more than weeks) you have to keep it at -70C or better. People who study DNA are utterly paranoid about freezer failure. If they leave town, they may leave the cat with an automatic feeder but they need someone to visit the freezer once or twice a day and make sure it's okay. If building power fails (not that uncommon in old university science buildings) you need a generator or a quick load of liquid nitrogen to keep your frozen treasure from being ruined. If drying works, that's what will be used. I don't know, not being in that specialty myself, how good the preservation quality of dry-stored DNA really is. I can easily imagine it being good enough for actual sequencing if it had been quickly freeze-dried and stored under nitrogen instead of air. I'm not sure of that, though, and if preservation isn't perfect sequencing could become a problem without making identification impossible. DNA is terribly sensitive to all kinds of damage, and enzymes already present in the blood or tissue will tear it up given half a chance. Re genomic analysis: yes, it's certainly true that DNA sequencing is doable at the moment on the scales the human genome would require, in the same sense that space flight was doable in the fifties. It's logical to predict that it will only get easier as automatic sequencers get better. The closest tome I happened to grab quotes the length of the human genome at about 2.9 x 10^9 base pairs. The fact that there are four possible bases (2 bits) gives you a 5.8 billion bit storage issue. Not that intractable for storage and analysis, especially given that some compression technques that wouldn't work well for most data would be applicable. James Hicks comments -
"Single Cell" polymerase chain reaction (PCR) is being done in the lab now. Theoretically all you need is one cell and you can amplify any DNA sequence from the genome that you want.
PCR makes tiny sample sizes a lot less of a problem than they used to be, but it has the same problems any extremely sensitive amplifier does. It amplifies everything. If there's the least contamination of the sample with any other DNA, the analyst is in trouble. Suppose you vaccuum a chair. You get some skin from me, some skin from N other people, umpteen dust mites and the foot of a crushed roach. Given the way the enzymes in the dead cells would have torn up the DNA, you may get nothing but if you get anything, the bugs win. Research labs have had terrible trouble with contamination - some PCR amplified "human" DNA in the big databases turns out to look suspiciously like yeast. and //mb adds -
the streets for saliva every morning at 3am and whipping the flesh of all offenders.
Saliva would give the same problem. Nobody's mouth is sterile, and my normal bacterial flora is a lot better protected against the digestive enzymes in saliva than shed cells from my mouth are. Given all that, if anyone is still awake, it's the step *after* all the sequencing that's the biggie... at least for anything beyond simple ID. You've got a sequence: what does it do? A lot of the time, nothing. Lots of animal DNA doesn't ever get used for anything obvious and seems to be along for the ride. You have to distinguish live data from red herrings. Then if you're looking for genetic predictors of disease, you can't just say that *any* change in a particular gene is a red flag -- there's a lot of function-neutral variation. You'd be denying insurance coverage to very safe risks and losing money. But when a change is *not* function-neutral, it may only take one base-pair change. Sickle-cell anemia is produced by just one "typo". What makes it even harder is that most genetic predispositions to disease probably aren't single, consistent, easy to spot changes. A lot of the ones we know about are, but only because those are the ones it's easy to find. Considering that interaction effects really aren't well studied even in pharmacology where they've been known longer (What happens when somebody mixes prozac with alcohol and marijuana? The last time I checked Medline nobody had looked.) I think it will take a long time to sort out problems that have something to do with several genes plus an environmental trigger. The problem may not be big enough to be formally called intractable, in the cryptographic sense, particularly if one makes the customary (sensible) assumptions about processing power increases, but it still looks big enough to be interesting. Sequencing is necessary for some of the 1984ish outcomes predicted, but not sufficient. Conversely you can do a lot of unpleasant discriminatory things to people on the insurance front without knowing their DNA sequence -- Down's Syndrome is extremely obvious and a clear indicator of a bunch of expensive problems not to mention an early death. It looks to me like the issue is worth keeping an eye on, but contagious diseases in the waiting room are still a better justification for avoiding the medical profession than a DNA registry is. regards... -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- Dept. of Biology Jennifer Mansfield-Jones University of Michigan cardtris@umich.edu