Re: Bandwidth limitations, DNA binary coding
-----BEGIN PGP SIGNED MESSAGE-----

Matthew J Ghio writes:
Um, minor correction: There are four base pair combinations, and each can be represented by two bits.
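For what it's worth, the two-bits-per-base point can be sketched in a few lines of Python. This is only an illustration: the particular bit pattern given to each base and the names `pack_bases` and `unpack_bases` are arbitrary choices made here, not anything from a standard.

```python
# Hypothetical 2-bit code: which bit pattern goes with which base is an
# arbitrary assumption made for this sketch.
BASE_TO_BITS = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
BITS_TO_BASE = {bits: base for base, bits in BASE_TO_BITS.items()}


def pack_bases(seq: str) -> bytes:
    """Pack a string of A/C/G/T into bytes, four bases per byte."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        # The first base of each group of four goes in the top two bits.
        for j, base in enumerate(seq[i:i + 4]):
            byte |= BASE_TO_BITS[base] << (6 - 2 * j)
        out.append(byte)
    return bytes(out)


def unpack_bases(data: bytes, length: int) -> str:
    """Recover the first `length` bases from packed bytes."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BITS_TO_BASE[(byte >> shift) & 0b11])
    # The last byte may hold padding, so trim to the known length.
    return "".join(bases[:length])
```

Seven bases fit in two bytes this way, and the sequence length has to be carried separately because the last byte may be only partly used.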
There are four base pair combinations, but HUGO (Human Genome Organization) has elected to use 15 letter symbols in its representation of the genome coding sequence (X is any base, for instance). 15 symbols, 1 byte.
Frankly, if I had the kind of technology to easily sequence my entire genome, I doubt I'd be content to just look at it. I'd probably be saying, "Hmm.. I don't like that gene, it might give me heart disease, I'll just use a modified retrovirus to substitute a better one..." :)
Lee Hood is working on the technology for sequencing large amounts of DNA code (PCR, for which Kary Mullis just won a Nobel Prize, will help). Granted, it's 15 years away at least, but just wait. Also, have you been reading French Anderson in the New York Times? As one of the people who helped design the "modified retrovirus" of which you speak (Retroviral Expression Vector, N2), I can tell you that they work great in cells that live in a dish and lousy in whole organisms. Don't trust your heart to them. We can, however, engineer you in other ways.

Scott G. Morham             ! The First,
Vaccinia@uncvx1.oit.unc.edu ! Second
PGP Public Key by Request   ! and Third Levels
                            ! of Information Storage and Retrieval
                            ! DNA,
                            ! Biological Neural Nets,
                            ! Cyberspace

-----BEGIN PGP SIGNATURE-----
Version: 2.3a

iQCVAgUBLOMCRz2paOMjHHAhAQG62wQAg2fHDYdhJADiz5KdEMTDCLg74IZ9onBQ
TCrQcuFdiWBlB+Wt970a8zmur8Js5NdskpKYMiDCz6BKqEP1t17ZWPCL1lliTsPF
gtikx9dTsCRiWbWKUzPPfiEDXDGO/GovuLVbC98dOyJrTVBjrBHsJtuXL21S/R+n
74C/S2k4o74=
=jI3W
-----END PGP SIGNATURE-----
VACCINIA@uncvx1.oit.unc.edu says:
There are four base pair combinations, but HUGO (Human Genome Organization) has elected to use 15 letter symbols in its representation of the genome coding sequence (X is any base, for instance). 15 symbols, 1 byte.
15 symbols, HALF a byte (actually a touch less). One nybble can express 16 possible symbols (or one hex digit, or whatever). Plus, of course, the genome is highly compressible -- lots of repeated sequences, especially in introns.

Perry
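A rough Python sketch of the nybble arithmetic, for anyone who wants to check it. The 15-letter alphabet below is an assumption: it is the usual IUPAC-style set of nucleotide codes with N for "any base" (the post above mentions X instead), and the names `pack_nibbles` and `unpack_nibbles` are made up for the example. The one spare 4-bit value is used as padding when the sequence has an odd length.

```python
import math

# Assumed 15-symbol alphabet (IUPAC-style ambiguity codes, N = any base).
# The order, and therefore each symbol's 4-bit code, is an arbitrary choice.
SYMBOLS = "ACGTRYSWKMBDHVN"
CODE = {symbol: index for index, symbol in enumerate(SYMBOLS)}
PAD = 0xF  # the 16th nybble value, unused by the 15 symbols


def bits_needed(symbol_count: int) -> float:
    """Information-theoretic bits per symbol for a uniform alphabet."""
    return math.log2(symbol_count)


def pack_nibbles(seq: str) -> bytes:
    """Pack two symbols per byte, high nybble first."""
    out = bytearray()
    for i in range(0, len(seq), 2):
        high = CODE[seq[i]]
        low = CODE[seq[i + 1]] if i + 1 < len(seq) else PAD
        out.append((high << 4) | low)
    return bytes(out)


def unpack_nibbles(data: bytes) -> str:
    """Invert pack_nibbles, skipping the padding nybble."""
    symbols = []
    for byte in data:
        for nibble in (byte >> 4, byte & 0xF):
            if nibble != PAD:
                symbols.append(SYMBOLS[nibble])
    return "".join(symbols)
```

`bits_needed(15)` comes out just under 4, which is the "touch less" than half a byte. This sketch does nothing about the repeated sequences; a general-purpose compressor run over the packed bytes would be the obvious next experiment.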
participants (2): Perry E. Metzger, VACCINIA@UNCVX1.OIT.UNC.EDU