sander@ankh-morpork.hacktic.nl (Sander Plomp) wrote:
Doesn't this mean everyone can detect that data is hidden by decompressing and recompressing? If the recompressed file is smaller, you know data was hidden and it can be extracted using gunzip -s.
In contrast, data hidden in the LSB of sound samples or pictures cannot be detected. The reason seems to be that gzip compression is non-lossy, while most stego-tricks work by introducing a slight amount of noise-like `damage' to the data used as hiding place. You need to lose a bit of information to make room for the secret data.
So it's a nice idea but it doesn't really work....
Actually it's not quite so simple to detect. gzip lets you specify the level of compression that you want to use, so simply uncompressing and recompressing won't necessarily give you a file of the same size unless you happened to specify the same compression level. The size might also depend on which version of gzip was used to compress it. You could probably detect hidden data by looking for nonuniform compression in the file, but you'd have to write a special program to do that. In any case, it's not as simple as just decompressing and recompressing.
A better method of hiding data would be this: In normal compression, when a duplicate string is found in the data, it is replaced with a pointer to the last occurrence. However, if there is a string with two previous occurrences within a short enough distance, the offset could be set to point to either one, and the choice between them can carry one hidden bit. As long as the offsets aren't too far apart, using one doesn't take any more space than using the other. In this way, data can be hidden without making the compressed file any larger. Of course, it could still be detected because gzip doesn't normally compress that way, but the person looking for the data would need special software to do it.
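To make the offset trick concrete, here is a rough sketch in Python. It is not gzip or real deflate: it is a toy LZ77 coder I made up for illustration, and the names find_matches, encode and decode are hypothetical. It assumes a 255-byte window so that every distance would fit in one byte and cost the same, and it only ever chooses between the two nearest equally long matches. The decoder rebuilds the data first, then replays the same match search to see which of the two candidates the encoder picked.

```python
def find_matches(data, pos, window=255, min_len=3, max_len=18):
    """Return (best_len, distances of all earlier starts that reach best_len)."""
    best_len, dists = 0, []
    for start in range(max(0, pos - window), pos):
        length = 0
        while (length < max_len and pos + length < len(data)
               and data[start + length] == data[pos + length]):
            length += 1
        if length >= min_len:
            if length > best_len:
                best_len, dists = length, [pos - start]
            elif length == best_len:
                dists.append(pos - start)
    return best_len, sorted(dists)  # nearest candidate first


def encode(data, bits):
    """Greedy toy LZ77; each two-way choice of offset hides one bit."""
    tokens, pos, used = [], 0, 0
    while pos < len(data):
        length, dists = find_matches(data, pos)
        if length == 0:
            tokens.append(data[pos])  # literal byte, stored as an int
            pos += 1
            continue
        choice = 0  # default: nearest match, as a plain coder would pick
        if len(dists) >= 2 and used < len(bits):
            choice = bits[used]  # 0 = nearest, 1 = second nearest
            used += 1
        tokens.append((dists[choice], length))
        pos += length
    return tokens, used


def decode(tokens):
    """Rebuild the data, then replay the search to recover the choices."""
    out = bytearray()
    for tok in tokens:
        if isinstance(tok, tuple):
            dist, length = tok
            for _ in range(length):
                out.append(out[-dist])  # byte-wise copy handles overlaps
        else:
            out.append(tok)
    data = bytes(out)
    bits, pos = [], 0
    for tok in tokens:
        if isinstance(tok, tuple):
            _, dists = find_matches(data, pos)
            if len(dists) >= 2:
                bits.append(dists.index(tok[0]))
            pos += tok[1]
        else:
            pos += 1
    return data, bits


data = b"the cat sat on the mat. " * 10
secret = [1, 0, 1, 1]
tokens, used = encode(data, secret)
plain, hidden = decode(tokens)
```

The decoder cannot tell where the hidden bits stop, so hidden comes back padded with zeros and only the first used bits mean anything. The token stream has the same number of tokens whether or not anything is hidden, which is the point of the scheme. In real deflate the cost of a distance depends on its distance code, so the "not too far apart" condition would have to be checked against that instead of a fixed window.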