Dictionary searching code

Does anyone have some code that will search a dictionary, and tell me *quickly* if an arbitrary chunk of text is in the dictionary? Pre-indexing steps are fine, as is using big chunks of disk for hash tables. The point of course, is to check arbitrary possible plaintext that a test decryption produces. Adam -- "It is seldom that liberty of any kind is lost all at once." -Hume

On Fri, 19 Apr 1996, Adam Shostack wrote:
Does anyone have some code that will search a dictionary, and tell me *quickly* if an arbitrary chunk of text is in the dictionary? Pre-indexing steps are fine, as is using big chunks of disk for hash tables. The point of course, is to check arbitrary possible plaintext that a test decryption produces.
You could try using isite (see http://www.cnidr.org/), which is a pretty cool search engine, and should work well enough, and the patrie structure could make restarts really fast . The real answer to your question depends almost entirely on the machine you wish to run it on- is memory not a problem? If so, tries may be your best bet, though you may have bad cache interactions. Otherwise, you might be best going for a probabalistic approach and using hash table to elimate definite non-matches, then an AVL-Tree or similar for confirmation. If you just use a single bit for each hash-table datum, you can afford to make the table pretty sparse Simon --- They say in online country So which side are you on boys There is no middle way Which side are you on You'll either be a Usenet man Which side are you on boys Or a thug for the CDA Which side are you on? National Union of Computer Operatives; Hackers, local 37 APL-CPIO

I have some anagram code that could be easily adapted to do what you say. Basically, it will find any anagram of a word exists in a dictionary. This means you can query an arbitrarily large dictionary at
100 words per second.
Actually, now that I think about it, it takes 2 seeks, but you could remove one of them if you were doing a lot of queries. (i.e. 1+n seeks for n=number of words.) Look at ftp://vivarin.res.cmu.edu/pub/scram mike

On Fri, 19 Apr 1996, Adam Shostack wrote:
Does anyone have some code that will search a dictionary, and tell me *quickly* if an arbitrary chunk of text is in the dictionary? Pre-indexing steps are fine, as is using big chunks of disk for hash tables. The point of course, is to check arbitrary possible plaintext that a test decryption produces.
There are several serachable dictionaries on the web. That might be a good place to look for search code.
Adam
-- "It is seldom that liberty of any kind is lost all at once." -Hume
--- My preferred and soon to be permanent e-mail address:unicorn@schloss.li "In fact, had Bancroft not existed, potestas scientiae in usu est Franklin might have had to invent him." in nihilum nil posse reverti 00B9289C28DC0E55 E16D5378B81E1C96 - Finger for Current Key Information Opp. Counsel: For all your expert testimony needs: jimbell@pacifier.com
participants (4)
-
Adam Shostack
-
Black Unicorn
-
Michael B Herf
-
Simon Spero