Recently, I've been making and breaking pencil-and-paper ciphers specifically to acquaint myself with the art of cryptanalysis. I've developed methods for solving Vigenère, general transpositions, Playfair, etc., basically just bringing myself up to speed with the general background of the field. And then, with my brain sharpened by the classics, I've been returning to things I formerly thought were "really neat ideas" in crypto, but for which I couldn't formulate a good reason why they would be secure, and demolishing them.

On most of them, the attacks have turned out to be embarrassingly simple. On this one, although I found an attack where known plaintexts or chosen plaintexts could demolish it, to a cryptanalyst guessing cribs it's pretty resistant (it requires depressingly long cribs of a depressingly specific form unlikely to be found at random).

Anyway, any fans of pencil-and-paper systems out there, enjoy!

Bear

The Overlapping Additive Nomenclator

(Pencil-and-paper cipher; believed secure against amateur pencil-and-paper cryptanalysts but insecure against computer-assisted or really good pencil-and-paper cryptanalysts -- falls under the heading of "a cool idea that didn't hold up under cryptanalysis")

This is a pencil-and-paper cipher, moderately difficult to use and extremely difficult to solve by pencil-and-paper cryptanalysis. I specifically do not assert that it is computationally secure. The cryptanalysis explaining why not follows at the bottom of this paper.

In this cipher, the key is a list of numbers, each number corresponding to one cleartext symbol. The following property must hold true of these numbers: for some number N (N=2 in the following example), one of the two following conditions must hold.

Either: All numbers used must differ from one another in their first N digits. If the first difference is in the Nth digit, then the difference in that digit must be at least two.

Or: All numbers used must differ from one another in at least one of the last two digits.
(In the example, both conditions are true.)

For some number D (D=3 in the following example), each code number must be at least N x D digits in length. The cipher provides sufficient confusion to "cover up" runs of identical plaintext characters of length D or smaller -- runs longer than D will start showing a discernible pattern in the ciphertext.

For a key on a small alphabet, consider the following:

    A 129754
    B 148943
    D 208531
    E 259146
    F 268953

Now, if we are to encrypt the message "DEADBEEF" we would proceed as follows:

First, the key number is written lined up beneath each plaintext letter, with N columns of digits for each column of letters. With a leading-difference system, the numbers are written starting in the column of the corresponding letter; with a trailing-difference system, the numbers are written ending in the column of the corresponding letter.

    D E A D B E E F          leading difference
         D E A D B E E F     trailing difference
    208531  148943
      259146  259146
        129754  259146
          208531  268953

Next, the numbers in each column are added, with any carries spilling over to the next column:

    208531  148943
      259146  259146
        129754  259146
          208531  268953
    --------------------
    21113564544660643553

The sum is the ciphertext.

Decryption proceeds as follows:

If we have chosen "leading difference" as our condition, then we look at the first six digits of the sum and determine which code number is closest to it but not over. We subtract that value from the whole, recovering a new sum. We write the plaintext symbol corresponding to the number we have just subtracted in place of the leading two digits, as the new sum will be two digits shorter than the old. Repeating the process until completion, we get the transformation in the left column.

If on the other hand we have chosen "trailing difference" as our criterion, then we look at the trailing two digits, pick the cleartext letter whose code number ends in these digits, and subtract that code number from the sum.
The difference will end in two zeros, which we remove and replace by the cleartext symbol whose code value we have just subtracted. Repeating the process until completion, we obtain the transformation in the right column.

    21113564544660643553      21113564544660643553
    208531                                  268953
    --------------------      --------------------
    D 260464544660643553      211135645446603746 F
      259146                              259146
    --------------------      --------------------
    D E 1318544660643553      2111356454463446 E F
        129754                          259146
    --------------------      --------------------
    D E A 21004660643553      21113564542043 E E F
          208531                      148943
    --------------------      --------------------
    D E A D 151560643553      211135643931 B E E F
            148943                  208531
    --------------------      --------------------
    D E A D B 2617643553      2111354354 D B E E F
              259146              129754
    --------------------      --------------------
    D E A D B E 26183553      21112246 A D B E E F
                259146          259146
    --------------------      --------------------
    D E A D B E E 268953      208531 E A D B E E F
                  268953      208531
    --------------------      --------------------
    D E A D B E E F                D E A D B E E F

In a real system, we could have chosen "leading difference" or "trailing difference" and made the key such that it would have been unusable in the other direction. With either mode, we can use code numbers of varying numbers of digits; the requirement of a constant number of digits holds only for systems such as the above which may be solved in either direction. With a leading-difference system, the correct method is that the numbers be lined up with their leading digits under the letter before the addition is done; with a trailing-difference system, it is important that the numbers be lined up with their trailing digits under the letter before the addition is done.

Since the "trailing difference" system need not worry about a carry digit altering the key, more symbols may be included in its alphabet. I have used N=2 here as the range of sensitivity, thinking of the Roman alphabet plus perhaps seventy code signs.
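
The enciphering and both deciphering procedures described above can be sketched in a few lines of Python. This is only an illustration under the assumptions of the worked example (N=2, six-digit code numbers of constant width); the names `encrypt`, `decrypt_leading` and `decrypt_trailing` are made up for this sketch, and working out the number of symbols from the digit count in `decrypt_leading` is an assumption of the sketch rather than part of the pencil-and-paper procedure.

```python
# Key and parameters copied from the worked example (N=2, D=3, six-digit codes).
KEY = {"A": 129754, "B": 148943, "D": 208531, "E": 259146, "F": 268953}
N = 2      # digit columns per plaintext symbol
WIDTH = 6  # digits per code number; constant here, as in the example


def encrypt(plaintext, codebook=KEY, n=N):
    """Add the code numbers, each shifted n digit columns from the previous one."""
    total = 0
    for symbol in plaintext:
        total = total * 10**n + codebook[symbol]
    return total


def decrypt_leading(ciphertext, codebook=KEY, n=N, width=WIDTH):
    """Peel symbols off the left: take the largest code not over the leading digits."""
    # Assumption: the symbol count is inferred from the digit count, which
    # only works for constant-width codes whose leading window never overflows.
    count = (len(str(ciphertext)) - width) // n + 1
    symbols = []
    for remaining in range(count - 1, -1, -1):
        shift = 10 ** (n * remaining)
        window = ciphertext // shift  # the leading `width` digits
        symbol, code = max(
            (item for item in codebook.items() if item[1] <= window),
            key=lambda item: item[1],
        )
        ciphertext -= code * shift
        symbols.append(symbol)
    return "".join(symbols)


def decrypt_trailing(ciphertext, codebook=KEY, n=N):
    """Peel symbols off the right: the last n digits identify the code."""
    by_tail = {code % 10**n: (symbol, code) for symbol, code in codebook.items()}
    symbols = []
    while ciphertext > 0:
        symbol, code = by_tail[ciphertext % 10**n]
        ciphertext = (ciphertext - code) // 10**n
        symbols.append(symbol)
    return "".join(reversed(symbols))


ciphertext = encrypt("DEADBEEF")
print(ciphertext)                    # 21113564544660643553
print(decrypt_leading(ciphertext))   # DEADBEEF
print(decrypt_trailing(ciphertext))  # DEADBEEF
```

The intermediate values of `ciphertext` inside the two decryption loops are the same partial sums shown in the left and right columns of the table.
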
If you prefer a nomenclator-like alphabet of nine thousand code signs, the same technique may be used; you will need to use N=4 instead.

I have also used a "digit overlap", or D, of three here, meaning that each digit of the ciphertext is the result of adding three digits in a sum, plus a possible carry digit. Better security may be had by increasing the digit overlap (the minimum length of each numeric element becomes N times the digit overlap), but real computational security cannot be achieved by this method; it will frustrate only pencil-and-paper cryptanalysts.

Finally, since the ciphertext is "open at its ends", it is best for security to use a predetermined number of NULL symbols at both ends of the cipher. Null code numbers, unlike the actual key numbers, need not differ from legitimate key numbers in the first or last few digits; they need only differ in the first or last few digits from the code numbers assigned to other null symbols. Leading and trailing NULL symbols make this cipher quite difficult to solve; if you keep track of your supply of leading and trailing nulls and do not reuse them, the cipher is reasonably secure against pen-and-paper cryptanalysis.

A successful method for pen-and-paper analysis, however, does exist. If a known plaintext enciphered with some key is captured, a cryptanalyst may set up linear equations constraining the values of the code numbers. If a set of D+1 consecutive symbols is used more than D times in the message under the same key, then the linear equations may be solved and will generally find the code number corresponding to at least one symbol.

Once at least a few such code numbers are known, a cryptanalyst may start looking for "cribs" consisting of a sequence of D known symbols, one unknown code symbol, and another sequence of D known symbols. Each time such a crib is found, the cryptanalyst will learn another key code number.
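
As a rough illustration of the single-unknown crib just described, here is one way the arithmetic might go in Python. It is a sketch under loud assumptions, not necessarily the procedure one would follow on paper: the code numbers are six digits wide with N=2 and D=3 as in the worked example, there are no nulls, and the attacker already knows the message length and where the crib sits in it. The helper name `recover_code` is hypothetical.

```python
# Example key from the worked example; the attacker is assumed to know every
# code number here except B's.
KEY = {"A": 129754, "B": 148943, "D": 208531, "E": 259146, "F": 268953}
N = 2      # digit columns per symbol
WIDTH = 6  # digits per code number (N x D with D = 3)


def encrypt(plaintext, codebook=KEY, n=N):
    """Overlapping additive encipherment, as in the worked example."""
    total = 0
    for symbol in plaintext:
        total = total * 10**n + codebook[symbol]
    return total


def recover_code(ciphertext, length, pos, left_codes, right_codes, n=N, width=WIDTH):
    """Recover the unknown code number at index `pos` of a `length`-symbol message.

    left_codes / right_codes hold the known code numbers of the D symbols
    immediately before / after `pos`, in message order.
    """
    d = len(right_codes)
    # Drop every digit column to the right of the last known symbol.
    truncated = ciphertext // 10 ** (n * (length - 1 - (pos + d)))
    # Subtract each known neighbour at its own column offset.
    for offset, code in enumerate(reversed(left_codes), start=1):
        truncated -= code * 10 ** (n * (d + offset))
    for offset, code in enumerate(right_codes, start=1):
        truncated -= code * 10 ** (n * (d - offset))
    # Keep only the columns from the unknown symbol rightwards, then discard
    # the low columns, which hold spill-over from the unknown text further right.
    window = truncated % 10 ** (n * d + width)
    return window // 10 ** (n * d)


message = "DEADBEEFFADED"  # "B" at index 4 plays the unknown symbol
ciphertext = encrypt(message)
left = [KEY[s] for s in "EAD"]   # the three symbols before the unknown
right = [KEY[s] for s in "EEF"]  # the three symbols after the unknown
print(recover_code(ciphertext, len(message), 4, left, right))  # 148943
```

The symbols outside the crib ("D" at the start and "FADED" at the end) are never looked up, which is the point of the exercise: the known neighbours are enough to isolate the one unknown code number.
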
A cryptanalyst may also look for "cribs" consisting of a sequence of D known symbols, two unknown symbols, and another sequence of D known symbols; each time this happens, he will obtain a linear equation constraining the values of two symbols. If more than one linear equation is found in this way constraining the values of any two symbols, the key for those symbols can be found. Then, of course, the other symbols captured in linear equations with them from other cribs may also be solved for.

What security this cipher has is a result of the abstracted and difficult form that a crib must take, and the difficulty of finding and identifying suitable cribs.

*sigh.* Oh well, it's better than a standard nomenclator or a pigpen cipher. Polyphony or polyalphabetic code substitution could of course be applied to good effect, but that's just obfuscation.

Bear