forward: cryptanalysis talk abstract

12 Apr 1993


      Thought people might find this abstract of a talk being given here at
Penn of some interest.  Please let me know if I'm wrong <grin>.
(And, no, I won't be attending; almost all of it would be over my head.
What is in this abstract is probably as much of it as I could understand
without considerable preparation <grin>).

------------------------------------------------------------------------
In article <119753@netnews.upenn.edu>, holland@central.cis.upenn.edu (Billie Holland) writes:
...
Statistical Techniques for Language Recognition:
  An Introduction and Empirical Study for Cryptanalysts
Alan T. Sherman
           Computer Science Department
         University of Maryland Baltimore County
In cryptanalysis, how can a computer program recognize when it has
discovered all or part of the secret message?  For example, how can a
program recognize character strings such as ``Attack at dawn!'',
``DES@RT ST\&RM'', or ``?tta????t d?wn'' as fragments of intelligible
messages?  In the early days of cryptology a human would perform these
language-recognition tasks manually.  In this talk I will explain how
to recognize language automatically with statistical techniques.
Statistical techniques provide powerful tools for solving several
language-recognition problems that arise in cryptanalysis and other
domains.  Language recognition is important in cryptanalysis because,
among other applications, an exhaustive key search of any cryptosystem
from ciphertext alone requires a test that recognizes valid plaintext.
Although I will focus on cryptanalysis, this talk should be relevant
to anyone interested in statistical inference on Markov chains or
applied language recognition.
Modeling language as a finite stationary Markov process, I will adapt
a statistical model of pattern recognition to language recognition.
Within this framework I will consider four well-defined
language-recognition problems: 1) recognizing a known language, 2)
distinguishing a known language from uniform noise, 3) distinguishing
unknown 0th-order noise from unknown 1st-order language, and 4)
detecting non-uniform unknown language.  For the second problem I will
give a most powerful test based on the Neyman-Pearson Lemma.  For the
other problems, which typically have no uniformly most powerful tests,
I will give likelihood ratio tests.  I will also discuss the
chi-squared test statistic $X^2$ and the Index of Coincidence $IC$.
In addition, I will present the results of computer experiments that
characterize the distributions of five test statistics when applied to
strings of various lengths drawn from nine types of real and simulated
English.
This is joint work with Ravi Ganesan.  Most of this work was carried
out while Sherman was a member of the Institute for Advanced Computer
Studies, University of Maryland College Park.
Thursday, 15 April 93
                        TOWNE BUILDING - 337
      	     3:00 - 4:30
-- 
david			david@staff.udc.upenn.edu

david＠staff.udc.upenn.edu

Eric Hughes

tags

participants (2)