Query on "secure databases"

Sun Feb 16 14:41:41 PST 1997

Hi, Peter - here's my previous posting, with an addendum, Cc: cypherpunks
At 04:43 PM 2/11/97 -0500, Peter Swire <swire.1 at osu.edu> wrote:
>        Can you point me to anything I can read on "secure databases"?  I
>hadn't hit that term before.  I am curious to what extent they are secure
>due to cryptographic approaches, or to what extent you rely on other
>mechanisms for keeping out unauthorized users.

It's been a long time since I've looked much into it, since I
was getting out of phone-company-defense-contractor mode at the time the
field was developing.  I think the NCSC Rainbow Book that deals
with secure databases is the Purple Book, though it might be the Gray.
Oracle and/or Sybase have done some work trying to do databases
at the B1 or B2 level, but I don't know details.  

I'll talk about secure databases a bit, but for the real world
there's more useful stuff that isn't related to them.

=================== Begin "Secure Database" Section ==============
At the time, the Orange Book (security for standalone computers) was
well-understood, though there weren't really any systems above B2 level.
The Red Book (security for networks) had been written, but nobody really
had a clue how to implement multi-level machines on a multi-level LAN.
You could, though, use encrypting Ethernet cards to put single-level
machines of different levels on the same LAN, if they were all administered
by the same group of people (so user-IDs and security levels
worked across the entire system.)

At the lower security levels, Orange Book techniques mainly involve 
removing bugs and adding accounting features.  At the higher levels, 
the key is to come up with a good mathematical/logical model of
interactions in your computer system, and then design a computer system
that only does the things permitted by the model and prove it does that.
Some of these systems use cryptographic techniques, including some of
the capability-based systems - see Bill Frantz's KeyKOS work, which he's
referred to on Cypherpunks occasionally. 

The objective is to have databases that can be accessed by multiple users
with different sets of security permissions.  In the military vernacular,
this mainly means having UNCLASSIFIED, CONFIDENTIAL, SECRET, and maybe
TOP SECRET data in the database, where not all users are cleared
to the highest security level, and maybe projects X, Y, and Z,
where users of project X may not have a need to know for project Y, etc.
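
A minimal sketch of that access rule, assuming a simple "level plus
project set" label on both users and data.  The Label class and the
may_read function are names I made up for illustration, not anything
from a real product:

```python
from dataclasses import dataclass

# Ordered hierarchy, lowest to highest, using the levels named above.
LEVELS = ["UNCLASSIFIED", "CONFIDENTIAL", "SECRET", "TOP SECRET"]


@dataclass(frozen=True)
class Label:
    """Hypothetical security label: a level plus need-to-know projects."""
    level: str
    projects: frozenset = frozenset()


def may_read(user: Label, data: Label) -> bool:
    """True only if the user's label dominates the data's label."""
    cleared_high_enough = LEVELS.index(user.level) >= LEVELS.index(data.level)
    has_need_to_know = data.projects <= user.projects  # subset test
    return cleared_high_enough and has_need_to_know


alice = Label("SECRET", frozenset({"X"}))
row_x = Label("CONFIDENTIAL", frozenset({"X"}))
row_y = Label("CONFIDENTIAL", frozenset({"Y"}))
row_ts = Label("TOP SECRET", frozenset({"X"}))
```

Here alice can read row_x, but not row_y (no need to know for project Y)
and not row_ts (not cleared that high).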

The easy problems are probably solved by now - either you do a good job
of verifying the design and bug-free-ness of your database software
so you can be sure that each request includes, and obeys, security levels,
or you use crypto to encrypt the data items for each security level
(e.g. everybody with a SECRET or PROJECT Z clearance shares a key
for that data, which isn't really ideal for non-small groups of people.)
In the crypto case the database just tries to do a good job, and you
only do operations in the database that make sense when the data is
encrypted, though that's annoyingly restrictive for many applications.
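
To give a feel for how restrictive that is, here's a rough sketch in
which the database stores only a keyed token of each value.  This swaps
in an HMAC for real encryption (so it's one-way, not decryptable), and
the keys, labels, and function names are all assumptions of mine.
Exact-match lookups by a key holder still work; partial matches and
anything needing the plaintext don't:

```python
import hashlib
import hmac

# Hypothetical shared keys, one per label; everyone cleared for a label
# would hold that label's key.
KEYS = {
    "SECRET": b"demo-key-for-secret",
    "PROJECT Z": b"demo-key-for-project-z",
}


def token(label: str, value: str) -> str:
    """Deterministic keyed token for a value under a label's key."""
    return hmac.new(KEYS[label], value.encode(), hashlib.sha256).hexdigest()


# The database holds only tokens, never the plaintext values.
stored = {token("SECRET", name) for name in ["base alpha", "base bravo"]}


def lookup(label: str, value: str) -> bool:
    """Exact-match query; the only kind that works on the tokens."""
    return token(label, value) in stored
```

lookup("SECRET", "base alpha") succeeds, but the same value under the
PROJECT Z key, or a prefix like "base", finds nothing.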

The sticky problems are aggregation-related.  It may be ok if an uncleared
person knows that military base A has missile types 1, 3, and 5,
and if the uncleared person knows where all your military bases are.
But is it ok if the uncleared person knows about _all_ the missiles
at _all_ the bases, and can tell that nobody's using Type 2 missiles
any more, and that all the Type 4 missiles got moved to New Jersey last week?
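
A toy illustration of the aggregation problem, with a made-up
inventory.  Each single-base answer is the kind of thing that might be
releasable, but running the same query over every base tells you which
missile types have dropped out of use:

```python
# Hypothetical inventory of (base, missile type) facts.
inventory = [
    ("A", 1), ("A", 3), ("A", 5),
    ("B", 1), ("B", 4),
    ("C", 3), ("C", 4), ("C", 5),
]
ALL_TYPES = {1, 2, 3, 4, 5}


def types_at(base: str) -> set:
    """Single-base query: arguably harmless on its own."""
    return {t for b, t in inventory if b == base}


def unused_types(bases) -> set:
    """Aggregate of many harmless queries: reveals types nobody uses."""
    seen = set().union(*(types_at(b) for b in bases))
    return ALL_TYPES - seen
```

With this data, types_at("A") is {1, 3, 5}, while
unused_types(["A", "B", "C"]) is {2}, which nobody ever told the
uncleared user directly.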

=============== End "Secure Database" section =================
=============== Begin "Real-World and Crypto" section =========
For the average person, though, the simple military models aren't 
really useful.  You really care more about groups of people,
though the multi-level stuff is sometimes a good way to keep 
machines from being hacked.  

But the real problem is that computers are very good at correlating
information - if two different companies know your Social Security Number,
and they share their data with each other (maybe for a price),
then they can tie together everything they each know about you.
And there's no way to stop it, except by limiting the information
you give people (which limits the transactions they'll do with you),
or contractually limiting what they can do with it (good luck),
or using multiple identities - and even then, computers are
often good at guessing that the John Doe at 1234 Main Street
is related to Jane Smith at 1234 Main Street, and maybe 
the foreign car parts Jane bought on her American Express are
for the Porsche registered to John, so there's a junk-mail opportunity.
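
How little work the correlation takes can be sketched like this.  The
records, field names, and the obviously fake SSN are invented for the
example:

```python
# Hypothetical records held by two unrelated companies, keyed by SSN.
bank = {"000-00-0001": {"name": "John Doe", "address": "1234 Main Street"}}
insurer = {"000-00-0001": {"claims": 3}}


def merge_profiles(a: dict, b: dict) -> dict:
    """Join two companies' records on the shared identifier."""
    return {ssn: {**a[ssn], **b[ssn]} for ssn in a.keys() & b.keys()}


def same_household(records: list) -> dict:
    """Guess at related people by grouping on a normalized address."""
    by_address = {}
    for rec in records:
        by_address.setdefault(rec["address"].lower(), []).append(rec["name"])
    return {addr: names for addr, names in by_address.items() if len(names) > 1}


people = [
    {"name": "John Doe", "address": "1234 Main Street"},
    {"name": "Jane Smith", "address": "1234 main street"},
]
```

merge_profiles(bank, insurer) gives one combined profile per SSN, and
same_household(people) links John Doe and Jane Smith even though they
used different names.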

There's really only one technology that lets you avoid this,
which is crypto and its relatives - though you still have to 
get people you do business with to use it, which is an uphill game.
Some of the fundamental work on this is David Chaum's papers
on Credentials without Identity (or something like that, from ~1985.)
For instance, you can have a voter registration number that's
cryptographically signed by the voting bureau, but uses blinded
signatures so they can't correlate the known good unique number
with the Peter Swire who walked in and showed a picture ID one day.
A driver's license smartcard could keep a pointer to your driving records,
but wouldn't give your True Name, and would only show a cop
that the person whose picture is on the front is a Licensed Driver,
and maybe could demonstrate that the bearer knows the card's PIN.
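
Here's a toy sketch of the blinded-signature idea, using textbook RSA
with a key far too small for real use and no padding.  The key values
and function names are my own choices for illustration, not Chaum's
notation.  The bureau signs a blinded value, so it never sees the
number it is vouching for:

```python
import hashlib
import math
import secrets

# Toy RSA key for the hypothetical voting bureau (illustration only).
P, Q = 61, 53
N = P * Q
E = 17                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent (Python 3.8+)


def digest(msg: bytes) -> int:
    """Reduce a message hash into the toy modulus."""
    return int.from_bytes(hashlib.sha256(msg).digest(), "big") % N


def blind(m: int):
    """Voter hides m behind a random factor r before sending it."""
    while True:
        r = secrets.randbelow(N - 2) + 2
        if math.gcd(r, N) == 1:
            return (m * pow(r, E, N)) % N, r


def sign_blinded(blinded: int) -> int:
    """Bureau signs without learning m."""
    return pow(blinded, D, N)


def unblind(blinded_sig: int, r: int) -> int:
    """Voter strips r, leaving an ordinary signature on m."""
    return (blinded_sig * pow(r, -1, N)) % N


def verify(m: int, sig: int) -> bool:
    """Anyone can check the signature with the public key."""
    return pow(sig, E, N) == m


m = digest(b"voter registration number")
blinded, r = blind(m)
sig = unblind(sign_blinded(blinded), r)
```

After unblinding, verify(m, sig) holds, yet the only thing the bureau
ever handled was the blinded value.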

One big issue is that the government is pushing banks to demand SSNs,
and requiring employers to demand SSNs (makes them easy Employee IDs),
and pushing medical insurers and providers to use SSNs (makes it
easy to collect Medicare data.)  You can gain a lot of privacy
just by having employers use their own employee-ID numbering system,
so the travel agent subcontracting to Corporate Travel
doesn't need your SSN on every form they fill out.
But suppose the Social Security Administration and IRS issued you a bunch
of separate numbers, that were either related using a cryptographic key,
or just randomly picked and kept in a big database (maybe pointing to
your old SSN), so you could give everybody who needs a Tax ID a different
number, and only you and the IRS could correlate them.
(To some extent, you can do this yourself by creating lots of
companies, but it's a real pain and costs a certain amount of money.)
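
The "related using a cryptographic key" option could look roughly like
this: the issuer holds a secret key and derives a different 9-digit
number for each party you deal with.  The function name, the key, and
the choice of HMAC-SHA256 are assumptions of mine, and a real scheme
would also have to deal with collisions between derived numbers:

```python
import hashlib
import hmac


def derived_tax_id(issuer_key: bytes, ssn: str, recipient: str) -> str:
    """Per-recipient number that only the key holder can recompute."""
    mac = hmac.new(issuer_key, f"{ssn}|{recipient}".encode(), hashlib.sha256)
    number = int.from_bytes(mac.digest()[:8], "big") % 10**9
    return f"{number:09d}"


issuer_key = b"hypothetical-issuer-secret"
for_employer = derived_tax_id(issuer_key, "000-00-0001", "employer")
for_bank = derived_tax_id(issuer_key, "000-00-0001", "bank")
```

The employer and the bank each get a stable number, but without the
issuer's key they can't tell that the two numbers belong to the same
person.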

=============== addendum =====================
Of course, for many transactions, the way to reduce the privacy problems
is to use bearer certificates rather than book-entry approaches -
pay cash, or digicash, rather than credit cards; use on-line
anonymous delivery of bits, or pick up stuff in person,
rather than mailing stuff to a snail address, etc.

#			Thanks;  Bill
# Bill Stewart, +1-415-442-2215 stewarts at ix.netcom.com
# You can get PGP outside the US at ftp.ox.ac.uk/pub/crypto/pgp
#     (If this is a mailing list, please Cc: me on replies.  Thanks.)