oramfs - ORAM filesystem written in Rust
Peter Fairbrother
peter at tsto.co.uk
Tue Jul 6 20:31:55 PDT 2021
On 05/07/2021 22:23, Travis Biehn wrote:
[...]
> The ORAM topic is fresh to me, maybe it's time to do a deep dive on
> the academic work. Happy for other examples or pointers to content
> that might help.
Quick - well maybe not that quick - recap. I will start with what will
seem like some digressions, but which aren't entirely so, and try and
pull the threads together. It's a bit untidy as it contains bits written
for other purposes, sorry.
The first encrypted filing systems encrypted individual files, but not
the associated metadata like file names, directory structure etc. This
concealed file contents to some extent but not eg file size, file access
patterns etc, which could be used to guess the file's contents. They did
nothing to hide file existence.
Deniable file systems, where an observer cannot see the files in the
system and the existence of files or even filing systems can only be
demonstrated by use of a key, began with Rubberhose in about 1997 (an
Assange project, though aiui it was Suelette Dreyfus who did the work).
They were formalised as steganographic file systems by Ross Anderson,
Roger Needham and Adi Shamir in 1998, and there were a few papers
shortly after which didn't contribute much more.
These file systems were on private storage, and the idea was that an
attacker would get a single snapshot of the raw encrypted and steganised
file system in the private storage, through eg an evil-maid attack or
seizure, to base his attack on.
If a key was given or coerced then it would be cryptographically
difficult for an attacker to say that there were more keys which
concealed other files.
This works, sort-of, in the "gimme the keys to access the files or go to
jail" law-enforcement case, but not so well in the Rubberhose "gimme the
keys and I'll stop torturing you" torture case - a rational torturer,
who can never be sure he has extracted every key, may just keep on
torturing forever.
In 2004 Truecrypt filled an entire storage space with random data, then
partially overwrote it with the encrypted files and metadata - sometimes
as a single encrypted blob (a volume), which could itself be hidden
inside the free space of another blob. When properly set up this could
both provide obscured file existence and steganographic file hiding.
With the growth of cloud storage it became desirable to create a file
system which provided file existence hiding when the main storage is
public (as opposed to private but subject to snapshot copying) and an
attacker could see all the block reads and writes.
In order to get file existence hiding when the storage is public an
important extra requirement is file access hiding, as otherwise the
existence of files can be deduced from access patterns.
As block accesses are public, we have to somehow separate block accesses
from file accesses. There are other considerations too: for instance an
attacker can replay the file system to any previous state, so we should
avoid the situation where a key which accesses a deleted file is still
in use and may be given up, demanded or coerced.
Ongoing attempts have been made, by myself and others, to extend the
techniques used in steganographic file systems to create observable
deniable/steganographic filing systems using public storage, but so far
they have failed.
The idea of oblivious transfer, where one piece of information is sent
but the sender does not know which piece, had been around for a while,
and in 1981 Michael Rabin showed it was possible to do it. This
developed into the wider field of PIR, private information retrieval.
This seems an obvious candidate for our
deniable-file-system-with-public-storage, but unfortunately in
single-database PIR the database server has to do computation over the
stored data rather than just serve up existing file blocks, which makes
it unsuitable for use in many cases.
[PIR could perhaps be used to create an untraceable messaging service on
the twitter-message-length scale. The problem here is scaling, because
in order to be oblivious the system has to look at each bit of data in
the system for each oblivious read. However it might be possible to do
some one-look-for-multiple-reads trickery.]
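To illustrate that look-at-everything cost, here is a minimal sketch of the classic two-server XOR-based PIR scheme (nothing from this thread; the database contents, record size and function names are invented for the example):

```python
import secrets

# Toy two-server information-theoretic PIR over single-byte records.
# Both servers hold identical copies of the database and are assumed
# not to collude.
DB = [7, 13, 42, 99, 5, 21, 8, 64]

def server_answer(db, selection):
    # The server XORs together every record whose bit is set in
    # `selection`. Note it must consider the WHOLE database on every
    # query - the linear-scan cost mentioned above.
    acc = 0
    for record, chosen in zip(db, selection):
        if chosen:
            acc ^= record
    return acc

def pir_read(i, db_a, db_b):
    # Client: send a uniformly random subset to server A, and the same
    # subset with position i flipped to server B. Each server sees only
    # a uniformly random bit-vector, so learns nothing about i; the XOR
    # of the two answers cancels everything except record i.
    sel_a = [secrets.randbelow(2) for _ in db_a]
    sel_b = list(sel_a)
    sel_b[i] ^= 1
    return server_answer(db_a, sel_a) ^ server_answer(db_b, sel_b)

assert pir_read(2, DB, DB) == 42
```

The batching ("one-look-for-multiple-reads") trickery would amount to answering many such queries from a single pass over the data.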
Around 1990 Oded Goldreich, and later Rafail Ostrovsky, came up with the
idea of oblivious RAM. This, from Wikipedia, gives some idea: "A Turing
machine (TM) is said to be oblivious if for any two inputs of the same
length, the motions of the tape heads (=the pattern of reads and writes)
remain the same."
This was initially meant to be used for software protection, preventing
an attacker with access to RAM from learning about the operation of the
software by analysing RAM reads and writes with different inputs.
Note however that the Wikipedia quote only compares pairs of inputs;
other methods of attack, eg an examination of a single pattern of reads
and writes on its own, are not covered by it, and in practice the
motions of the tape heads are not identical but only probabilistically
indistinguishable.
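To make the idea concrete, here is a toy sketch of the simplest possible ORAM - the "trivial" linear-scan ORAM, in which every logical access physically reads, re-encrypts and rewrites every block, so the observed pattern is identical whatever the client does. The class name, the XOR-pad "encryption" and the block size are invented for illustration; real ORAM constructions (eg tree-based ones) get the per-access cost down from touching all n blocks to polylogarithmic:

```python
import secrets

class TrivialORAM:
    # "Public storage" holds pad-XOR-value; the pads stay client-side.
    # `trace` records what an observer of the storage would see.
    def __init__(self, n_blocks):
        self.pads = [secrets.randbits(8) for _ in range(n_blocks)]
        self.store = [p ^ 0 for p in self.pads]  # blocks start as encrypted zeroes
        self.trace = []

    def access(self, index, new_value=None):
        result = None
        for j in range(len(self.store)):          # touch EVERY block
            value = self.store[j] ^ self.pads[j]  # read and decrypt
            if j == index:
                if new_value is not None:
                    value = new_value
                result = value
            self.pads[j] = secrets.randbits(8)    # fresh pad each time
            self.store[j] = value ^ self.pads[j]  # re-encrypt and write back
            self.trace.append(("rw", j))
        return result
```

Whether the client reads block 0 or writes block 3, the trace is the same full sweep of read-then-write over every position, and the re-encryption makes the written ciphertexts look fresh each time - which is exactly the separation of block accesses from file accesses described above, bought at the cost of touching everything.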
An ORAM is actually better suited to input protection by obfuscation
than software protection.
Then came the great idea: why not use an ORAM (an ORAM is a program in a
TM or computer, not an algorithm) to connect the private workings of the
user's computer to public storage (eg the cloud), with the public
storage playing the part of the RAM in the ORAM model, and thus create
an observable deniable/steganographic filing system?
Unfortunately I don't know of any ORAM-based OSFS (observable
steganographic file system) which actually works in practice. I haven't
looked in detail at the one which began this thread, though.
Problems of bolting ORAMs onto computers include the fact that file
accesses are not all the same length, nor do they share the same size
and timing distributions, and they vary by user (we would also like to
obscure which user is in play) - and thus it is very hard to set default
overheads. Also, the assumptions behind a theoretical ORAM may be very
different from the requirements of a practical file-existence-hiding
program.
And probabilistic indistinguishability, a requirement for a secure ORAM,
is a slippery concept especially when eg the total quantity of data
available to an attacker is open-ended.
Peter Fairbrother