We need to build, share, and use stuff like oramfs more. I don't know what to say to cause that.
Yes, the development, open distribution, and use of tools like ORAM-FS are important.
Here's where I'm at:
A frame: just one example is the difference between Windows' early NTFS file encryption and TrueCrypt's approach. With NTFS encryption the structure of the filesystem was not encrypted, so an adversary could see all the filenames and metadata but no content. With a TrueCrypt volume an adversary sees only an opaque blob.
An adversary can watch read/write access to a TC-like blob (a non-ORAM encrypted FS) and determine what filesystem is in use. The attacker might then guess at the boundaries of individual files, determine the specific implementation of the filesystem (down to a specific version), identify the operating system writing to it, and tell when some typical files are being written to or read from. If you don't hook any commodity software up to the ORAM-FS, then the attacker can probably at most glean the filesystem type and the boundaries of individual files. Depending on the filesystem, they may also recover more structural information.
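To make the access-pattern point concrete, here is a toy sketch in Python. Everything in it is made up for illustration (`LoggedStore`, `direct_read`, `linear_scan_read`, the block layout), and it is not how oramfs or TrueCrypt work internally. It only shows that an observer who logs which block indices get touched can tell two files apart even without seeing plaintext, and that the simplest possible oblivious scheme (touch every block on every access) makes the two traces identical. Real tools use far more efficient constructions such as Path ORAM.

```python
import os

BLOCK_COUNT = 8
FILE_A = [0, 1, 2]  # hypothetical block layout of one file
FILE_B = [5, 6]     # hypothetical block layout of another file


class LoggedStore:
    """Untrusted storage: the observer sees block indices, never plaintext."""

    def __init__(self, n):
        self.blocks = [os.urandom(16) for _ in range(n)]  # stand-in ciphertext
        self.trace = []  # what the adversary gets to record

    def read(self, i):
        self.trace.append(("r", i))
        return self.blocks[i]

    def write(self, i, data):
        self.trace.append(("w", i))
        self.blocks[i] = data


def direct_read(store, wanted):
    """Non-oblivious access: touch only the blocks that are needed."""
    return [store.read(i) for i in wanted]


def linear_scan_read(store, wanted):
    """Trivial oblivious access: read and rewrite every block, every time."""
    kept = {}
    for i in range(len(store.blocks)):
        data = store.read(i)
        if i in wanted:
            kept[i] = data
        # A real scheme would re-encrypt with fresh randomness here so the
        # ciphertext changes; this toy just writes the same bytes back.
        store.write(i, data)
    return [kept[i] for i in wanted]


def trace_of(reader, wanted):
    """Run one read on a fresh store and return what the observer logged."""
    store = LoggedStore(BLOCK_COUNT)
    reader(store, wanted)
    return store.trace


print(trace_of(direct_read, FILE_A) == trace_of(direct_read, FILE_B))            # False
print(trace_of(linear_scan_read, FILE_A) == trace_of(linear_scan_read, FILE_B))  # True
```

The cost is obvious: every access touches the whole store. That trade-off between hiding the pattern and paying for it is the thing the efficient ORAM constructions try to improve.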
This is fun =) I can get into funny states of mind on topics like this, so if it gets weird I'm sorry. As you quoted, you can get way more information than that. I have not been through college, myself. They can train a machine learning model on common types of files and identify the file types. For many files, also the content. They can also observe your behavior via other channels to learn how it relates to your disk activity and infer things about what you are doing. To do that, they have to think of it, realise that it's possible, and research it. I don't see a clear benefit when the files being read and written are a variety that your attacker can't predict (a mix of non-standardized, mission-specific artifacts). But I do see an advantage if they can predict them.
They can theoretically classify your different file types and uses based on the different access patterns those types and uses have. Once classified, if they have another channel or the usage between classes has meaning, they can begin analysing or predicting content and use to some degree. To do that, they have to think of it, realise that it's possible, and research it. It looks like access patterns are really useful when the domain of the data is constrained in structure and type, or perhaps in the access domain (e.g. search); medical records and emails are examples.
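As a rough illustration of classifying use from access patterns alone, here is a small Python sketch. The feature (`sequential_fraction`), the 0.8 threshold, the two labels, and the example traces are all assumptions picked for the example, not anything measured; a real attacker would presumably use richer features and trained models.

```python
def sequential_fraction(trace):
    """Fraction of consecutive accesses that step to the very next block."""
    if len(trace) < 2:
        return 0.0
    steps = sum(1 for a, b in zip(trace, trace[1:]) if b == a + 1)
    return steps / (len(trace) - 1)


def guess_workload(trace, threshold=0.8):
    """Crude guess: mostly sequential looks like streaming one large file."""
    if sequential_fraction(trace) >= threshold:
        return "streaming"
    return "random-access"


# Hypothetical observed block indices; no plaintext is involved at all.
print(guess_workload(list(range(100, 120))))       # streaming
print(guess_workload([7, 300, 12, 45, 301, 9]))    # random-access
```

Even a feature this crude separates "played a video" from "queried a database", which is why a constrained data domain makes the attacker's job easier.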
That will certainly make it easier. The ORAM topic is fresh to me; maybe it's time to do a deep dive on the academic work. Happy for other examples or pointers to content that might help.
I have no academic experience myself; maybe others do. I was dissociated while posting this and may have stated something false as true. I am surprised to have written it so succinctly. Maybe that happened because I didn't review it for accuracy. Please support community software.