IQNets: OFFSystem - cursory review and conclusion

25 Oct 2019

The OFFSystem
https://en.wikipedia.org/wiki/OFFSystem
and similar systems (such as the "free speech" XOR concept it was
related to and/or based on) are kinda cool at first glance.

It uses RAID-type principles to mix blocks (of files), in order to
successfully mess up the concept of copyright on a block:

Choose a block size, say 128KiB.

Take an original "copyrighted" block A of a file, which we want to
store in our file/cache/block store.

XOR A together with some other random block (let's say, a block B
containing completely random data) - this produces a new block C
which is a mathematical version/combination/"encoding" of block A.

We add this new XOR block C into the store, and we can discard the
original block A, since by running XOR again, but on blocks B + C, we
are able to produce the original block A; of course we must store the
XOR block map somewhere too, so we remember which blocks to XOR.

The new block C is therefore quite arguably an encoding of the
original copyrighted block A.

And now, this new block C also looks like completely random data, and
statistically it is, just not when XORed with one very special (in
relation to block C) yet otherwise completely random, block B.
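
The steps so far can be sketched in a few lines of Python. This is only
a minimal illustration of the XOR round trip, not OFFSystem's actual
code; the helper name xor_blocks and the sample content are made up for
the example.

```python
import os

BLOCK_SIZE = 128 * 1024  # 128 KiB, the example block size used above


def xor_blocks(x: bytes, y: bytes) -> bytes:
    """XOR two equal-length blocks byte by byte."""
    if len(x) != len(y):
        raise ValueError("blocks must be the same length")
    return bytes(p ^ q for p, q in zip(x, y))


# Stand-in for the "copyrighted" block A: some text, zero-padded to a full block.
block_a = b"some copyrighted content".ljust(BLOCK_SIZE, b"\0")
block_b = os.urandom(BLOCK_SIZE)        # block B: completely random data
block_c = xor_blocks(block_a, block_b)  # block C: this is what goes in the store

# A can now be discarded; B XOR C regenerates it.
recovered = xor_blocks(block_b, block_c)
assert recovered == block_a
```

Note that B and C are interchangeable here: each one is "the key" for
the other, and neither reveals anything about A on its own.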

Next we wish to add another block D, copyrighted by some other party
entirely, and we happen to randomly choose block C as our "random
block for XORing", producing block E (which we add to the block
store), this time discarding block D.

Original copyrighted, now discarded blocks: A D

Block store: B (pure random data), C and E ('encodings' of A and D)

Who owns the copyrights on the various blocks?:

Block A is regenerated using B XOR C.

Block D is regenerated using E XOR C, where C was chosen completely
randomly.

Block C is an encoding of A.

Block E is an encoding of C (E XOR D gives C), so E is also, quite
evidently, an encoding of A, and in fact E { XOR D XOR B } gives A, so
the copyright holder of A can quite legitimately, within the
assumption that he has copyright over all encodings of A, claim
copyright over block E.

But this would probably not be considered satisfactory to the
copyright holder of block D, who can also with mathematical
certainty, demonstrate that block E is an encoding of block D.
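
The relationships between the five blocks can be checked mechanically.
The sketch below uses tiny 32-byte random blocks as stand-ins for A and
D (an assumption purely to keep it short; the block names follow the
text, the helper xor_blocks is made up) and verifies which XOR
combinations regenerate which original.

```python
import os

SIZE = 32  # tiny stand-in for the 128 KiB block size


def xor_blocks(*blocks: bytes) -> bytes:
    """XOR any number of equal-length blocks together."""
    out = bytearray(len(blocks[0]))
    for blk in blocks:
        if len(blk) != len(out):
            raise ValueError("blocks must be the same length")
        for i, byte in enumerate(blk):
            out[i] ^= byte
    return bytes(out)


a = os.urandom(SIZE)   # stand-in for original block A (later discarded)
d = os.urandom(SIZE)   # stand-in for original block D (later discarded)
b = os.urandom(SIZE)   # block B: pure random data
c = xor_blocks(a, b)   # block C: 'encoding' of A
e = xor_blocks(d, c)   # block E: 'encoding' of D, mixed with C

assert xor_blocks(b, c) == a     # A is regenerated using B XOR C
assert xor_blocks(e, c) == d     # D is regenerated using E XOR C
assert xor_blocks(e, d) == c     # E is equally an encoding of C ...
assert xor_blocks(e, d, b) == a  # ... and therefore, via B, of A
```

Every stored block is an "encoding" of whatever it can be XORed into,
which is exactly why both copyright holders can make a claim on E.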

---------------------------

In the "free speech" version (HTML paper courtesy grarpamp), this
principle was used with a distributed "random data" 128KiB block
store - sort of P2P, although it wasn't automated, and this was 2000,
so the link protocol was FTP. The intention was to be able to
distribute messages over the internet, exposing those messages "at
some time in the future" via "just happened to notice these blocks
ABC XOR to produce message XYZ".

---------------------------

Utility appears to be confounding copyright holders.

Utility for "free speech"? Not so much, since:

  - "speech", such as chatting and email, demands various "reasonable
    maximum" latencies
    - for email, a few minutes max latency is desirable, often much
      longer can be tolerated, but still only up to "hours"
    - for chat/text/forum, a typed message needs to appear within
      seconds, ideally milliseconds

What about storage requirements? Although some folks and systems are
designed to record all history for a chat room, and email lists tend
to work this way by default, there are many use cases for throwaway
chat rooms and flippant discussions about the weather which most
folks would never bother or want long term storage for.

XOR/RAID-type block stores, where new blocks are created and added to
the store based on randomly chosen prior blocks, have the apparent
(on first look at least) requirement that all older blocks are kept
and never deleted - there may be some other blocks in your store, or
in the block store of some peer, depending on your old block.

Secondly, unless compression is applied before storage, compressible
data is stored uncompressed (this is solvable in end user software of
course - just compress all new blocks before storing, and remember to
note the compression algorithm used).
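
A small sketch of why the compression has to happen before the XOR
step: once a block has been mixed with random data it is effectively
incompressible. The sample data and the "codec" field are assumptions
for illustration only, not part of any of the systems discussed.

```python
import os
import zlib

# Highly compressible stand-in for chat/email style content.
plain = b"the weather is nice today\n" * 1000

# Mix with random data, as the block store would.
pad = os.urandom(len(plain))
mixed = bytes(p ^ q for p, q in zip(plain, pad))

packed_plain = zlib.compress(plain)  # compress first: shrinks a lot
packed_mixed = zlib.compress(mixed)  # compress after XOR: no real gain

# Hypothetical store entry noting which algorithm was used.
entry = {"codec": "zlib", "payload": packed_plain}
assert zlib.decompress(entry["payload"]) == plain
```

So the order matters: compress, then pad to the block size, then XOR.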

Thirdly, deduplication is not part of the above systems, and would
not appear to add much utility anyway... most material is new
material and some git style content mapping/ addressing solution is
what's really wanted in an upper layer anyway.

So we have a significant unavoidable problem built in to this type of
system:

  - an ever expanding block store, where older blocks can never be
    deleted since newer blocks, in your or others' (in the case of a
    distributed store) block stores, may be XOR encodings that depend
    on them
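
The deletion problem can be made concrete with a toy block map. The
map layout below is hypothetical (the text only says that a map must
exist, not what it looks like); it uses the A/B/C/D/E example from
earlier.

```python
from collections import defaultdict

# Hypothetical block map: original block -> stored blocks XORed to rebuild it.
block_map = {
    "A": ("B", "C"),
    "D": ("E", "C"),
}


def dependents(block_map):
    """Map each stored block to the originals that need it for rebuilding."""
    needed_by = defaultdict(set)
    for original, stored in block_map.items():
        for name in stored:
            needed_by[name].add(original)
    return dict(needed_by)


def safe_to_delete(stored_block, block_map):
    """True only if no entry in THIS map needs the stored block."""
    return not dependents(block_map).get(stored_block)


needed_by = dependents(block_map)
assert needed_by["C"] == {"A", "D"}   # C is needed by both originals
assert not safe_to_delete("C", block_map)
```

Even this check only covers block maps you can see. In a distributed
store, a peer's map may reference your block, so a local "safe to
delete" answer proves nothing.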

---------------------------

Storing interesting content?:

Some content is illegal in one jurisdiction, whilst being legal in
another jurisdiction. So what about the case where you happen to be
in a jurisdiction where such content is illegal?:

Someone with access to your block store can, with enough computing
power, try XORing very large numbers of block combinations.

This can be described also as "pot luck" - do you want to entrust
your defence against local laws to pot luck?
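
A rough sketch of such a search, under the assumption that the searcher
already knows a SHA-256 digest of the content being hunted for, and
only tries pairs of blocks. The function names are made up; a real
search would also try triples and more, where the combination counts
(see math.comb) grow very quickly.

```python
import hashlib
import os
from itertools import combinations
from math import comb


def xor_blocks(x: bytes, y: bytes) -> bytes:
    """XOR two equal-length blocks byte by byte."""
    return bytes(p ^ q for p, q in zip(x, y))


def find_pairs(store, target_digest):
    """Return index pairs (i, j) whose XOR hashes to target_digest."""
    hits = []
    for (i, x), (j, y) in combinations(enumerate(store), 2):
        if hashlib.sha256(xor_blocks(x, y)).digest() == target_digest:
            hits.append((i, j))
    return hits


# Toy store: six unrelated random blocks, plus one key/encoding pair.
SIZE = 64
known = b"known block".ljust(SIZE, b"\0")
key = os.urandom(SIZE)
store = [os.urandom(SIZE) for _ in range(6)] + [key, xor_blocks(known, key)]

hits = find_pairs(store, hashlib.sha256(known).digest())

# Number of pair and triple combinations for a 10,000 block store.
pairs_to_try = comb(10_000, 2)
triples_to_try = comb(10_000, 3)
```

Here the search finds the last two blocks of the toy store. Whether
that happens in a real store depends on what else happens to be in it,
which is the "pot luck" part.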

Of course, blocks can be hidden with encryption, but then we've
already got VeraCrypt, so what does the extra XOR/RAID layer add?
Nothing AFAICT.

And if you're storing copyrighted content in your block store to
which you simply don't have lawful right to access/view/share,
then your block XOR maps must be stored somewhere anyway, with an
application that can read and present you with your illicit copyright
infringing library of content, for local viewing. That app could put
a password on that, but now we're back to password management,
crypto password salting, key blocks for multiple keys, etc. - i.e.,
we're back to VeraCrypt/LUKS, and again, why bother with OFFSystem?

---------------------------

Conclusion:

XOR/RAID-stripe-type block mixing appears potentially great for
confounding copyright holders in an actual court case.

This could be a good way for a library to store their digital media -
"no Judge, if we remove those copyrighted blocks, we are removing the
encodings of many other digital works we store; can't do, won't do".

"Potentially contentious" media would be consistently followed by a
bunch of other media, all intermingled (this is another time factor
problem - in OFFSystem, the earlier a block is stored, the more later
blocks come to depend on it).

Zenaan Harkness