[ot][spam][personal] Behavioral/Rambling Brownian Motion towards Subgoals

Karl gmkarl at gmail.com
Mon Jul 19 05:33:18 PDT 2021


0756

Regarding backing up, it will be slow to upload data from a home network to
filecoin.  I'm thinking of setting up some way of securing it with a hash
before it uploads.  I wonder if there are existing tools to make a merkle
tree out of a binary blob.

I think I will make my own, or use libgame's.

I'm thinking of streaming data from a disk, hashing every 32GiB chunk as it
comes in.  This could be done with a script.
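
A rough sketch of that script, assuming GNU dd and sha256sum and a
placeholder source disk /dev/sdX:

# read the disk in 32 GiB chunks; hash and store each chunk until the read is empty
i=0
while true; do
  chunk=$(printf 'sdX-%04d' "$i")
  dd if=/dev/sdX of="$chunk" bs=1M count=32768 skip=$((i * 32768)) 2>/dev/null
  [ -s "$chunk" ] || { rm -f "$chunk"; break; }   # stop on an empty read
  sha256sum --tag "$chunk" >> hashes.txt          # record the chunk's hash
  i=$((i + 1))
done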

We'd have a format for the hash output.  It could either be a single
updating document, or a tree.

This would be a hashed but unsigned document.  But maybe I could sign it
anyway so as to quickly verify it with familiar tools.

Alternatively, the various sum tools have a verify mode.  They use a
detached document.
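
For example, the hashes.txt from the sketch above is exactly such a
detached document (the name is my placeholder):

sha256sum -c hashes.txt   # prints OK or FAILED per chunk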

The goal is to reduce the time window during which viruses can mutate data.

I may simply be paranoid and confused.  In fact, I am usually these two
things.

0801

Thinking on a chunk of data coming in.  Hash the data and give it a name.

Datasource-offset-size-date

That seems reasonable.  Then we can make verification files of those files,
I suppose ...

0803 ...

Datasource-index-size-date

Seems more clear and useful.
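
For example, a made-up name under that scheme (source sdX, index 0003,
32 GiB, today's date):

sdX-0003-34359738368-20210719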

This is similar to git-annex ...

0804 ...

0806

Here's a quick way to make a secure document hash:

for sum in /usr/bin/*sum; do "$sum" --tag "$document"; done 2> /dev/null

Now, considering filecoin, it would be nice if we also had the deal
information.
The fields of interest include:
- numeric deal id
- payload cid
- piece cid?
- miner address?
- sender address?
- deal cid?  hard to find or use this with the interface, seems most relevant
- block hash?

Deal information can be found with `lotus client get-deal`
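
A sketch of capturing that beside the hashes (I'm assuming get-deal takes
the proposal cid; the output name reuses the made-up chunk name from above):

lotus client get-deal <proposalCid> > sdX-0003.deal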

0810

0811

It seems reasonable to provide the DealID and Proposal Label from client
get-deal.

So.  Say we get a new block of data.
If the data is in small blocks, we would want to add this information, say
the DealID, to some hash file.

I guess the deal CID would make more sense in a hash file =S maybe I can
ease myself by using some xxxx looking label.  The ProposalCid seems to be
the same as the deal CID, unsure.

I could also put a deal id into the filename, but then the old hashes don't
immediately verify.

Thinking a little of a document containing nested documents.  It could hash
itself, and append further data, and append a hash of above data ...  then
we could add deal ids farther down and still have the same canonical
previous material.
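
A sketch of that append-only shape, with sha256sum and a placeholder file
named manifest:

# the appended hash line covers everything already in the file, since
# sha256sum reads the whole file before its output line gets appended;
# the earlier material stays canonical while new lines go below it
sha256sum manifest >> manifest
echo "DealID 1234567" >> manifest   # a made-up deal id appended later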

0817 .

0818

I looked into using gpg --clearsign, but I don't like how the encoding of
the pgp data makes it hard to debug corruption.  The hash is not readily
visible.

I'm considering ...

0820, inhibition

Considering normal tools for breaking a file based on boundaries.  Nested
text files.

I'm wanting it to be easy to break out the right parts of the files to
verify things, when I am confused.  I'd like to be able to verify only some
of the files and hashes, when I am on a system with limited tools.  This
means extracting the document's own verification document out, to verify it
with.

Now thinking it is overcomplicated.

Basic goal:  easily check that the data I uploaded is the same as what I
read from the drive, at the time I read it.  So one hash, and then a way to
check that hash.  We download data from the network, maybe years from now,
and check the hash.

I guess it's an okay solution.

Considering using a directory of files.
Then as more documents are added, a hash document can extend in length, or
a tarball can be used.

Considering hashing only certain lines of files.  Some way to generate
subfiles from longer files.  Generate all the subfiles, and you can check
everything.

We could have a filename part that indicates it is lines from another
file.  Or greps, even.
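
A hedged sketch of that convention, with sed and made-up names:

# materialize "lines 5 through 9 of manifest" as its own checkable subfile
sed -n '5,9p' manifest > manifest.lines-5-9
sha256sum --tag manifest.lines-5-9 >> hashes.txt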

0826

While I'm making this confused writing, my system is generating 64GiB of
random data to check whether filecoin automatically breaks pieces larger
than its maximum piece size.  I should have used zeros, but it's mostly
limited by write speed, I expect.  It's hit 38GiB so I'll test it.
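
The generation is roughly this shape (the output name is a placeholder):

# 64 GiB of pseudorandom data in 1 MiB blocks
dd if=/dev/urandom of=testblob bs=1M count=65536 status=progress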

0827

0828

I told lotus to import the data while it was being generated, hoping it
will import truncated data rather than failing.  It's silently spinning the
CPU.
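
The import itself was just the plain form, something like this (the path is
a placeholder, and I'm assuming no extra flags are needed):

lotus client import ./testblob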

Regarding a quick hash document, let's see.

Import data in chunks to a folder.  Hash data into a new file.  Hash all
hashes into a new file.
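
With made-up names, that might look like:

for f in chunks/*.chunk; do                  # import data in chunks to a folder
  sha256sum --tag "$f" > "$f.checkable"      # hash data into a new file
done
sha256sum --tag chunks/*.checkable > all.checkable   # hash all hashes into a new file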

Maybe that works.  Using new files, the system is more clear.  How does
verification work?

We roughly verify files in reverse order to their creation, and we verify
only verifiable files.

So it's possible a system could be checked with something like
check $(ls *.checkable | tac) | grep -v missing

where check runs through /usr/bin/*sum or such
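
One possible shape for check (a sketch; tools that don't match a file's
format just error into /dev/null):

check() {
  # try every sum tool's check mode on each hash file;
  # only the matching tool prints OK/FAILED lines
  for f in "$@"; do
    for sum in /usr/bin/*sum; do
      "$sum" --check "$f" 2>/dev/null
    done
  done
}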

0832

muscle convulsions associated with plans and decisions

0833

