Here are some specifics on the data types, surprises, and questions:

The originating party requested data services with a 100% onload
guarantee, specifically indicating that the source is an analogue signal
digitisation system without sufficient buffer capacity.
Initial requests were for a linear buffer but then changed to block file
storage and public NAS capability. A similar request for SQL or
distributed database storage in cloud hosting was also fielded by many

Data structures are standard floats in spherical coordinates for 4D
vectors; most of the formats include some reference table indexes, and
a "small" selection of sample data shows some distinct ranges:

Time is an offset (not Unix time), close to a western military standard,
but varies in density.

The precision of the floats in the 3D vectors is trimmed, indicating a
specific physical resolution.
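
As a rough sketch of how the trimmed precision could be turned into a
resolution estimate, the Python below looks for the smallest gap between
distinct values of one coordinate and converts a spherical vector to
Cartesian form. It assumes the values are available as plain numbers and
that the convention is (r, polar angle, azimuth) in radians; neither is
confirmed by the dumps, and the function names are illustrative only.

```python
import math

def spherical_to_cartesian(r, theta, phi):
    """Convert (r, polar angle theta, azimuth phi) to (x, y, z).

    Angles are in radians. This convention is an assumption, not
    something established from the data.
    """
    x = r * math.sin(theta) * math.cos(phi)
    y = r * math.sin(theta) * math.sin(phi)
    z = r * math.cos(theta)
    return x, y, z

def quantisation_step(values):
    """Smallest gap between distinct values, or None if fewer than two.

    If the values were trimmed to a fixed step, the true step divides
    this gap, so the result is an upper bound on it.
    """
    distinct = sorted(set(values))
    gaps = [b - a for a, b in zip(distinct, distinct[1:])]
    return min(gaps) if gaps else None
```

On a small sample this only gives an upper bound; a larger selection is
needed before reading the result as a physical resolution.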

One of the electronic signal log files includes a standard signal
characteristic for antenna direction in addition to the location vector,
typical of cell and e-war systems. It also includes values that may be
the rate of signaling or CPU speed(?).

Most of the data uses index values; the range is linear, 0..count.
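
A minimal check for that kind of index column might look like the
following. It assumes "0..count" means the values 0 through count-1 with
no gaps or repeats, which is a guess at the intended range, and the
function name is hypothetical.

```python
def is_dense_index(indexes):
    """True if the values are exactly 0, 1, ..., count-1 in any order."""
    values = list(indexes)
    seen = set(values)
    # No repeats, and the set of values matches the full range.
    return len(seen) == len(values) and seen == set(range(len(values)))
```

A column that fails this check would need a closer look before being
treated as a simple row number.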

Some of the data uses both an index and a unique identifier. Another set
uses a large bit scope value assumed to be a hash, but its structure
has been identified as a structured tree, possibly a known standard
(described below).

For each structure type, there are additional values related to the
signal characteristics and some indexing/classifier but none related
to an identifiable pattern other than sequentially loaded index tables.

We are very concerned about the consistency of the data. One must
assume that a full SPOOF is possible with calculated generation;
however, some selections map accurately into adjusted-coordinate 3D
structures such as office buildings, houses, and viable speed tracking
on highways. A party with direct access is preparing maps. Our
interest is to prepare distributed processing techniques to
consolidate rendering of the entire snapshot.

One set is obviously electronic device data, another is most likely
EM(?) tracking of coin and currency objects, another includes more
precise vectors and a large unique identifier value and is extremely

There is no statistical anomaly of missing data per region (coverage
of the entire planet). The density of records is consistent, and in all
small selections the data has high correlation with physical locations
including terrain and structures, aircraft routes, highway speeds, and
typical patterns at an accuracy that would require the same knowledge
to artificially generate.
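
One simple way to look for missing regions is to bin records into a
coarse grid and list the cells with no records at all. The sketch below
assumes each record has already been reduced to a (latitude, longitude)
pair in degrees, which is not how the dumps store it, and the names and
the 10 degree cell size are arbitrary choices for illustration.

```python
def cell_of(lat_deg, lon_deg, cell_deg, rows, cols):
    """Map a latitude/longitude in degrees to a (row, col) grid cell."""
    row = min(int((lat_deg + 90.0) // cell_deg), rows - 1)
    col = min(int((lon_deg + 180.0) // cell_deg), cols - 1)
    return row, col

def empty_cells(points, cell_deg=10.0):
    """Return the set of grid cells that contain no records.

    Cells are equal-angle, not equal-area, so this is only a coverage
    check and says nothing about density per unit area.
    """
    rows = int(180.0 / cell_deg)
    cols = int(360.0 / cell_deg)
    occupied = {cell_of(lat, lon, cell_deg, rows, cols) for lat, lon in points}
    every = {(r, c) for r in range(rows) for c in range(cols)}
    return every - occupied
```

Empty cells over open ocean would be expected for some record types, so
the result needs to be read against a map rather than taken as a count.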

More importantly: the coin & currency tracking data maps FAR TOO
CLEARLY into reasonable commerce patterns: coins into and out of
*registers*, bank trucks and storage. Since faking this would require a
full 3D model and a huge computational effort to simulate global
commerce, it is more likely that a high precision radar system or SIGINT
capability is actually tracking these targets.

The large bit scope and header reference of one data set is especially

10-12 billion unique identifiers using standard genetic expression
encoding values in tree form and a related signal characteristic

Tracked at 0.25m resolution. With signals. Log density may be due to
A/D sampling resolution. Data is historical, from mid-year 2014.

On Wed, Jun 10, 2015 at 10:37 AM, Troy Benjegerdes <hozer@hozed.org> wrote:
> You don't keep 120+gbps running without some government backing you.
> I can only think this is some sort of major political statement, by
> some people with significant political (and real) capital to spend.
> Who's got the influence and money to do this, and why? I can only
> imagine it's some sort of reaction to the USA freedom act.
> So if you think your data collection system might now be illegal,
> do you open source it because it'll spill the beans on the banksters
> who double-crossed you?
> Regardless of why, how do you manage data integrity of such a large
> dump so you are not looking at intentionally manipulated data?
> On Wed, Jun 10, 2015 at 09:17:59AM -0400, Wilfred Guerin wrote:
>> Files are standard DB Table dumps (packed) loading from a cluster of
>> VPNs from torrent and NAS protocols through central europe (entry
>> providers are all in privacy-sensitive countries) and intended to be a
>> distributed database service; there is simply nothing big enough to
>> handle this onload directly. (at 120+gbps bursts) Some of the services
>> are posting public torrent data and open sql database access. Table
>> files are set up as redundant master with cross-population and
>> standard distribution techniques. Some of the tracking data appears to
>> have 1 inch resolution target vectors.
>> On Wed, Jun 10, 2015 at 8:52 AM, Griffin Boyce <griffin@cryptolab.net> wrote:
>> > Wilfred Guerin wrote:
>> >>
>> >> Some huge *meaning close to exobyte size* data sets are circulating in
>> >> storage clouds this last week, appear to be snapshots of signals
>> >> intelligence metadata including vector tracking of signals targets
>> >> (possibly cell phones based on movement vectors) and cross-associated
>> >> metadata for their communications. Indications are that these are
>> >> recon signal dumps of the american sigint system loaded by a major
>> >> organized crime syndicate and cover most of last year. There is also a
>> >> set of organic tracking signals, assumably covert agent
>> >> communications, and another set that appears to be all American and
>> >> European cash money transactions(???).
>> >
>> >
>> >   Links to more info?  Are these intended to be public, or some kind of
>> > config failure?
>> >
