[DFDL-WG] thread on bit order and byte order in mil-std-2045 and related formats
Steve Hanson
smh at uk.ibm.com
Fri Jul 25 12:23:22 EDT 2014
Do the TDL standards also include 'positional' bit indicators for optional
elements, 'repeat' indicators for repeating elements, and character
strings which are variable + <DEL> or fixed?
Regards
Steve Hanson
Architect, IBM DFDL
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh at uk.ibm.com
tel:+44-1962-815848
From: Mike Beckerle <mbeckerle.dfdl at gmail.com>
To: "dfdl-wg at ogf.org" <dfdl-wg at ogf.org>,
Date: 25/07/2014 16:16
Subject: [DFDL-WG] thread on bit order and byte order in
mil-std-2045 and related formats
Sent by: dfdl-wg-bounces at ogf.org
I've included material here from a separate email thread discussing the
bit order and byte order requirements in order to handle mil-std-2045 and
related tactical data link (TDL) standards (one is known as link16). These
TDL standards are not publicly available though limited information about
them (their names, and purposes) is available on Wikipedia.
Much of the discussion thread revolves around link16 and the additional
requirements it has above and beyond what the (publicly-available)
mil-std-2045 headers require.
Russ Hawkins (link16 SME) describes the requirement in the thread below as:
"first byte-level bit order reversal over the entire (element's data),
then individual parsing of each field and performing a bit reversal on the
post-parse field value; or the equivalent functionality of these two
cascaded bit order operations (byte and field level bit order)."
This is in fact equivalent to what
dfdl:bitOrder='leastSignificantBitFirst' means, and is how it is
implemented inside Daffodil. This transformation is required for all the
fields in MIL-STD-2045, as well as the code-points of the 7-bit characters
and 6-bit characters.
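The equivalence claimed above can be checked with a minimal Python
sketch. This is only an illustration under stated assumptions, not
Daffodil's code: the function names are hypothetical, and bit positions
are counted from the start of the data with bit 0 being the least
significant bit of the first byte.

```python
def reverse_bits(value: int, width: int) -> int:
    """Return `value` with its low `width` bits in reversed order."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def extract_lsb_first(data: bytes, bit_offset: int, width: int) -> int:
    """Read a field directly, least significant bit first.

    Assumption: stream bit 0 is the LSB of byte 0, and the first bit of
    a field is that field's LSB.
    """
    value = 0
    for i in range(width):
        pos = bit_offset + i
        bit = (data[pos // 8] >> (pos % 8)) & 1
        value |= bit << i
    return value


def extract_two_step(data: bytes, bit_offset: int, width: int) -> int:
    """Cascaded form described in the thread.

    1. Reverse the bits of every byte.
    2. Read the field left to right (most significant bit first).
    3. Reverse the bits of the extracted field value.
    """
    flipped = bytes(reverse_bits(b, 8) for b in data)
    value = 0
    for i in range(width):
        pos = bit_offset + i
        bit = (flipped[pos // 8] >> (7 - pos % 8)) & 1
        value = (value << 1) | bit
    return reverse_bits(value, width)
```

For the bytes B5 3C, both functions give 0x5 for a 4-bit field at bit
offset 0 and 0xCB for an 8-bit field at bit offset 4, a field that
crosses the byte boundary.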
The primary conclusions from this thread:
(a) no new byteOrder enum is needed - the byte/word swapping isn't a
byte-order thing, it's a pre-processing pass.
(b) 6-bit chars are needed.
Like all email threads this is in reverse chronological order, so you may
need to peruse from the bottom.
Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology |
www.tresys.com
Please note: Contributions to the DFDL Workgroup's email discussions are
subject to the OGF Intellectual Property Policy
________________________________________
From: Mike Beckerle
Sent: Thursday, July 24, 2014 5:22 PM
To: Hawkins, Russ C
Subject: RE: Bit-order - Please review sections 1 to 5 ?
Thanks Russ,
I did have the word-swap notion wrong. However, as a preprocessing pass
this doesn't need to be represented in a DFDL property quite yet.
And yes, if you have values that have to be mathematically combined with
small-ish calculations that is supported in both parsing and unparsing
directions.
...mikeb
________________________________________
From: Hawkins, Russ C [russ.c.hawkins at lmco.com]
Sent: Thursday, July 24, 2014 5:10 PM
To: Mike Beckerle
Subject: RE: Bit-order - Please review sections 1 to 5 ?
Mike,
It is only a preprocessing step optionally performed over the entire body
content. The reordering of bytes from byte order 2143 to byte order 1234
is only a preprocessing pre-parse operation (or post-processing after
unparse). Based on wording in your email below, I wanted to be sure that
this occasional need does NOT swap 16-bit word-pairs (byte order 3412 to
byte order 1234) but each pair of bytes are changing places with each
other (byte order 2143 to byte order 1234), since the below sounded like
the former.
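The distinction between the two reorderings can be sketched in Python.
The function names are hypothetical, and the assumption that the data
length is a whole number of 16-bit (or 32-bit) units is mine, not
something stated in the thread.

```python
def swap_byte_pairs(data: bytes) -> bytes:
    """Exchange the two bytes inside each 16-bit unit (2143 -> 1234)."""
    if len(data) % 2:
        raise ValueError("length must be a multiple of 2 bytes")
    out = bytearray(data)
    # Even positions receive the odd-position bytes and vice versa.
    out[0::2], out[1::2] = data[1::2], data[0::2]
    return bytes(out)


def swap_word_pairs(data: bytes) -> bytes:
    """Exchange adjacent 16-bit words (3412 -> 1234).

    Shown only for contrast: this is the reordering that is NOT wanted.
    """
    if len(data) % 4:
        raise ValueError("length must be a multiple of 4 bytes")
    out = bytearray()
    for i in range(0, len(data), 4):
        out += data[i + 2:i + 4] + data[i:i + 2]
    return bytes(out)
```

Applying swap_byte_pairs to bytes in the order 2, 1, 4, 3 yields
1, 2, 3, 4; applying it to 3, 4, 1, 2 yields 4, 3, 2, 1, which shows
that the two operations are not interchangeable.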
I took a quick look through the ... spec and came up with several fields
well over 32 bits. None appear to require integer type. Some spares are
much greater than 32 bits, off-byte bounds, and must be zeroed for
unparse. I assume that is possible. I don't think any fields are true
integers and I found no float/double reals. Many fields exist that exceed
24 bits and many others must be concatenated ... from different places in
the binary to exceed 24 bits. I assume the latter can be accommodated
with expressions that calculate the final value by multiplying and adding.
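The "multiplying and adding" idea can be sketched in Python as follows.
In a DFDL schema this would be an expression rather than code; the
function names here are hypothetical, and the choice to list the pieces
most significant first is an assumption, since the thread does not say
which piece carries the high-order bits.

```python
from typing import List, Sequence, Tuple


def combine(parts: Sequence[Tuple[int, int]]) -> int:
    """Join separately parsed pieces into one value.

    `parts` holds (value, width_in_bits) pairs, most significant first
    (an assumption made for this sketch).
    """
    total = 0
    for value, width in parts:
        if not 0 <= value < (1 << width):
            raise ValueError("piece does not fit in its declared width")
        # Multiply to make room for the piece, then add it in.
        total = total * (1 << width) + value
    return total


def split(total: int, widths: Sequence[int]) -> List[int]:
    """Inverse of combine, for the unparse direction."""
    parts = []
    for width in reversed(widths):
        parts.append(total % (1 << width))
        total //= 1 << width
    if total:
        raise ValueError("value does not fit in the given widths")
    return list(reversed(parts))
```

For example, a 4-bit piece 0x3 followed by a 24-bit piece 0xABCDEF
combines to the 28-bit value 0x3ABCDEF, and split recovers the two
pieces from that value.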
Russ
-----Original Message-----
From: Mike Beckerle [mailto:mbeckerle at tresys.com]
Sent: Wednesday, July 23, 2014 8:33 PM
To: Hawkins, Russ C
Cc: Riechel, Jay P; Christensen, Craig M
Subject: EXTERNAL: RE: Bit-order - Please review sections 1 to 5 ?
One more question.
Assuming I have correctly pre-processed the msg payloads to swap the
16-bit words, then I am now ready to extract fields one by one.
Are some of the fields now going to be 32-bit integers in this mixed
endian byte order (I was calling it littleEndian16Bit) where the first 16
bits are the least-significant 16 bits, and the second 16 bits are the
most significant 16 bits ?
Or is the reordering that involves swapping the 16-bit words really ONLY
needed by the pre-processing of the message payload?
Thanks again
...mike
________________________________________
From: Hawkins, Russ C [russ.c.hawkins at lmco.com]
Sent: Wednesday, July 23, 2014 12:58 PM
To: Mike Beckerle
Cc: Riechel, Jay P; Christensen, Craig M
Subject: RE: Bit-order - Please review sections 1 to 5 ?
Mike,
I think your email correctly characterizes the situation - as long as in
Step 2 "using the required least-significant-bit first bit order" includes
both: first byte-level bit order reversal over the entire body, then
individual parsing of each field and performing a bit reversal on the
post-parse field value; or the equivalent functionality of these two
cascaded bit order operations (byte and field level bit order).
Since you are considering performing this using a multi-pass prototype, I
will say how our parser performs all these cascaded operations we've
been talking about, but does so with a single parse or unparse pass:
(Here Russ describes a multi-pass scheme that they have implemented in
some data parsing system.
This is interesting enough in its own right that I've excerpted it into a
separate dfdl-wg email item.)
Instead of using multiple passes, ...
-----Original Message-----
From: Mike Beckerle [mailto:mbeckerle at tresys.com]
Sent: Wednesday, July 23, 2014 8:41 AM
To: Hawkins, Russ C
Subject: EXTERNAL: RE: Bit-order - Please review sections 1 to 5 ?
So I would characterize this in this way:
Step 1: A source-site specific pre-processing pass that re-orders 16-bit
words. Only done for some sites.
Step 2: Parsing the fields of the message using the required
least-significant-bit first bit order & little-endian byte order.
If I have this right, then this requires a small application, not just the
DFDL parser. The application would call the DFDL parser to parse whatever
header data, and get a byte array for the payload data that must be
(sometimes) pre-processed. The application would then decide whether to
pre-process, i.e., swap the 16-bit words of the byte array. The
application would then call the DFDL parser again to parse the actual
message fields.
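The shape of such a small application can be sketched in Python. None
of this is a real Daffodil or DFDL API: parse_header, parse_body, and
preprocess are hypothetical callables supplied by the caller, and the
site_needs_swap flag stands in for per-site configuration, since the
thread indicates the data itself carries no clear indicator.

```python
from typing import Any, Callable, Tuple


def process_message(
    raw: bytes,
    parse_header: Callable[[bytes], Tuple[Any, bytes]],
    parse_body: Callable[[bytes], Any],
    site_needs_swap: bool,
    preprocess: Callable[[bytes], bytes],
) -> Tuple[Any, Any]:
    """Run the two parser calls with an optional pass in between.

    parse_header: hypothetical first parser call; returns the parsed
        header and the payload as an opaque byte array.
    preprocess: the site-specific reordering of the payload bytes.
    parse_body: hypothetical second parser call on the (possibly
        reordered) payload.
    """
    header, payload = parse_header(raw)
    if site_needs_swap:
        payload = preprocess(payload)
    return header, parse_body(payload)
```

The parser is called twice and never sees the reordering step, so the
schema for the message fields can be written against the payload's
canonical byte order.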
This is typical of processing when data is encoded and must be decoded
before being parsed. E.g., we have the exact same issue with email
messages, where the message may have sections encoded via the base64
scheme. This must be decoded, and then the data parsed. (Another format)
has the same issue in that mil-std-2045 header can say the body is
gzipped. Ungzipping it isn't something we do (today) in the DFDL parser.
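A decode-then-parse step of this kind can be sketched with the Python
standard library. The encoding labels and the parse callable are
hypothetical; how a real header signals the encoding is not described
in this thread.

```python
import base64
import gzip
from typing import Callable, TypeVar

T = TypeVar("T")


def decode_then_parse(
    body: bytes, encoding: str, parse: Callable[[bytes], T]
) -> T:
    """Undo a transport encoding, then hand the bytes to a parser.

    `encoding` uses made-up labels ("gzip", "base64", "none") chosen
    for this sketch only.
    """
    if encoding == "gzip":
        body = gzip.decompress(body)
    elif encoding == "base64":
        body = base64.b64decode(body)
    elif encoding != "none":
        raise ValueError(f"unsupported encoding: {encoding}")
    return parse(body)
```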
Some day we hope to add this sort of multi-pass capability into DFDL -
probably DFDL v2.0 however. In the meantime this sort of thing requires
that the multiple passes are managed by a small application. We plan to
use the Daffodil implementation to experiment with features that allow
this sort of multi-pass capability.
________________________________________
From: Hawkins, Russ C [russ.c.hawkins at lmco.com]
Sent: Tuesday, July 22, 2014 7:18 PM
To: Mike Beckerle
Subject: RE: Bit-order - Please review sections 1 to 5 ?
Mike,
All msg types exhibit the same behavior. For a given site, if any msg
types require the littleEndian16Bit byte-pair swapping, they all do. It
is not unique to any one msg type. This particular byte-pair swap is
unique to the site and its supplying feed source.
Then after the above site-dependent byte-pair swap, every site does
require the byte-level bitOrder reversal though for our parser to parse
(extract) the fields left-to-right - this provides fields that cross byte
boundaries with a contiguous bit order that has all bits, including
Character fields, in the correct relative position, although
bit-reversed. Then because of that field level reversal that we
introduced, every field, including Characters (including 6-bit Characters)
must have its bits reversed to get the intended value.
I have looked over our ... msg type parse tables and verified that it
exhibits this behavior in its entire body, including its 6-bit Character
fields.
Russ Hawkins
-----Original Message-----
From: Mike Beckerle [mailto:mbeckerle at tresys.com]
Sent: Tuesday, July 22, 2014 4:26 PM
To: Hawkins, Russ C; Stephen Lawrence; Taylor Wise
Cc: Christensen, Craig M; Riechel, Jay P
Subject: EXTERNAL: RE: Bit-order - Please review sections 1 to 5 ?
Thanks Russ,
I looked over your comments.
I've got the 6-bit charset encoding. That we can add easily.
I understand from your comments that for some message types the whole
message payload (but not header) has word and byte swapping that must be
done as a pre-processing pass before the payload can be parsed into its
multiple fields.
Is there an example specific message type that has this characteristic? If
I correctly recall prior email threads, I think you said there is no clear
indicator when this pre-processing is required, but I wanted to make sure
I'm understanding this correctly.
...mikeb
________________________________________
From: Hawkins, Russ C [russ.c.hawkins at lmco.com]
Sent: Tuesday, July 22, 2014 1:12 PM
To: Stephen Lawrence; Mike Beckerle; Taylor Wise
Cc: Hawkins, Russ C; Christensen, Craig M; Riechel, Jay P
Subject: RE: Bit-order - Please review sections 1 to 5 ?
Mike,
Some preliminary comments embedded with initials RH.
Obviously a lot of thought and work went into this, however some things
were confusing to me. Noted need for 6-bit Packed ASCII encoding and
possible need for bitOrder and 16bit byte order reversals for 6-bit ASCII
and other non-numeric fields.
Russ Hawkins
-----Original Message-----
From: Stephen Lawrence [mailto:slawrence at tresys.com]
Sent: Thursday, July 17, 2014 9:24 AM
To: Mike Beckerle; Hawkins, Russ C; Taylor Wise
Subject: EXTERNAL: RE: Bit-order - Please review sections 1 to 5 ?
Made a couple of comments (attached). Some things in section 5 were a
little unclear to me.
- Steve
________________________________________
From: Mike Beckerle
Sent: Wednesday, July 16, 2014 4:25 PM
To: 'russ.c.hawkins at lmco.com'; Stephen Lawrence; Taylor Wise
Subject: Bit-order - Please review sections 1 to 5 ?
Russ, Steve, Taylor,
Attached is an internal DFDL Workgroup working document about bit order.
It is my personal internal draft as I'm the author.
I believe I now understand why bit order has been so hard to convey to
others in the DFDL workgroup.
It is because there are two different formats both of which can be
considered to be "LSB first". This document teases them apart.
All review comments gratefully accepted. If it is easiest, you can edit
the document directly and put in MS Word Comment bubbles. Then I can merge
them. Or however you prefer really.
...mike beckerle