[DFDL-WG] bit order documents updated

Mike Beckerle mbeckerle.dfdl at gmail.com
Fri Aug 1 12:46:53 EDT 2014


I agree that the most significant issue about bit order is that it isn't
really analogous at all to byte order.

Bit order is fundamentally more primitive.

I think I got fooled into thinking bit-order and byte-order were similar,
because we actually have byte-order incorrectly classified. It is in the
section of properties about both content and framing, and says it applies
to sequence, choice, and group.

But if byte order is really about simple content only, then it is not
relevant to sequence/choice/group. That is probably a hold-over from when
we had byte order affecting text for some encodings. Now that the byte order
of text is specified by the encoding only, byte order is presumably no
longer relevant to sequence, choice, and group. If that is so, then I think
we have byte order in the wrong section: if it is only about simple
content, it belongs in a subsection of section 13. We don't have a section
about "all simple types with binary representation", but that's where it
would logically sit, since it is shared by binary numbers, binary
calendars, and binary booleans. (Also note that in our enumeration of the
binary types in the description of the dfdl:byteOrder property, we forgot
binary booleans. We need the byte order if we're to compare the
binaryBooleanTrueRep/binaryBooleanFalseRep to a value coming from the data
stream when the length is more than 8 bits.)
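
As a rough illustration of the binary boolean point (a sketch, not spec
text): the helper name parse_binary_boolean and its parameters are
hypothetical, and it assumes both a true and a false representation are
supplied. Once the length exceeds 8 bits, the same bytes compare differently
against the representations depending on byte order.

```python
def parse_binary_boolean(raw: bytes, byte_order: str,
                         true_rep: int, false_rep: int) -> bool:
    """Hypothetical helper: interpret raw bytes as a binary boolean.

    Assumption: both representations are given as non-negative integers.
    """
    if byte_order == "bigEndian":
        value = int.from_bytes(raw, "big")
    elif byte_order == "littleEndian":
        value = int.from_bytes(raw, "little")
    else:
        raise ValueError(f"unknown byte order: {byte_order}")
    if value == true_rep:
        return True
    if value == false_rep:
        return False
    raise ValueError(f"value {value} matches neither representation")


# A 16-bit boolean whose bytes in the data stream are 00 01.
raw = b"\x00\x01"
as_big_endian = parse_binary_boolean(raw, "bigEndian", true_rep=1, false_rep=0)
```

Read as little-endian, the same two bytes give 256, which matches neither
representation, so the comparison cannot be made without knowing the byte
order.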

But back to bit order:

Bit order, unlike byte order, truly does affect both content and framing.
It affects the start position of anything that can start at a location that
is not on a byte boundary. From the perspective of our data grammar, this
really does mean any grammar region at all. Bit order is needed to define
bit position, and bit position is needed to define the start and end of
grammar regions because grammar regions start and end at specific bit
positions.

So the implication is that anything with alignment must take bit order into
consideration: sequence, choice, and group as well as element and
simple type. Furthermore, anything with length must take bit order into
consideration, because bit order dictates how the bit positions are
numbered. Exactly which bits lie between the start bit position and the end
bit position therefore depends on bit order whenever either position is not
on a byte boundary.

Perhaps we need to state exactly this. "A grammar region defines a part of
the data stream starting at a specific bit position and ending at a
specific bit position. Since bit positions within a byte are numbered
relative to the bit order, the specific bits of the data stream
corresponding to any grammar region require consideration of the bit order
any time the start bit position or end bit position of the region is not
on a byte boundary."
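
A minimal sketch of what that statement means in practice, under stated
assumptions: bit positions are 0-based here for simplicity, extract_bits is
a hypothetical helper, and the way the value is assembled (first bit most
significant for MSB first, first bit least significant for LSB first) is an
assumption of this example rather than spec wording.

```python
MSBF = "mostSignificantBitFirst"
LSBF = "leastSignificantBitFirst"


def extract_bits(data: bytes, start_bit: int, length: int, bit_order: str) -> int:
    """Return the unsigned value of `length` bits starting at `start_bit`.

    Assumption: positions are 0-based. Under MSBF, position 0 of a byte is
    the 0x80 bit; under LSBF, position 0 of a byte is the 0x01 bit.
    """
    value = 0
    for i in range(length):
        pos = start_bit + i
        byte = data[pos // 8]
        offset = pos % 8
        if bit_order == MSBF:
            bit = (byte >> (7 - offset)) & 1
            value = (value << 1) | bit      # first bit is most significant
        elif bit_order == LSBF:
            bit = (byte >> offset) & 1
            value |= bit << i               # first bit is least significant
        else:
            raise ValueError(f"unknown bit order: {bit_order}")
    return value


data = bytes([0b10110100])
# The same region (start position 2, length 3) selects different bits.
msbf_value = extract_bits(data, 2, 3, MSBF)   # bits 1,1,0 -> 6
lsbf_value = extract_bits(data, 2, 3, LSBF)   # bits 1,0,1 -> 5
```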

I should mention that this is much easier to implement than it sounds (for
parsing anyway), because it is only when you actually extract data from the
data stream that you must consider the bit order and, if it is LSB first,
rearrange the extracted data so as to access the right bits. Parsing things
like alignment fill only moves position counters, so there is nothing to do
there beyond increasing the bit position. On unparsing, of course,
alignmentFill is equivalent to writing out a binary integer of a specific
length, so on unparsing the bit order does matter for regions that must be
filled.
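
The asymmetry between parsing and unparsing can be sketched as follows. This
is an illustration only: align_up and write_bits are hypothetical helpers,
bit positions are 0-based, and the buffer is assumed to be large enough
already.

```python
MSBF = "mostSignificantBitFirst"
LSBF = "leastSignificantBitFirst"


def align_up(bit_pos: int, alignment_bits: int) -> int:
    """Parsing alignment fill: pure counter arithmetic, no bit order needed."""
    remainder = bit_pos % alignment_bits
    return bit_pos if remainder == 0 else bit_pos + (alignment_bits - remainder)


def write_bits(buf: bytearray, start_bit: int, length: int,
               value: int, bit_order: str) -> None:
    """Unparsing a fill region: which physical bits get written depends on
    the bit order, just as for writing a binary integer of that length."""
    for i in range(length):
        pos = start_bit + i
        offset = pos % 8
        if bit_order == MSBF:
            bit = (value >> (length - 1 - i)) & 1
            mask = 0x80 >> offset
        elif bit_order == LSBF:
            bit = (value >> i) & 1
            mask = 0x01 << offset
        else:
            raise ValueError(f"unknown bit order: {bit_order}")
        if bit:
            buf[pos // 8] |= mask
        else:
            buf[pos // 8] &= ~mask & 0xFF


# Parsing: skipping from bit 13 to the next byte boundary just moves a counter.
next_pos = align_up(13, 8)

# Unparsing: filling 3 bits with ones lands on different physical bits.
msbf_buf = bytearray(1)
lsbf_buf = bytearray(1)
write_bits(msbf_buf, 0, 3, 0b111, MSBF)
write_bits(lsbf_buf, 0, 3, 0b111, LSBF)
```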

So bit order is not really analogous to byte order at all. It's a more
primitive thing and belongs in Section 11.

It is only when things are byte-aligned and a multiple of bytes long that
bit order doesn't matter.
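
One way to check this claim is to enumerate which physical bits a region
covers under each bit order. The sketch below is illustrative only:
physical_bits is a hypothetical helper, positions are 0-based, and each bit
is identified by (byte index, power-of-two weight within the byte).

```python
MSBF = "mostSignificantBitFirst"
LSBF = "leastSignificantBitFirst"


def physical_bits(start_bit: int, length: int, bit_order: str) -> set:
    """Return the set of (byte index, bit weight exponent) pairs in a region."""
    cells = set()
    for pos in range(start_bit, start_bit + length):
        offset = pos % 8
        # MSBF numbers bits from the 2**7 end of the byte, LSBF from 2**0.
        weight = 7 - offset if bit_order == MSBF else offset
        cells.add((pos // 8, weight))
    return cells


# Byte-aligned start and whole-byte length: same bits either way.
aligned_same = physical_bits(8, 16, MSBF) == physical_bits(8, 16, LSBF)

# Unaligned start, or a length that is not a multiple of 8: the bits differ.
unaligned_same = physical_bits(2, 3, MSBF) == physical_bits(2, 3, LSBF)
partial_same = physical_bits(8, 12, MSBF) == physical_bits(8, 12, LSBF)
```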

That happens a lot in real data, but DFDL allows for the
possibility that things are not so aligned and do not have whole-byte
lengths. The only things I know of that must be byte aligned are text in
encodings that have a mandatory byte alignment. I think everything else can
be bit aligned. Hence, bit order applies to prefix length regions,
prefix-prefix length regions, element unused regions, choice unused
regions, alignment fill, skip regions, etc. It applies to everything because
it affects the notion of "position", and a region is just the bits between
its start position and its end position.

Bit order and bit position are the more primitive notions on top of which
the notion of regions in the grammar is built.

This is actually a good thing, because otherwise bit order would require
detailed treatment in practically every section of the DFDL spec. Since bit
order underlies bit position, which underlies regions, we can define how
bit order affects bit position, and then the only additional place bit
order needs discussion is where it affects the values of things - decoding
of text characters, and computation of binary simple type values. To the
extent that descriptions of simple type conversions use first and last or
most-significant and least-significant, there is no change needed because
these notions are relative to bit position.

...mikeb







Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology |
www.tresys.com
Please note: Contributions to the DFDL Workgroup's email discussions are
subject to the OGF Intellectual Property Policy
<http://www.ogf.org/About/abt_policies.php>



On Fri, Aug 1, 2014 at 8:00 AM, Steve Hanson <smh at uk.ibm.com> wrote:

> Mike
>
> Attached is a review copy of the bit order document with Word comments
> added.  I am particularly concerned about the grammar region implications
> that you have added. We have not discussed this before on the calls.
>
>
>
> Regards
>
> Steve Hanson
> Architect, IBM DFDL
> <http://www.ibm.com/developerworks/library/se-dfdl/index.html>
> Co-Chair, OGF DFDL Working Group <http://www.ogf.org/dfdl/>
> IBM SWG, Hursley, UK
> smh at uk.ibm.com
> tel:+44-1962-815848
>
>
>
> From:        Mike Beckerle <mbeckerle.dfdl at gmail.com>
> To:        Steve Hanson/UK/IBM at IBMGB, "dfdl-wg at ogf.org" <dfdl-wg at ogf.org>,
>
> Date:        25/07/2014 14:45
> Subject:        Re: [DFDL-WG] bit order documents updated
> ------------------------------
>
>
>
> I will edit and share a discussion thread about the link16 requirements
> that I have been having with Russ Hawkins of Lockheed-Martin. He is an SME
> on TDL/link16, having built and used processors for that format.
>
> MIL-STD-2045 works today with just the leastSignificantBitFirst feature as
> described. We have some tests. I will publish this DFDL schema and tests.
>
> MIL-STD-2045 doesn't require the byte order mixed-endian behavior.
>
> Mixed endian behavior is something we originally thought link16 required
> that is not illustrated within MIL-STD-2045.
>
> Our understanding of link16 was incorrect however. It turns out link16
> only requires this byte/word swapping as a bulk preprocessing pass over the
> entire body of a message (but not the message header). This is not unlike
> decoding base64 or other encoded data found in a message payload. DFDL
> can't do this today in one pass - an application program that drives a DFDL
> processor is needed for multi-pass formats like this. This is fine for DFDL
> v1.0.
>
> Russ verified that this is the *only* byte/word swapping required for
> link16 - there is no per-element mixed-endian byte order required. Just the
> same least-significant-bit-first bitOrder that MIL-STD-2045 needs. Hence, I
> dropped the mixed-endian byte order from the proposal.
>
> The remaining link16 requirement that couldn't be worked around is the
> 6-bit ascii format.
>
> It is always going to be the case for some formats, especially these
> big complex ones, that we will not really know if we can model them until
> the DFDL schema has been created. However, MIL-STD-2045 plus a minor
> additional 6-bit encoding is our best shot at having credible support for
> a wide array of MIL-STD formats.
>
>
>
>
>
> On Fri, Jul 25, 2014 at 6:02 AM, Steve Hanson <smh at uk.ibm.com> wrote:
> I am very concerned that we do not understand these formats properly. We
> need to engage an SME to verify that these proposals are enough to model
> MIL-STD-2045 and Link-16 (and other related formats). Until that happens
> IBM will not be able to close action 233.
>
>
>
>
> From:        Mike Beckerle <mbeckerle.dfdl at gmail.com>
> To:        "dfdl-wg at ogf.org" <dfdl-wg at ogf.org>,
> Date:        24/07/2014 23:08
> Subject:        Re: [DFDL-WG] bit order documents updated
> Sent by:        dfdl-wg-bounces at ogf.org
>  ------------------------------
>
>
>
> Of course as soon as I post this I get the information needed to resolve
> several things.
>
> The new byte-order enum is not needed, so I removed that material. The bit
> order document is now a fairly clean thing trying to be an errata.
>
> Please review: draft-gwdi-bit-order-features-v3.0.docx
>
> I created a separate working document on "Mixed Endian Byte Order" should
> we ever take this up again.
>
> I also moved the TDML material into the TDML document (now v2).
>
>
>
>
>
> On Thu, Jul 24, 2014 at 5:16 PM, Mike Beckerle <mbeckerle.dfdl at gmail.com> wrote:
>
> There are now several working documents having to do with bit-order. The
> prior document has been taken down and replaced by 3 others.
>
> These are all up on redmine:
> http://redmine.ogf.org/dmsf/dfdl-wg?folder_id=5485
>
> 1) draft-gwdi-bit-order-features-v2.05.docx - Just describes the new
> features + TDML extension. The biggest section is organized like an errata
> specifying where things change in the spec. This should get incorporated
> into errata, or perhaps just referenced from an errata.
>
> Section 3 of this doc is the proposed errata and spec language for review.
>
> There is one major open issue here: it is unclear whether the
> mixed-endian byte order that was previously proposed is actually needed.
> The swapping of 16-bit words appears to be not a per-element thing,
> but something done as a pre-processing pass over an entire message body
> before parsing. This isn't something we can handle in DFDL, much like
> base64-encoded data and so forth. This may be the only place that needed
> the 16-bit word swapping.
>
> 2) Understanding Bit Orderings -
> draft-gwdi-mil-std-2045-understanding-bit-order-v2.05.docx - Material about
> bit order - the Wire model vs. Number model material. This is effectively
> just archiving this material for posterity.
>
> If you already read this, you don't need to read it again.
>
> 3) draft-gwdi-mil-std-2045-additional-features-v2.05.docx - Material about
> additional DFDL features that would be helpful in modeling MIL-STD-2045.
>
> This can also be reviewed. Not as urgent as (1) above.
>
> 4) draft-gwdi-dfdl-standard-encodings-v03.docx - This is material that
> will be integrated back into the spec as an appendix, but this incorporates
> feedback on prior versions, and adds a 6-bit ascii encoding that is used by
> the same binary format standards as the 7-bit one. Works same way, just 6
> bits not 7, so there's some codepoint changes.
>
>
> Unless stated otherwise above:
> IBM United Kingdom Limited - Registered in England and Wales with number
> 741598.
> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU