[DFDL-WG] bit order documents updated

Steve Hanson smh at uk.ibm.com
Fri Aug 1 13:05:12 EDT 2014


I am happy to leave byteOrder where it is in section 11.  It's a 
fundamental concept even if it only applies to binary reps.

But it sounds like bitOrder is a property of all object types, not just 
binary reps. That means it applies to text reps too, and hence is a peer 
of, and not subsumed by, an encoding. Doesn't that mean that bit order 
should not be in the DFDL standard encoding template, as its value is 
going to be supplied by the property?

We have to be very careful that the whole of the spec makes sense in an 
LSBF world, and not just binary reps.

Regards
 
Steve Hanson
Architect, IBM DFDL
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh at uk.ibm.com
tel:+44-1962-815848



From:   Mike Beckerle <mbeckerle.dfdl at gmail.com>
To:     Steve Hanson/UK/IBM at IBMGB, 
Cc:     "dfdl-wg at ogf.org" <dfdl-wg at ogf.org>
Date:   01/08/2014 17:46
Subject:        Re: [DFDL-WG] bit order documents updated




I agree that the most significant issue about bit order is that it isn't 
really analogous at all to byte order. 

Bit order is fundamentally more primitive.

I think I got fooled into thinking bit-order and byte-order were similar, 
because we actually have byte-order incorrectly classified. It is in the 
section of properties about both content and framing, and says it applies 
to sequence, choice, and group.

But if byte order is really about simple content only then it is not 
relevant to sequence/choice/group. That is probably a hold-over from when 
we had byte-order affecting text for some encodings. Now that byte-order 
of text is specified in the encoding only, perhaps byte order is no longer 
relevant to sequence, choice, and group. And if that is so, then, I think 
we have byte order in the wrong section. Byte order is in the section on 
properties about both content and framing, but if it is only about simple 
content it belongs in a subsection of section 13. We don't have a section 
about "all simple types with binary representation", but that's where it 
would logically want to sit since it is shared by binary numbers, binary 
calendars, and binary booleans. (Also note that our enumeration of the 
binary types in the description of the dfdl:byteOrder property omits 
binary booleans. Byte order is needed to compare the 
binaryBooleanTrue/FalseRep to a value coming from the data stream 
whenever the length is more than 8 bits.)
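To make the binary boolean point concrete, here is a minimal Python sketch. The function name parse_binary_boolean, its parameters, and the choice to raise an error when neither rep matches are assumptions made for illustration only, not spec language.

```python
def parse_binary_boolean(raw: bytes, byte_order: str,
                         true_rep: int, false_rep: int) -> bool:
    """Compare a multi-byte binary boolean against its true/false reps.

    Hypothetical helper: byte_order is "bigEndian" or "littleEndian".
    """
    order = {"bigEndian": "big", "littleEndian": "little"}[byte_order]
    # Once the field is longer than 8 bits, the numeric value of the
    # same bytes depends on the byte order.
    value = int.from_bytes(raw, order)
    if value == true_rep:
        return True
    if value == false_rep:
        return False
    # Assumption for this sketch: a value matching neither rep is an error.
    raise ValueError(f"value {value} matches neither rep")
```

With a 16-bit field holding the bytes 00 01 and a true rep of 1, the big-endian reading matches the true rep, while the little-endian reading yields 256 and matches neither.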

But back to bit order:

Bit order, unlike byte order, truly does affect both content and framing. 
It affects the start position of anything that can start at a location 
that is not on a byte boundary. From the perspective of our data grammar, 
this really does mean any grammar region at all. Bit order is needed to 
define bit position, and bit position is needed to define the start and 
end of grammar regions because grammar regions start and end at specific 
bit positions.
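One way to picture how bit order defines bit position is the small Python sketch below. It assumes 1-based bit positions, with position 1 being the most significant bit of the first byte under mostSignificantBitFirst and the least significant bit of the first byte under leastSignificantBitFirst. The helper name locate_bit is hypothetical.

```python
from typing import Tuple


def locate_bit(bit_pos: int, bit_order: str) -> Tuple[int, int]:
    """Map a 1-based bit position to (0-based byte index, mask in that byte)."""
    if bit_pos < 1:
        raise ValueError("bit positions are 1-based in this sketch")
    byte_index, offset = divmod(bit_pos - 1, 8)
    if bit_order == "mostSignificantBitFirst":
        mask = 0x80 >> offset   # numbering runs from the high bit down
    elif bit_order == "leastSignificantBitFirst":
        mask = 0x01 << offset   # numbering runs from the low bit up
    else:
        raise ValueError("unknown bit order: " + bit_order)
    return byte_index, mask
```

The byte index is the same for both bit orders; only the bit within the byte differs, which is why a region boundary that falls inside a byte cannot be located without knowing the bit order.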

The implication is that anything with alignment must take bit order into 
consideration: sequence, choice, and group as well as element and 
simple type. Furthermore, anything with length must take bit order into 
consideration, because bit order dictates how the bit positions are 
numbered. Exactly which bits lie between the start bit position and the 
end bit position therefore depends on bit order whenever either position 
is not on a byte boundary.

Perhaps we need to state exactly this. "A grammar region defines a part of 
the data stream starting at a specific bit position and ending at a 
specific bit position. Since bit positions within a byte are numbered 
relative to the bit order, the specific bits of the data stream 
corresponding to any grammar region require consideration of the bit order 
any time the start bit position or end bit position of the region are not 
on a byte boundary."
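The following Python sketch shows how the bits belonging to a region could be picked out under each bit order. It is only an illustration under stated assumptions: bit positions are 1-based, and under leastSignificantBitFirst the first bit of the region is taken as the least significant bit of the resulting value. The function name extract_unsigned is hypothetical.

```python
def extract_unsigned(data: bytes, start_bit: int, length: int,
                     bit_order: str) -> int:
    """Read `length` bits starting at 1-based `start_bit` as an unsigned int."""
    if bit_order not in ("mostSignificantBitFirst", "leastSignificantBitFirst"):
        raise ValueError("unknown bit order: " + bit_order)
    msbf = bit_order == "mostSignificantBitFirst"
    value = 0
    for i in range(length):
        byte_index, offset = divmod(start_bit - 1 + i, 8)
        if msbf:
            # Positions count down from the high bit; earlier bits are
            # more significant in the result.
            bit = (data[byte_index] >> (7 - offset)) & 1
            value = (value << 1) | bit
        else:
            # Positions count up from the low bit; earlier bits are
            # less significant in the result (assumption of this sketch).
            bit = (data[byte_index] >> offset) & 1
            value |= bit << i
    return value
```

For the single byte 0xB4, a 3-bit region starting at bit position 1 yields 5 with mostSignificantBitFirst and 4 with leastSignificantBitFirst, while the full 8-bit region starting at position 1 yields 0xB4 either way.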

I should mention that this is much easier to implement than it sounds 
(for parsing anyway). It is only when you actually extract data from the 
data stream that you must consider the bit order and, if it is LSB first, 
rearrange the extracted data so as to access the right bits. Parsing 
things like alignment fill just moves position counters, so there is 
nothing to do beyond increasing the bit position. On unparsing, however, 
alignmentFill is equivalent to writing out a binary integer of a specific 
length, so bit order does matter for regions that must be filled.
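As a sketch of the parsing side of this, skipping alignment fill can be written purely as arithmetic on the bit position, with no reference to bit order at all. The function name and the 1-based position convention are assumptions for the example.

```python
def skip_alignment_fill(bit_pos: int, alignment_bits: int) -> int:
    """Return the first 1-based bit position >= bit_pos that is aligned.

    No data is read, so bit order never enters the calculation.
    """
    if bit_pos < 1 or alignment_bits < 1:
        raise ValueError("positions are 1-based and alignment must be positive")
    remainder = (bit_pos - 1) % alignment_bits
    if remainder == 0:
        return bit_pos  # already aligned, nothing to skip
    return bit_pos + (alignment_bits - remainder)
```

An unparser has no equivalent shortcut, since it has to decide which physical bits of the partially filled byte receive the fill.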

So bit order is not really analogous to byte order at all. It's a more 
primitive thing and belongs in Section 11. 

It is only when things are byte-aligned and a multiple of bytes long that 
bit order doesn't matter. 

That happens a lot in real data, but in DFDL everything allows for the 
possibility that things are not so aligned and don't have bytes-long 
lengths. The only things I know of that must be byte aligned are text in 
encodings that have a mandatory byte alignment. I think everything else 
can be bit aligned. Hence, bit order applies to prefix length regions, and 
prefix-prefix length regions, element unused regions, choice unused 
regions, alignment fill, skip regions, etc. It applies to anything because 
it affects the notion of "position" and regions are about the bits between 
the start position of the region and the end position of a region. 

Bit order and bit position are the more primitive notions on top of which 
the notion of regions in the grammar is built. 

This is actually a good thing, because otherwise bit order would require 
detailed treatment in practically every section of the DFDL spec. Since 
bit order underlies bit position, which underlies regions, we can define 
how bit order affects bit position, and then the only additional place bit 
order needs discussion is where it affects the values of things - decoding 
of text characters, and computation of binary simple type values. To the 
extent that descriptions of simple type conversions use first and last or 
most-significant and least-significant, there is no change needed because 
these notions are relative to bit position. 

...mikeb







Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology | 
www.tresys.com
Please note: Contributions to the DFDL Workgroup's email discussions are 
subject to the OGF Intellectual Property Policy



On Fri, Aug 1, 2014 at 8:00 AM, Steve Hanson <smh at uk.ibm.com> wrote:
Mike 

Attached is a review copy of the bit order document with Word comments 
added.  I am particularly concerned about the grammar region implications 
that you have added. We have not discussed this before on the calls. 



Regards
 
Steve Hanson
Architect, IBM DFDL
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh at uk.ibm.com
tel:+44-1962-815848 



From:        Mike Beckerle <mbeckerle.dfdl at gmail.com> 
To:        Steve Hanson/UK/IBM at IBMGB, "dfdl-wg at ogf.org" <dfdl-wg at ogf.org>, 

Date:        25/07/2014 14:45 
Subject:        Re: [DFDL-WG] bit order documents updated 



I will edit and share a discussion thread about the link16 requirements 
that I have been having with Russ Hawkins of Lockheed-Martin, covering 
TDL/link16. He is a SME on this topic, having built and used processors 
for that format. 

MIL-STD-2045 works today with just the leastSignificantBitFirst feature as 
described. We have some tests. I will publish this DFDL schema and tests. 

MIL-STD-2045 doesn't require the byte order mixed-endian behavior. 

Mixed endian behavior is something we originally thought link16 required 
that is not illustrated within MIL-STD-2045. 

Our understanding of link16 was incorrect however. It turns out link16 
only requires this byte/word swapping as a bulk preprocessing pass over 
the entire body of a message (but not the message header). This is not 
unlike decoding base64 or other encoded data found in a message payload. 
DFDL can't do this today in one pass - an application program that drives 
a DFDL processor is needed for multi-pass formats like this. This is fine 
for DFDL v1.0.  
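A driving application could do such a pass along the lines of the Python sketch below. This is illustrative only: the thread does not say exactly what is swapped, so exchanging the two bytes of each 16-bit word is an assumption, as are the function name and the fixed header_len parameter.

```python
def swap_body_words(message: bytes, header_len: int) -> bytes:
    """Swap the two bytes of every 16-bit word in the body, leaving the
    header untouched.

    Hypothetical preprocessing pass; the real link16 swap may differ.
    """
    header, body = message[:header_len], message[header_len:]
    if len(body) % 2:
        raise ValueError("body must be a whole number of 16-bit words")
    swapped = bytearray(len(body))
    swapped[0::2] = body[1::2]  # odd-indexed bytes move to even slots
    swapped[1::2] = body[0::2]  # even-indexed bytes move to odd slots
    return header + bytes(swapped)
```

The output of a pass like this would then be handed to the DFDL processor as an ordinary single-pass parse.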

Russ verified that this is the *only* byte/word swapping required for 
link16 - there is no per-element mixed-endian byte order required. Just 
the same least-significant-bit-first bitOrder that MIL-STD-2045 needs. 
Hence, I dropped the mixed-endian byte order from the proposal.

The remaining link16 requirement that couldn't be worked around is the 
6-bit ascii format. 

It is always going to be the case that for some formats, especially these 
big complex ones, we will not really know whether we can model them until 
the DFDL schema has been created. However, MIL-STD-2045 plus a minor 
additional 6-bit encoding is our best shot at having credible support for 
a wide array of MIL-STD formats. 


Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology | 
www.tresys.com 
Please note: Contributions to the DFDL Workgroup's email discussions are 
subject to the OGF Intellectual Property Policy 



On Fri, Jul 25, 2014 at 6:02 AM, Steve Hanson <smh at uk.ibm.com> wrote: 
I am very concerned that we do not understand these formats properly. We 
need to engage an SME to verify that these proposals are enough to model 
MIL-STD-2045 and Link-16 (and other related formats). Until that happens 
IBM will not be able to close action 233. 

Regards
 
Steve Hanson
Architect, IBM DFDL
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh at uk.ibm.com
tel:+44-1962-815848 



From:        Mike Beckerle <mbeckerle.dfdl at gmail.com> 
To:        "dfdl-wg at ogf.org" <dfdl-wg at ogf.org>, 
Date:        24/07/2014 23:08 
Subject:        Re: [DFDL-WG] bit order documents updated 
Sent by:        dfdl-wg-bounces at ogf.org 



Of course as soon as I post this I get the information needed to resolve 
several things. 

The new byte-order enum is not needed, so I removed that material. The bit 
order document is now a fairly clean thing trying to be an errata.

Please review: draft-gwdi-bit-order-features-v3.0.docx

I created a separate working document on "Mixed Endian Byte Order" should 
we ever take this up again.

I also moved the TDML material into the TDML document (now v2). 


Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology | 
www.tresys.com 
Please note: Contributions to the DFDL Workgroup's email discussions are 
subject to the OGF Intellectual Property Policy 



On Thu, Jul 24, 2014 at 5:16 PM, Mike Beckerle <mbeckerle.dfdl at gmail.com> 
wrote: 

There are now several working documents having to do with bit-order. The 
prior document has been taken down and replaced by 4 others.

These are all up on redmine: 
http://redmine.ogf.org/dmsf/dfdl-wg?folder_id=5485 

1) draft-gwdi-bit-order-features-v2.05.docx - Just describes the new 
features + TDML extension. The biggest section is organized like an errata 
specifying where things change in the spec. This should get incorporated 
into errata, or perhaps just referenced from an errata. 

Section 3 of this doc is the proposed errata and spec language for review. 


There is one major open issue here: it is unclear whether the 
mixed-endian byte order that was previously proposed is actually needed or 
not. The swapping of 16-bit words appears to be not a per-element thing, 
but something done as a pre-processing pass over an entire message body 
before parsing. This isn't something we can handle in DFDL, much like 
base64-encoded data and so forth. This may be the only place that needed 
the 16-bit word swapping. 

2) Understanding Bit Orderings - 
draft-gwdi-mil-std-2045-understanding-bit-order-v2.05.docx - Material 
about bit order - the Wire model vs. Number model material. This is 
effectively just archiving this material for posterity.

If you already read this, you don't need to read it again. 

3) draft-gwdi-mil-std-2045-additional-features-v2.05.docx - Material about 
additional DFDL features that would be helpful in modeling MIL-STD-2045.

This can also be reviewed. Not as urgent as (1) above. 

4) draft-gwdi-dfdl-standard-encodings-v03.docx - This is material that 
will be integrated back into the spec as an appendix. It incorporates 
feedback on prior versions and adds a 6-bit ascii encoding that is used 
by the same binary format standards as the 7-bit one. It works the same 
way, just with 6 bits instead of 7, so there are some codepoint changes.

Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology | 
www.tresys.com 
Please note: Contributions to the DFDL Workgroup's email discussions are 
subject to the OGF Intellectual Property Policy 

--
 dfdl-wg mailing list
 dfdl-wg at ogf.org
 https://www.ogf.org/mailman/listinfo/dfdl-wg 

Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 
741598. 
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU 


