[DFDL-WG] Grammar as adapted to infoset

Steve Hanson smh at uk.ibm.com
Wed Nov 21 06:41:59 CST 2007


Allowing dfdl:lengthKind on choices has other benefits. It means that 
elements, sequences and choices are consistent in the ways they can be 
extracted (literal length, XPath, model determined, prefixed, null 
terminated). 

I don't think unresolvableWhenParsing implies fixed length storage. I 
could have an unresolvable choice between two variable length strings, 
where the data is ended by the terminator of the choice. 
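
As a rough sketch of that case, the fragment below shows an unresolvable 
choice whose two arms are variable length strings, with the terminator 
carried by the choice itself. The dfdl:choiceType attribute is only the 
proposal from the quoted note below; the namespace URI, the element names, 
the ";" terminator and the "delimited" length kind are all assumptions 
made for illustration, not agreed property names or values.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/">
  <!-- Namespace URI is assumed; substitute the one used by the working draft. -->
  <xs:element name="message">
    <xs:complexType>
      <!-- Hypothetical: the choice cannot be resolved while parsing,
           yet its extent is found from the terminator, not a fixed length. -->
      <xs:choice dfdl:choiceType="unresolvableWhenParsing" dfdl:terminator=";">
        <!-- Both arms are variable length strings (names are illustrative). -->
        <xs:element name="nameText" type="xs:string" dfdl:lengthKind="delimited"/>
        <xs:element name="codeText" type="xs:string" dfdl:lengthKind="delimited"/>
      </xs:choice>
    </xs:complexType>
  </xs:element>
</xs:schema>
```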

Regards, Steve

Steve Hanson
WebSphere Message Brokers
Hursley, UK
Internet: smh at uk.ibm.com
Phone (+44)/(0) 1962-815848



Mike Beckerle/Worcester/IBM at IBMUS
20/11/2007 21:23

To
Steve Hanson/UK/IBM at IBMGB
cc
dfdl-wg at ogf.org
Subject
Re: [DFDL-WG] Grammar as adapted to infoset





Rather than overloading "implicit" length to mean this, I think we need an 
enum-valued property for choices. 

Today there is this choiceUnresolvableWhenParsing boolean. I think this 
should become choiceType="unresolvableWhenParsing".

Then we can have an enum value for the case where all arms of the choice 
are assumed to be padded to the length of the longest alternative. 

So I think we have 3 enum values:

variableLength
fixedLength
unresolvableWhenParsing

variableLength means the storage taken depends on the alternative 
selected. fixedLength means shorter alternatives are padded to the length 
of the longest. unresolvableWhenParsing means what it means now, which is 
that you can't tell which variant it is (which also implies the storage 
required is fixed length).
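
To make the three values concrete, a choice annotated with the proposed 
property might look roughly like the fragment below. This is a sketch 
only: dfdl:choiceType is the proposal in this note rather than an agreed 
property, and the namespace URI, element names and the 258/200 byte 
lengths are assumptions chosen for illustration.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/">
  <!-- Namespace URI is assumed; substitute the one used by the working draft. -->
  <xs:element name="record">
    <xs:complexType>
      <!-- Proposed (hypothetical) enum values:
           variableLength | fixedLength | unresolvableWhenParsing -->
      <xs:choice dfdl:choiceType="fixedLength">
        <!-- Longest alternative: 258 bytes. -->
        <xs:element name="longForm" type="xs:hexBinary"
                    dfdl:lengthKind="explicit" dfdl:length="258"
                    dfdl:lengthUnits="bytes"/>
        <!-- Shorter alternative: 200 bytes, so under fixedLength it would
             be followed by 58 bytes of padding. -->
        <xs:element name="shortForm" type="xs:hexBinary"
                    dfdl:lengthKind="explicit" dfdl:length="200"
                    dfdl:lengthUnits="bytes"/>
      </xs:choice>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

With choiceType="variableLength" the same two arms would occupy 258 or 200 
bytes depending on which one is selected.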

...mikeb

Mike Beckerle
STSM, Architect, Scalable Computing
IBM Software Group
Information Platform and Solutions
Westborough, MA 01581
direct: voice and FAX 508-599-7148
assistant: Pam Riordan 
                  priordan at us.ibm.com 
                  508-599-7046





Steve Hanson/UK/IBM at IBMGB
11/19/2007 07:13 AM

To
Mike Beckerle/Worcester/IBM at IBMUS
cc
dfdl-wg at ogf.org
Subject
Re: [DFDL-WG] Grammar as adapted to infoset





Or allow xs:choice to carry dfdl:length & dfdl:lengthKind.  Then 
lengthKind="implicit" could mean the length is the longest of the modelled 
alternatives.
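
A minimal sketch of that idea follows. Putting dfdl:lengthKind on 
xs:choice is the suggestion being floated here, not an agreed feature, and 
the namespace URI, element names and byte lengths are assumptions for 
illustration only.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/">
  <!-- Namespace URI is assumed; substitute the one used by the working draft. -->
  <xs:element name="record">
    <xs:complexType>
      <!-- Hypothetical: "implicit" on a choice would mean the choice occupies
           the length of its longest modelled alternative (258 bytes here). -->
      <xs:choice dfdl:lengthKind="implicit">
        <xs:element name="longForm" type="xs:hexBinary"
                    dfdl:lengthKind="explicit" dfdl:length="258"
                    dfdl:lengthUnits="bytes"/>
        <xs:element name="shortForm" type="xs:hexBinary"
                    dfdl:lengthKind="explicit" dfdl:length="200"
                    dfdl:lengthUnits="bytes"/>
      </xs:choice>
    </xs:complexType>
  </xs:element>
</xs:schema>
```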

Regards, Steve




Mike Beckerle/Worcester/IBM at IBMUS
16/11/2007 19:29

To
Steve Hanson/UK/IBM at IBMGB
cc
dfdl-wg at ogf.org
Subject
Re: [DFDL-WG] Grammar as adapted to infoset





There are of course alternatives. 

E.g., a hidden or non-hidden field, typed as a byte array or hexBinary, to 
absorb the padding bytes at the ends of the shorter variants. Or one could 
put trailing skip bytes on the last element or group of each of the 
shorter variants. 
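
The filler-field alternative could be sketched roughly as below, assuming 
a longest variant of 258 bytes and a shorter one of 200 bytes. The 
namespace URI, element names and lengths are illustrative assumptions, and 
no property for hiding the filler is shown because none is named here.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/">
  <!-- Namespace URI is assumed; substitute the one used by the working draft. -->
  <xs:element name="record">
    <xs:complexType>
      <xs:choice>
        <!-- Longest variant: fills all 258 bytes, so it needs no filler. -->
        <xs:element name="longForm" type="xs:hexBinary"
                    dfdl:lengthKind="explicit" dfdl:length="258"
                    dfdl:lengthUnits="bytes"/>
        <!-- Shorter variant: 200 data bytes plus a 58 byte filler,
             so that 200 + 58 = 258 matches the longest variant. -->
        <xs:sequence>
          <xs:element name="shortForm" type="xs:hexBinary"
                      dfdl:lengthKind="explicit" dfdl:length="200"
                      dfdl:lengthUnits="bytes"/>
          <xs:element name="filler" type="xs:hexBinary"
                      dfdl:lengthKind="explicit" dfdl:length="58"
                      dfdl:lengthUnits="bytes"/>
        </xs:sequence>
      </xs:choice>
    </xs:complexType>
  </xs:element>
</xs:schema>
```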

One can also encapsulate the whole choice in a sequence whose length is 
the fixed maximum length. This is my personal favorite, since it uses the 
FinalUnused that is already in the grammar.

E.g., suppose the length of the longest variant is 258 bytes.

<sequence dfdl:lengthKind="explicit" dfdl:length="258"
          dfdl:lengthUnits="bytes" dfdl:applies="hereOnly">
    <choice>
        .....
    </choice>
</sequence>

The only thing I don't like about this is I don't have a way to express 
the length other than to hard-code the constant 258, but since tooling 
would typically put this in based on Cobol descriptors or other 
information I don't mind so much. 

...mikeb







Steve Hanson/UK/IBM at IBMGB
11/16/2007 11:24 AM

To
Mike Beckerle/Worcester/IBM
cc
dfdl-wg at ogf.org
Subject
Re: [DFDL-WG] Grammar as adapted to infoset





Hi Mike

One extra thought post-call, caused by thinking about the FinalUnused part 
of a sequence. Some fixed length choices have a rule that each branch of 
the choice must be of equal length. This results in unused bytes for 
branches shorter than the maximum.  In a bitstream instance, a user will 
see these bytes. Perhaps the grammar should acknowledge their existence, 
even if there are no DFDL properties to express the bytes (like 
FinalUnused)?

Regards, Steve




Mike Beckerle <beckerle at us.ibm.com> 
Sent by: dfdl-wg-bounces at ogf.org
10/11/2007 00:16

To
dfdl-wg at ogf.org
cc

Subject
[DFDL-WG] Grammar as adapted to infoset







I took a crack at this. I ripped it out of the main document for faster 
review/turnaround. 

I think this matches the latest infoset email diagram. 








-------------- next part --------------
A non-text attachment was scrubbed...
Name: dfdl-grammar.doc
Type: application/octet-stream
Size: 307712 bytes
Desc: not available
Url : http://www.ogf.org/pipermail/dfdl-wg/attachments/20071121/cd31bef0/attachment-0001.obj 

