[DFDL-WG] Floating elements and unordered groups

DFDL mbeckerle.dfdl at gmail.com
Mon Jan 11 18:16:24 CST 2010


The worry about allowing absolutely anything and letting speculative
parsing sort it out is that it will be easy for users to hang
themselves; that is, to get very complex and inscrutable behaviors. Some
redundancy is valuable. Requiring users to specify elements as
children of unordered sequences, to make the diagnostic messages
simpler, also dodges the whole issue of whether a choice can be the
child of an unordered sequence.

Ultimately, you can model anything along these lines as an array of  
choice, so the question is what simpler cases do we make easier to  
express.
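
To make the "array of choice" point concrete, here is a rough sketch in
Python rather than DFDL. The function name, the ";" separator and the
"A:"/"B:"/"C:" initiators are all made up for illustration; nothing here
comes from the spec.

```python
def parse_array_of_choice(data, branches, separator=";"):
    """Parse each separated item by trying the choice branches in order.

    `branches` is a list of (element name, initiator) pairs. Both the
    separator and the initiators are hypothetical, not from the DFDL spec.
    """
    items = []
    for chunk in data.split(separator):
        for name, initiator in branches:
            if chunk.startswith(initiator):
                # First matching branch wins; keep the text after the initiator.
                items.append((name, chunk[len(initiator):]))
                break
        else:
            raise ValueError(f"no choice branch matches {chunk!r}")
    return items


branches = [("a", "A:"), ("b", "B:"), ("c", "C:")]
print(parse_array_of_choice("B:2;A:1;C:3", branches))
```

Note that the array of choice accepts the items in any order, but it also
accepts repeats and omissions, so any occurrence constraints would have to
be checked separately.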

For example, why don't we rule out the cases where there are N floating
children out of M total children for N > 1? This just forces you to
cross the line over to an unordered sequence if you have more than one
kind of thing floating around.
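
As a sanity check on how the number of floating children affects the
permitted orderings, here is a small Python sketch. It assumes the
informal reading used in this thread: a floating child may appear
anywhere, while the non-floating children keep their declared relative
order. The function name and the child names are hypothetical.

```python
from itertools import permutations


def allowed_orders(children, floating):
    """List the orderings in which non-floating children keep declared order.

    Assumes (informally) that a floating child may appear at any position.
    """
    fixed = [c for c in children if c not in floating]
    return [
        perm
        for perm in permutations(children)
        if [c for c in perm if c not in floating] == fixed
    ]


children = ["a", "b", "c"]
one_floating = allowed_orders(children, {"b"})
two_floating = allowed_orders(children, {"a", "b"})
print(len(one_floating))  # 3: "a" must still come before "c"
print(len(two_floating))  # 6: every permutation, same as an unordered group
```

Under that assumption, one floating child out of three leaves the other
two anchored relative to each other, while two out of three (the n-1
case) already permits every ordering.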

This whole issue really stems from fear of complexity. When
something common and simple is expressed, I would like a simple DFDL
implementation to be able to give users clear diagnostics when data is
broken.

The point of unordered initiated was that, if that is in fact a common
case, the diagnostics would be very simple. If it's not a common enough
case then of course we should drop it. But the fact that speculative
parsing has to deal with this as a special case of even more complex
things isn't, to me, a good argument for dropping it.
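
To illustrate the kind of diagnostics I mean, here is a rough Python
sketch of parsing an unordered group in which every member is initiated.
Each item is dispatched on its initiator, so there is no backtracking and
each failure can be reported precisely. The function name, the ";"
separator and the initiators are invented for the example, and it assumes
each element occurs exactly once.

```python
def parse_unordered_initiated(data, initiators, separator=";"):
    """Parse a group whose members each start with a unique initiator.

    `initiators` maps element name to initiator text (hypothetical values).
    Assumes every element must occur exactly once.
    """
    found = {}
    for position, chunk in enumerate(data.split(separator)):
        # Dispatch on the initiator; no speculation is involved.
        name = next(
            (n for n, init in initiators.items() if chunk.startswith(init)),
            None,
        )
        if name is None:
            raise ValueError(f"item {position}: unrecognized initiator in {chunk!r}")
        if name in found:
            raise ValueError(f"item {position}: element {name!r} appears twice")
        found[name] = chunk[len(initiators[name]):]
    missing = [n for n in initiators if n not in found]
    if missing:
        raise ValueError(f"missing elements: {missing}")
    return found


print(parse_unordered_initiated("B:2;A:1", {"a": "A:", "b": "B:"}))
```

With speculation, the same broken data could instead surface as a
failure of the last alternative tried, which is much harder for a user
to interpret.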

The perspective I am trying to convey is this: speculative parsing is  
a very big hammer. Many keywords could be dropped from dfdl if we lean  
on speculation everywhere. That doesn't mean we should do so. To me  
most data formats shouldn't require any speculation and it should be  
easy to work in the subset of dfdl where speculation plays no role.

...mikeb


On Jan 11, 2010, at 6:37 PM, Stephanie Fetzer <sfetzer at us.ibm.com>  
wrote:

>
> All:
>
> I remember a bit of a conversation on this topic from a while back -
> one of the things we were addressing was multi-level unorderedness:
> an unordered group (or an ordered group with floating components)
> within an unordered group (or an ordered group with floating
> components). This is allowed in WTX and we wanted to make sure that
> it was specifically allowed in DFDL. We do not restrict unordered
> groups at all in this respect.
>
> In V34 the DFDL spec contained the concept of "unordered" versus
> "unorderedInitiated", but that was removed (with v35) and the
> 'children must be xs:element' phrase was added.
>
> The other goal of the wording in the spec was to make sure that an
> unordered group was parsed/serialized the exact same way as an
> ordered group with ALL unordered components.
> If we have a group with n-1 floating components then that is really  
> the functional equivalent of having n floating components.  The one  
> 'static' component will not anchor anything and all will still be  
> unordered. If we had n-2 floating components then we would in fact  
> have two components that would need to be in the same order relative  
> to each other.
>
> So from that perspective - the wording in 16.5 looks correct to me:
> An ordered sequence of n element children with either n or n-1 of  
> those children with dfdl:floating="true" is equivalent to an  
> unordered sequence with the same n element children with  
> dfdl:floating="false".
>
> Is the question then - why the wording in Section 16?... "The
> children of an unordered sequence must be xs:element."
> If so, I did not read that as any type of limitation on the contents
> of the element. I can have an element which contains a group of
> sequences containing groups (or have I misunderstood what this is
> trying to convey?). Perhaps that phrase isn't really saying
> anything useful at this point and should be removed. I don't
> believe that we want to go back to the unorderedInitiated concept
> (where we had a different set of rules for unordered groups if the
> content was all initiated).
>
> - My take is that we should consider removing or further explaining
>   the "The children of an unordered sequence must be xs:element."
>   wording.
> - The other note, "An ordered sequence of n element children
>   with either n or n-1 of those children with dfdl:floating="true" is
>   equivalent to an unordered sequence with the same n element children
>   with dfdl:floating="false".", looks fine as is to me as far as I can
>   see.
> - I'd prefer we not reopen the unorderedInitiated concept again.
>
> Cheers,
>
>
> Stephanie Fetzer
> WebSphere Common Transformation
> Industry Packs - Software Engineer
>
>
>
> From:	Tim Kimber <KIMBERT at uk.ibm.com>
> To:	dfdl-wg at ogf.org
> Date:	01/11/2010 12:52 PM
> Subject: 	[DFDL-WG] Floating elements and unordered groups
> Sent by:	dfdl-wg-bounces at ogf.org
>
>
>
>
>
> Hi all,
>
> I know this area of the specification was only recently resolved,  
> and I think there may be an inconsistency in the v0.37 wording.
>
> Section 16, re: sequenceKind says: "The children of an unordered  
> sequence must be xs:element."
> Section 16.5 Floating Elements says: "An ordered sequence of n  
> element children with either n or n-1 of those children with  
> dfdl:floating="true" is equivalent to an unordered sequence with the  
> same n element children with dfdl:floating="false". A complex  
> element with dfdl:floating="true" can have as its content model a  
> sequence with elements that also have dfdl:floating="true"."
>
> Now suppose that, instead of N element children, there are N-1  
> floating element children + one non-floating group. This group will  
> be equivalent to an unordered group with a non-element member.
> If the specification was intending to make life easy for
> implementers, then it should probably disallow groups in any
> non-ordered context, including when sequenceKind='ordered' and there
> is at least one floating component. But I think that would be too
> restrictive. I would be happy for the restriction to be lifted
> entirely. Given that unordered groups can have
> dfdl:initiated="false", it will sometimes be necessary to find the
> correct member by trial and error (speculative parsing) anyway. I
> don't think it's any more difficult to speculatively parse a group
> than to speculatively parse a complex element.
>
> If I've missed something, and it turns out that the restriction is
> useful, then we should
> a) tighten up the wording to say that if a group with N members has
> N or N-1 floating members, then it must be validated as if it were an
> unordered group.
> b) consider lifting the restriction in cases where
> dfdl:initiated="true" (because it makes things so much easier for
> the DFDL processor)
>
> regards,
>
> Tim Kimber, Common Transformation Team,
> Hursley, UK
> Internet:  kimbert at uk.ibm.com
> Tel. 01962-816742
> Internal tel. 246742
>
> --
>  dfdl-wg mailing list
>  dfdl-wg at ogf.org
>  http://www.ogf.org/mailman/listinfo/dfdl-wg