[DFDL-WG] How to choose the correct choice branch when serializing

Tim Kimber KIMBERT at uk.ibm.com
Fri Apr 13 05:53:42 EDT 2012


There is an interesting edge case which arises when the serializer 
encounters a choice group.
Suppose a DFDL schema (xsd) is structured as follows, shown here in 
simplified form:

<root>
    <choice>
        <sequence>
            <firstname/>
            <lastname/>
            <postcode/>
        </sequence>
        <sequence>
            <lastname/>
            <telephoneNumber/>
        </sequence>
    </choice>
</root>

Note that both branches of the choice are sequences, not elements. 

The infoset is:

<root>
    <lastname/>
    <telephoneNumber/>
</root>

The likely action of the serializer is:
- pick the first branch of the choice (because it contains lastname)
- output the default value of firstname (assuming that firstname has 
minOccurs = 1)
- output lastname
- issue a processing error because postcode is not present in the infoset 
(assuming that postcode has minOccurs = 1).

...but the user might have intended this: 
- reject the first branch of the choice because a required element 
(postcode) is missing from the infoset
- select the second branch of the choice and successfully process the 
entire infoset
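
The first behaviour can be sketched in a few lines of Python. This is only 
an illustration, not how any DFDL processor is implemented: the branch 
model, the function names, and the assumption that firstname has a default 
value while postcode does not are all hypothetical.

```python
# Simplified model of the schema above: each choice branch is an ordered
# list of (element name, minOccurs). Real DFDL properties are left out.
BRANCHES = [
    [("firstname", 1), ("lastname", 1), ("postcode", 1)],
    [("lastname", 1), ("telephoneNumber", 1)],
]

# Assumption for this sketch: only firstname has a default value.
DEFAULTS = {"firstname"}


class ProcessingError(Exception):
    pass


def pick_first_matching_branch(branches, infoset):
    """Return the index of the first branch that contains the first infoset element."""
    first = infoset[0]
    for index, branch in enumerate(branches):
        if any(name == first for name, _ in branch):
            return index
    raise ProcessingError(f"no branch contains {first!r}")


def serialize_branch(branch, infoset, defaults):
    """Emit the branch's elements in order, with no backtracking.

    Fails on the first required element that is absent from the infoset
    and has no default value.
    """
    out = []
    for name, min_occurs in branch:
        if name in infoset:
            out.append(name)
        elif name in defaults:
            out.append(f"{name}(default)")
        elif min_occurs > 0:
            raise ProcessingError(f"required element {name!r} is not in the infoset")
    return out


infoset = ["lastname", "telephoneNumber"]
chosen = pick_first_matching_branch(BRANCHES, infoset)  # first branch, index 0
try:
    serialize_branch(BRANCHES[chosen], infoset, DEFAULTS)
except ProcessingError as err:
    print("processing error:", err)
```

With this model the first branch is chosen because it contains lastname, 
and serialization then fails on postcode, even though the second branch 
would have accepted the whole infoset.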

The DFDL specification does not state what the behaviour should be. I 
think the options are:
a) state explicitly that the serializer will choose the first branch that 
contains a matching element, regardless of minOccurs
b) invent a new rule that causes the serializer to back out of a branch 
and try another branch if there is a minOccurs error while processing the 
branch
c) disallow sequences and choices as immediate children of a choice group
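
To make option b) concrete, here is a minimal Python sketch of a 
serializer that backs out of a branch and tries the next one. It is an 
assumption-laden illustration, not a proposal for spec wording: the branch 
model, the names try_branch and serialize_choice, and the choice of which 
elements have defaults are all hypothetical.

```python
# Simplified model of the example schema: each choice branch is an ordered
# list of (element name, minOccurs).
BRANCHES = [
    [("firstname", 1), ("lastname", 1), ("postcode", 1)],
    [("lastname", 1), ("telephoneNumber", 1)],
]

# Assumption for this sketch: only firstname has a default value.
DEFAULTS = {"firstname"}


def try_branch(branch, infoset):
    """Serialize one branch speculatively.

    Returns the list of emitted elements, or None if a required element
    is missing or an infoset element is left unconsumed.
    """
    out = []
    remaining = list(infoset)
    for name, min_occurs in branch:
        if remaining and remaining[0] == name:
            out.append(remaining.pop(0))
        elif name in DEFAULTS:
            out.append(name + "(default)")
        elif min_occurs > 0:
            return None  # minOccurs error: back out of this branch
    return out if not remaining else None


def serialize_choice(branches, infoset):
    """Try each branch in schema order and keep the first one that succeeds."""
    for index, branch in enumerate(branches):
        out = try_branch(branch, infoset)
        if out is not None:
            return index, out
    raise ValueError("no branch of the choice can serialize the infoset")


index, emitted = serialize_choice(BRANCHES, ["lastname", "telephoneNumber"])
print(index, emitted)
```

Even in this toy form the serializer has to buffer the output of a branch 
until the branch is known to succeed, which hints at the implementation 
cost of backtracking in a real processor.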

Currently I'm leaning toward a) by process of elimination, for the 
following reasons:
b) would make this scenario work, but I think it would impose a lot of 
work on implementers because it would require the serializer to do 
backtracking.
c) would simplify a lot of things, but I think it's too restrictive - I 
can imagine complex data formats where it might be useful to have a choice 
as the direct child of a choice because the discrimination rules might be 
easier to express in a two-level structure.

regards,

Tim Kimber, Common Transformation Team,
Hursley, UK
Internet:  kimbert at uk.ibm.com
Tel. 01962-816742 
Internal tel. 246742










