[DFDL-WG] New Issue / Discussion topic - Clarification about Literal Nil Values Greedy Parsing

Mike Beckerle mbeckerle.dfdl at gmail.com
Mon Jan 16 18:07:56 EST 2012


This arose out of the discussion of greedy parse behavior for delimiters,
but it is a distinct issue.

I wrote the following language as a proposed clarification of greedy
parsing for literal nil values.

*When nilKind='literalValue', the list of values provided by the nilValue
property is processed in a greedy manner: the processor takes all the
strings from the whitespace-separated list that is the value of the
nilValue property and sorts them longest first, with strings of equal
length remaining in the same order as written in the schema. (This is
sometimes called a stable sort - ties aren't re-ordered.)

The processor then attempts to match them against the data stream one by
one, in that sorted order. Once a match is found, the literal nil
representation is said to be found, and no other possibilities for the
literal nil representation of that element will subsequently be attempted.
(That is, the DFDL processor will not try any of the other literal nil
representations in hopes of obtaining a successful parse, should the
selected one lead to a processing error.)

Having matched the literal nil representation, the element may still fail
to parse as a nil value if the matched literal nilValue is not followed by
the appropriate termination as specified by the terminator property and the
nilValueDelimiterPolicy property.
*
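
To make the intended behavior concrete, here is a minimal sketch in Python
of the proposed text above. It is only an illustration under simplifying
assumptions, not a description of any DFDL processor: the function names
(order_nil_values, match_literal_nil) are hypothetical, nilValue entries
are treated as plain strings (no DFDL character entities), the terminator
is assumed to be a single literal string that must directly follow the nil
literal, and nilValueDelimiterPolicy is not modelled.

```python
from typing import List, Optional, Tuple


def order_nil_values(nil_value_property: str) -> List[str]:
    """Split the whitespace-separated nilValue list and sort longest first.

    Python's sort is stable, so entries of equal length keep the order in
    which they were written in the schema.
    """
    return sorted(nil_value_property.split(), key=lambda s: -len(s))


def match_literal_nil(
    data: str, pos: int, nil_value_property: str, terminator: str
) -> Optional[Tuple[str, int]]:
    """Try to parse a literal nil at data[pos:].

    Returns (matched literal, position after the terminator) on success,
    or None if the element does not parse as nil.
    """
    for candidate in order_nil_values(nil_value_property):
        if data.startswith(candidate, pos):
            end = pos + len(candidate)
            if data.startswith(terminator, end):
                return candidate, end + len(terminator)
            # The first (longest) match is committed to. No other nil
            # literal is tried, even if a shorter one would have parsed.
            return None
    return None


print(order_nil_values("NA nil N/A NULL -"))
print(match_literal_nil("NULL;rest", 0, "NUL NULL", ";"))
print(match_literal_nil("--", 0, "- --", "-"))
```

The last call shows the consequence of not retrying: with nil values "-"
and "--" and a terminator of "-", the data "--" matches the longer literal
"--" first, no terminator follows, and the element fails to parse as nil,
even though "-" followed by the terminator "-" would have succeeded.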

I'm not sure this is clear enough, so I offer it as a topic for discussion.