[DFDL-WG] Nested Delimited Constructs - For discussion Wed 3/28
Mike Beckerle
beckerle at us.ibm.com
Tue Mar 27 19:15:59 CDT 2007
Quick edits to improve correctness.
-------------------------------------------------------------------------------------------
Nested Delimited Constructs
There are two kinds of terminating delimiters:
postfix separators (TBD: currently only on arrays)
terminators
Both of these can be optional. (TBD: can postfix separators be optional? I
believe there is no finalSeparatorCanBeMissing though there is a
terminatorCanBeMissing property)
This means the parser can encounter a separating or terminating delimiter
that belongs to an enclosing array or group, and that delimiter also
indicates the termination of a nested array or group.
When parsing encounters one of these delimiters, that is, one that is
defined on an enclosing construct, it indicates that it found a delimiter
of some sort but does not consume that delimiter when computing the new
pos result. That is, the parse function succeeds and returns a new pos,
value, etc., but the pos reflects the position where the delimiter begins.
The invariant that this ensures is that a parser for any construct
consumes its own delimiters and only its own delimiters.
For example, the parse function for a sequence group which has separators
specified will recursively parse the elements it contains; however, if one
of those elements' representation is terminated by finding the enclosing
sequence group's separator, then that separator will not be consumed, and
when the recursive parse unwinds back to the parse function of the
enclosing sequence group, the separator will then be consumed by the
sequence group's parse function which is prepared to recognize it and
advance past it.
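As a rough illustration of the sequence group example above, here is a
minimal Python sketch. Everything in it is an assumption made for
illustration: the function names (parse_element, parse_items, parse_row,
parse_rows), the "," and ";" separators, and the use of plain strings with
infix separators are not taken from any DFDL draft or implementation.

```python
from typing import Callable, List, Tuple

def parse_element(data: str, pos: int, in_scope: List[str]) -> Tuple[str, int]:
    """Scan a text element. Stop at, but do not consume, any in-scope delimiter."""
    end = pos
    while end < len(data) and not any(data.startswith(d, end) for d in in_scope):
        end += 1
    # The returned pos is where the delimiter begins (or the end of the data).
    return data[pos:end], end

def parse_items(data: str, pos: int, separator: str, enclosing: List[str],
                parse_child: Callable) -> Tuple[list, int]:
    """Parse children separated by `separator` (hypothetical infix separator)."""
    in_scope = [separator] + enclosing  # children must know every delimiter in scope
    values = []
    while True:
        value, pos = parse_child(data, pos, in_scope)
        values.append(value)
        if data.startswith(separator, pos):
            pos += len(separator)  # our own delimiter: consume it
        else:
            # End of data or a delimiter of an enclosing construct: leave it alone.
            return values, pos

def parse_row(data: str, pos: int, enclosing: List[str]) -> Tuple[list, int]:
    """Inner sequence group, separated by ','."""
    return parse_items(data, pos, ",", enclosing, parse_element)

def parse_rows(data: str, pos: int = 0) -> Tuple[list, int]:
    """Outer sequence group, separated by ';'."""
    return parse_items(data, pos, ";", [], parse_row)
```

With the data "a,b;c,d", the inner parse of the first row stops at
position 3, where the ";" begins, without consuming it. The outer parse
then recognizes the ";" as its own separator and advances past it.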
This principle works regardless of how deeply nested the constructs are.
Parsing must, however, take into account the complete set of delimiters
that it might encounter, along with the escape/quoting schemes that can be
specified for them which allow them to appear as content rather than as
delimiters.
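To make the last point concrete, the following Python sketch scans content
against the complete set of in-scope delimiters while honoring an escape
scheme. It assumes the simplest possible scheme, a single escape character
that makes the next character literal; the function name scan_content and
the backslash default are hypothetical and are not taken from the DFDL
property set.

```python
from typing import List, Optional, Tuple

def scan_content(data: str, pos: int, delimiters: List[str],
                 escape: str = "\\") -> Tuple[str, int, Optional[str]]:
    """Return (content, pos, delimiter_found).

    `delimiters` is the complete set in scope: the construct's own plus
    those of every enclosing construct. The delimiter is not consumed;
    pos is left where it begins.
    """
    out = []
    while pos < len(data):
        if data.startswith(escape, pos) and pos + len(escape) < len(data):
            # Escaped character: take it as content, even if it looks like a delimiter.
            out.append(data[pos + len(escape)])
            pos += len(escape) + 1
            continue
        # Try longer delimiters first so that "||" is preferred over "|".
        for d in sorted(delimiters, key=len, reverse=True):
            if data.startswith(d, pos):
                return "".join(out), pos, d
        out.append(data[pos])
        pos += 1
    return "".join(out), pos, None  # ran out of data without finding a delimiter
```

For example, scanning the text a\,b,c;d with delimiters "," and ";" yields
the content "a,b" and stops at position 4, where the unescaped "," begins.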
Mike Beckerle/Worcester/IBM at IBMUS
Sent by: dfdl-wg-bounces at ogf.org
03/27/2007 08:01 PM
To: dfdl-wg at ogf.org
Subject: [DFDL-WG] Nested Delimited Constructs - For discussion Wed 3/28
I believe the language below, while needing examples perhaps to clarify,
solves the problem of how to deal with nested delimited constructs where
outer delimiters are terminating inner nested constructs.
I'd like to see if people understand it on our DFDL call on Wednesday.
...mikeb
Nested Delimited Constructs
There are two kinds of terminating delimiters:
postfix separators (TBD: currently only on arrays)
terminators
Both of these can be optional. (TBD: can postfix separators be optional? I
believe there is no finalSeparatorCanBeMissing though there is a
terminatorCanBeMissing property)
This means the parser can encounter a terminating delimiter from an
enclosing array or group to indicate the termination of a nested array or
group.
The behavior of parsing when it encounters one of these terminating
delimiters, that is, one that is defined on an enclosing construct, is to
indicate that it found a terminating delimiter of some sort, but to not
consume that terminating delimiter when computing the new pos result. That
is, the parse function succeeds and returns a new pos, value, etc., but
the pos reflects the position where the terminating delimiter begins.
The invariant that this ensures is that a parser for any construct
consumes its own delimiters and only its own delimiters.
For example, the parse function for a sequence group which has separators
specified will recursively parse the elements it contains; however, if one
of those elements' representation is terminated by finding the enclosing
sequence group's separator, then that separator will not be consumed, and
when the recursive parse unwinds back to the parse function of the
enclosing sequence group, the separator will then be consumed by the
sequence group's parse function which is prepared to recognize it and
advance past it.
This principle works regardless of how deeply nested the constructs are.
Parsing must, however, take into account the complete set of terminating
delimiters that it might encounter, along with the escape/quoting schemes
that can be specified for them which allow them to appear as content
rather than as delimiters.