[DFDL-WG] can DFDL model this? (initiators, but no separators or terminators, plus optional elements)

Mike Beckerle mbeckerle.dfdl at gmail.com
Tue Mar 5 15:06:15 EST 2013


This is what lengthKind='pattern' is for: it lets you use a regex with
non-capturing lookahead, so an element's content can end just before the
next initiator without consuming it.
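
To illustrate the lookahead idea outside of DFDL, here is a minimal Python
sketch. It is not a DFDL schema; in a schema the equivalent regex would go
in dfdl:lengthPattern on each element. The initiator list comes from the
sample data in the question, while the parse helper and the STOP pattern
are hypothetical names made up for this illustration. It also assumes an
initiator word never appears inside a value.

```python
import re

# Initiators from the sample data, in the order the elements appear.
INITIATORS = ["FirstName", "LastName", "Hometown", "Company"]

# Lookahead (consumes nothing): the value ends just before whitespace plus
# any initiator, or at the end of the line.
ALTERNATIVES = "|".join(map(re.escape, INITIATORS))
STOP = r"(?=\s+(?:" + ALTERNATIVES + r")\s|\s*$)"


def parse(line):
    """Hypothetical helper: return {initiator: value} for the elements present."""
    fields = {}
    pos = 0
    for name in INITIATORS:
        # Initiator, whitespace, then the shortest value that satisfies STOP.
        pattern = re.compile(r"\s*" + re.escape(name) + r"\s+(.*?)" + STOP)
        m = pattern.match(line, pos)
        if m is None:
            continue  # element absent; this sketch treats every element as optional
        fields[name] = m.group(1)
        pos = m.end()  # the lookahead left the next initiator unconsumed
    return fields
```

Because the lookahead does not consume the next initiator, a missing
Hometown simply fails to match and Company is picked up normally, which is
the case that broke the terminator-based attempt.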

On Tue, Mar 5, 2013 at 2:52 PM, Garriss Jr., James P. <jgarriss at mitre.org> wrote:

> Suppose I have this input data:
>
>   FirstName James LastName Garriss Hometown Raleigh Company The MITRE Corporation CRLF
>
>
> To the human eye, this is simple.  We have four elements, each of which
> has an initiator.  But to make things more interesting:
>
> 1. The elements are all strings, and they do not have fixed
>    lengths, set values, or any other terminator.  The only way you know
>    them apart is by the initiator.  (And this implies that the initiators
>    cannot be part of the elements.)
>
> 2. There are no separators (spaces can be in the data).
>
> 3. The third and fourth elements are optional.
>
>
> So these are both valid data:
>
>   FirstName John Mark LastName Smith
>
>   FirstName Bob LastName Brown Company IBM
>
> How do we model this?
>
>
> Attempt #1:
>
> I have four elements each with a unique initiator (FirstName, LastName,
> Hometown, Company).  The problem is that there’s no way to know when the
> first element terminates, so everything after the “FirstName” initiator
> ends up in the FirstName element.  Oops.
>
> Attempt #2:
>
>
> I got funky with the terminators.  The first element has LastName as a
> terminator.  The second element has Hometown or Company as a terminator.
> The third element has Company or %NL; as a terminator.  And the fourth one
> uses %NL;.  Works great, unless the optional third element isn’t there.
> IOW, if I have this input:
>
>   FirstName Bob LastName Brown Company IBM
>
> Then “IBM” winds up in the Hometown element.  Oops.
>
>
> So, what to do?  I don’t know.  I don’t know how to solve this.  Hopefully
> you’re going to teach me about some feature I don’t yet know.
>
> If not, then I have a potential solution, an addition to the spec.  Add
> this option as a terminator:  “This element terminates when you find the
> initiator to the next element.”  That’s probably easier said than done, but
> it seems to make sense in this context.
>



-- 
Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology |
www.tresys.com