[DFDL-WG] unordered sequence with constrained occurrences

Garriss Jr., James P. jgarriss at mitre.org
Tue Mar 5 14:10:39 EST 2013


> The error message is because you don't make forward progress through the data with potentially unbounded occurrences.

I think you just said, “MBTK prevents an infinite loop.”  That makes sense.

>  If there are delimiters then model those and you might not get the error.

I think you just said, “To let MBTK know when it should stop checking, you need a terminator of some sort.”  That also makes sense.  So I added a terminator (%NL;) here:

[Inline image: image001.png]

Good news:  That fixed the problem, so long as my input is “abc”.

Bad news:  This breaks if the input is any other legal value, such as “abbc” or “cba” or “b”.

The problem for all of these is that my dear friend, checkConstraints, is not implemented yet, thus I can’t prevent the parser from slurping up the wrong character.  I don’t know how anyone can build a non-trivial DFDL schema that involves any sort of choice without this method; I swear, it must be the single most important thing you guys have created for DFDL.

Until checkConstraints is implemented, I’m not really able to test this schema with MBTK.

Thanks so much for your help answering my questions, Steve!


From: Steve Hanson [mailto:smh at uk.ibm.com]
Sent: Tuesday, March 05, 2013 1:46 PM
To: Garriss Jr., James P.
Cc: dfdl-wg at ogf.org; dfdl-wg-bounces at ogf.org
Subject: Re: [DFDL-WG] unordered sequence with constrained occurrences

James,

The error message is because you don't make forward progress through the data with potentially unbounded occurrences. Is this because you are using a cut-down schema?  If there are delimiters then model those and you might not get the error.

Once you have processed the array you can use asserts to check the count. However IBM DFDL does not implement the count functions yet.

Give me a couple of days to look at this more closely. I have a customer visit tomorrow hence the delay.

Regards

Steve Hanson
Architect, Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group <http://www.ogf.org/dfdl/>
IBM SWG, Hursley, UK
smh at uk.ibm.com
tel:+44-1962-815848



From:        "Garriss Jr., James P." <jgarriss at mitre.org>
To:        "dfdl-wg at ogf.org" <dfdl-wg at ogf.org>
Date:        05/03/2013 16:19
Subject:        Re: [DFDL-WG] unordered sequence with constrained occurrences
Sent by:        dfdl-wg-bounces at ogf.org
________________________________



Hmmm, maybe not.  I said:

> The unordered sequence can be modeled with a data array

Yet when implemented in MBTK, it throws a fatal error:

fatal: CTDP3148E: Infinite loop at offset 3: The DFDL parser cannot process array element 'ABCarray' because maxOccurs is unbounded and the length of the previous occurrence was zero.

I think what happens is that on the last pass through the array, it doesn’t find a, b, or c, so it throws a fatal error.
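The failure mode in that error message can be reproduced with a toy model. The sketch below is not MBTK's actual algorithm; it only assumes that each array occurrence is an optional 'a', then an optional 'b', then an optional 'c' (as in the schema in this message), and that the parser keeps trying further occurrences because maxOccurs is unbounded. The names parse_occurrence, parse_array and NoForwardProgress are made up for illustration.

```python
from typing import NoReturn


class NoForwardProgress(Exception):
    """Raised when an array occurrence consumes zero characters."""

    def __init__(self, offset: int, occurrences: int) -> None:
        super().__init__(f"zero-length occurrence at offset {offset}")
        self.offset = offset            # where the empty occurrence was found
        self.occurrences = occurrences  # non-empty occurrences parsed before it


def parse_occurrence(data: str, pos: int) -> int:
    """Parse one occurrence: optional 'a', then optional 'b', then optional 'c'.

    Returns the offset after the occurrence, which equals pos if nothing matched.
    """
    for expected in "abc":
        if pos < len(data) and data[pos] == expected:
            pos += 1
    return pos


def parse_array(data: str) -> NoReturn:
    """Parse occurrences with no upper bound.

    With nothing to mark the end of the array, this toy always ends by
    hitting an occurrence that matches nothing, so it always raises.
    """
    pos = 0
    occurrences = 0
    while True:
        next_pos = parse_occurrence(data, pos)
        if next_pos == pos:
            # Unbounded array plus a zero-length occurrence: the state can
            # never change again, so stop instead of looping forever.
            raise NoForwardProgress(pos, occurrences)
        pos = next_pos
        occurrences += 1
```

For the input "abc" the first occurrence consumes all three characters and the second one matches nothing, so the toy stops at offset 3, the same offset reported in the error above.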

So is this a bug in MBTK?  Or can DFDL not model an unordered sequence?  Or am I just doing it wrong?

Here’s a sample DFDL schema that illustrates the point:

<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/"
      xmlns:fmt="http://www.ibm.com/dfdl/GeneralPurposeFormat"
      xmlns:ibmSchExtn="http://www.ibm.com/schema/extensions" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
      <xsd:import namespace="http://www.ibm.com/dfdl/GeneralPurposeFormat"
            schemaLocation="IBMdefined/GeneralPurposeFormat.xsd" />
      <xsd:element ibmSchExtn:docRoot="true" name="ABC">
            <xsd:complexType>
                  <xsd:sequence dfdl:separator="">
                        <xsd:annotation>
                              <xsd:appinfo source="http://www.ogf.org/dfdl/">
                                    <dfdl:sequence />
                              </xsd:appinfo>
                        </xsd:annotation>
                        <xsd:element dfdl:occursCountKind="implicit" maxOccurs="unbounded"
                              minOccurs="1" name="ABCarray">
                              <xsd:complexType>
                                    <xsd:sequence dfdl:separator="">
                                          <xsd:element dfdl:length="1" dfdl:lengthKind="explicit"
                                                dfdl:occursCountKind="implicit" fixed="a" minOccurs="0" name="a"
                                                type="xsd:string" />
                                          <xsd:element dfdl:length="1" dfdl:lengthKind="explicit"
                                                dfdl:occursCountKind="implicit" fixed="b" minOccurs="0" name="b"
                                                type="xsd:string" />
                                          <xsd:element dfdl:length="1" dfdl:lengthKind="explicit"
                                                dfdl:occursCountKind="implicit" fixed="c" minOccurs="0" name="c"
                                                type="xsd:string" />
                                    </xsd:sequence>
                              </xsd:complexType>
                        </xsd:element>
                  </xsd:sequence>
            </xsd:complexType>
      </xsd:element>
      <xsd:annotation>
            <xsd:appinfo source="http://www.ogf.org/dfdl/">
                  <dfdl:format ref="fmt:GeneralPurposeFormat" />
            </xsd:appinfo>
      </xsd:annotation>
</xsd:schema>

Test with “abc” as sample input.

From: Garriss Jr., James P.
Sent: Tuesday, March 05, 2013 8:43 AM
To: dfdl-wg at ogf.org
Subject: unordered sequence with constrained occurrences

Suppose text data has 3 constructs:  a, b, and c.

•       a must occur 1 time
•       b can occur 0 or 1 time
•       c can occur any number of times, 0 or more

These 3 constructs can appear in any order.

So these are valid inputs:

abc
a
bcccca

But these are not:

ccbcc
abbc
abcabc

Can data like this be modeled with DFDL?
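To pin down the rules outside of DFDL, here is a minimal reference check in Python. It is only a sketch of the constraints listed above (exactly one a, at most one b, any number of c, in any order), not a DFDL solution; the function name is_valid is made up for illustration.

```python
from collections import Counter


def is_valid(text: str) -> bool:
    """Return True if text satisfies the a/b/c occurrence rules."""
    counts = Counter(text)
    # Reject any character other than a, b, or c.
    if set(counts) - set("abc"):
        return False
    # a exactly once, b at most once, c unconstrained; order is irrelevant.
    return counts["a"] == 1 and counts["b"] <= 1
```

Because order does not matter, counting characters is enough here; the hard part in DFDL is expressing the same counts as constraints on the parsed array.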

The unordered sequence can be modeled with a data array, like this:

Array (0 to unbounded)
  Sequence
    a (0 to 1)
    b (0 to 1)
    c (0 to 1)
  /Sequence
/Array

But I don’t know how to constrain the total number of occurrences.

Appreciate any ideas!
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image001.png
Type: image/png
Size: 13398 bytes
Desc: image001.png
URL: <http://www.ogf.org/pipermail/dfdl-wg/attachments/20130305/f9a31350/attachment-0001.png>

