[DFDL-WG] Action 276: Erratum 2.100 (was Action 183 - chicken-and-egg situation with lengths given by expressions)

Mike Beckerle mbeckerle.dfdl at gmail.com
Thu Jan 29 16:32:47 EST 2015


In general I support the notion here that we evaluate the expressions when
unparsing.

I added a couple small wording changes and suggestions. Search for my
comment bubbles.

Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology |
www.tresys.com
Please note: Contributions to the DFDL Workgroup's email discussions are
subject to the OGF Intellectual Property Policy
<http://www.ogf.org/About/abt_policies.php>


On Mon, Jan 19, 2015 at 1:06 PM, Steve Hanson <smh at uk.ibm.com> wrote:

> Please have a position on the below proposal from IBM for this week's WG
> call.
>
> Regards
>
> Steve Hanson
> Architect, *IBM DFDL*
> <http://www.ibm.com/developerworks/library/se-dfdl/index.html>
> Co-Chair, *OGF DFDL Working Group* <http://www.ogf.org/dfdl/>
> IBM SWG, Hursley, UK
> *smh at uk.ibm.com* <smh at uk.ibm.com>
> tel:+44-1962-815848
>
>
>
> From:        Steve Hanson/UK/IBM
> To:        DFDL-WG <dfdl-wg at ogf.org>
> Date:        12/01/2015 18:14
> Subject:        Fw: [DFDL-WG] Erratum 2.100 (was Action 183 -
> chicken-and-egg situation with lengths given by expressions)
> ------------------------------
>
>
> After internal discussion, IBM proposes the following:
>
>    - No new dfdl:lengthKind enums.
>    - dfdl:lengthKind 'explicit' is always considered to be 'fixed length'.
>       - When dfdl:length is an expression, it is always evaluated when
>       unparsing.
>    - If the intent is for the infoset data of the element to provide the
>    length, the integer element that is the target of the expression should use
>    dfdl:outputValueCalc with function dfdl:valueLength() referring forward to
>    the infoset element. (Use of dfdl:contentLength() is a schema definition
>    error as it can cause deadlock.)
>    - It should be noted that dfdl:valueLength() returns an unpadded
>    length. This does *not* take into account the XSD minLength or
>    dfdl:textOutputMinLength properties.
>    - The net is that, when unparsing simple text elements with
>    dfdl:textPadKind 'padChar', lengthKinds 'delimited', 'pattern' and
>    'prefixed' are considered variable length and are padded to a minimum
>    length, whereas 'implicit', 'explicit' and 'endOfParent' have a target
>    length and are padded to that length.
>    - Similarly for xs:hexBinary elements: 'delimited', 'pattern' and
>    'prefixed' are always padded to a minimum length, and 'implicit',
>    'explicit' and 'endOfParent' are always padded to the target length.
>    - So for a text element with dfdl:textPadKind 'padChar', or an
>    xs:hexBinary element, whose unpadded length is less than the minimum,
>    'explicit' (expression) is not identical in behaviour to 'prefixed'.
>    - If the last point is thought to be a problem, options are to allow
>    dfdl:valueLength() to return minimum length or actual length depending on a
>    new argument, or create a new function to return minimum length. This can
>    be done under existing deferred action 242.
>
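
A minimal sketch of the pattern in the proposal above, as I read it. It assumes the usual xsd and dfdl namespace prefixes, text representation, a dfdl:format in scope that supplies all other properties, and the two-argument form of dfdl:valueLength(). The element names 'record', 'len' and 'data' are illustrative only and do not come from the spec draft.

```xml
<!-- Sketch only: 'record', 'len' and 'data' are hypothetical names. -->
<xsd:element name="record">
  <xsd:complexType>
    <xsd:sequence>
      <!-- On unparse, 'len' is computed from the unpadded value of 'data'. -->
      <xsd:element name="len" type="xsd:int"
                   dfdl:lengthKind="explicit" dfdl:length="2"
                   dfdl:outputValueCalc="{ dfdl:valueLength(../data, 'characters') }" />
      <!-- The length expression is evaluated on unparse as well as on parse,
           so 'len' is the target length that 'data' is written to. -->
      <xsd:element name="data" type="xsd:string"
                   dfdl:lengthKind="explicit" dfdl:length="{ ../len }"
                   dfdl:lengthUnits="characters"
                   dfdl:textPadKind="padChar"
                   dfdl:textStringPadCharacter="%SP;"
                   dfdl:textStringJustification="left" />
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>
```

With this arrangement, an infoset value of 'abc' for 'data' would give 'len' the value 3, and 'data' would be written as exactly 3 characters, so no padding is applied. If 'data' also carried an XSD minLength of 5, the value would presumably still be written as 3 characters rather than 5, which is where 'explicit' with an expression differs from 'prefixed'.
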
> To verify that these changes fit into the spec ok, I have updated it -
> change tracking on and comment for each change.
>
>
>
> Regards
>
> Steve Hanson
> ----- Forwarded by Steve Hanson/UK/IBM on 12/01/2015 10:20 -----
>
> From:        Steve Hanson/UK/IBM
> To:        Tim Kimber/UK/IBM at IBMGB
> Cc:        "Mike Beckerle" <mbeckerle.dfdl at gmail.com>, DFDL-WG <
> dfdl-wg at ogf.org>
> Date:        06/01/2015 15:54
> Subject:        Re: [DFDL-WG] Erratum 2.100 (was Action 183 -
> chicken-and-egg situation with lengths given by expressions)
> ------------------------------
>
>
> A thought.
>
> We could add dfdl:lengthKind 'expression', use dfdl:length for the
> expression, and add a new enum property that is the policy for the
> expression - that is, evaluate on parse only, or evaluate on both parse and
> unparse.  The enum for the latter is the equivalent of the pre-erratum
> 2.100 'explicit' and expression behaviour as implemented by IBM DFDL today
> - which we would then deprecate in the spec (or even remove) and deprecate
> in IBM DFDL.
>
> Regards
>
> Steve Hanson
>
> From:        Steve Hanson/UK/IBM
> To:        Tim Kimber/UK/IBM at IBMGB, "Mike Beckerle" <
> mbeckerle.dfdl at gmail.com>
> Cc:        DFDL-WG <dfdl-wg at ogf.org>
> Date:        10/12/2014 09:08
> Subject:        Re: [DFDL-WG] Erratum 2.100 (was Action 183 -
> chicken-and-egg situation with lengths given by expressions)
> ------------------------------
>
>
> Tim & Mike - replies in-line.
>
> Consensus is heading towards 'explicit' working as per IBM DFDL today, so
> we need to decide on the best way to express the 2.100 behaviour.
>
> Regards
>
> Steve Hanson
>
> From:        Tim Kimber/UK/IBM at IBMGB
> To:        dfdl-wg at ogf.org
> Date:        09/12/2014 23:25
> Subject:        Re: [DFDL-WG] Erratum 2.100 (was Action 183 -
> chicken-and-egg situation with lengths given by expressions)
> Sent by:        dfdl-wg-bounces at ogf.org
> ------------------------------
>
>
>
> My 2 cents:
>
> lengthKind 'explicit' should continue to work as it currently does in the
> IBM implementation. That is, it should use the static or calculated value
> of the length, and pad to that length when unparsing. Otherwise we have
> different rules for lengthKind='explicit' depending on whether it is an
> expression or a static value.
>
> If somebody wants to avoid padding, then they can put an outputValueCalc
> on the length field and calculate a value that requires no padding. I think
> the required expression would be
> {dfdl:valueLength(../variableLengthField)}
>
> With these rules there is a potential for a mutual dependency ( deadlock )
> when outputValueCalc is used with a calculated length. If the
> outputValueCalc is
> {dfdl:contentLength(../variableLengthField)}
> then the entire representation length ( including the padding characters )
> is being requested. This obviously cannot be satisfied until the padding
> has been applied, but the padding cannot be applied until the length is
> known...etc. I don't think this should affect the decision about lengthKind
> though, for two reasons:
> - it's already possible to create deadlocks using other calculated fields
> in DFDL. Any robust implementation of a serializer must include deadlock
> detection in the evaluation of DFDL expressions.
> - it would not be difficult to detect the simple case statically and
> report a schema definition error.
>
> SMH: Corrected function names to match spec.
>
> The only downside of this is that users who want to calculate the output
> length for themselves have to jump through the non-obvious hoop of adding
> outputValueCalc to the length field and selecting the correct
> (non-deadlocking) DFDL function.
>
> But for fields that have representation='text' and lengthUnits='characters'
> there is a simpler alternative: set the length expression to
> {fn:length(.)}. Other cases will be derivable from the infoset value by a
> simple calculation, so the outputValueCalc solution should only be required
> for the hard cases.
>
> SMH: Your fn:length(.) does not work unfortunately. The expression does
> not give the desired length when parsing.
>
> regards,
>
> Tim Kimber
>
>
>
>
> From:        Mike Beckerle <mbeckerle.dfdl at gmail.com>
> To:        Steve Hanson/UK/IBM at IBMGB
> Cc:        DFDL-WG <dfdl-wg at ogf.org>
> Date:        09/12/2014 18:06
> Subject:        Re: [DFDL-WG] Erratum 2.100 (was Action 183 -
> chicken-and-egg situation with lengths given by expressions)
> Sent by:        dfdl-wg-bounces at ogf.org
>  ------------------------------
>
>
>
> Idea 1:
>
> Just provide an infoset member [contentLength] which would allow one to
> get and (when unparsing) set the content length. This would provide the
> "sticky" content length memory that people are seeking when they say they
> want data to round-trip. If the same infoset object is parsed and then
> unparsed, the existence of this [contentLength] member provides the memory
> of the content length from the parse. An outputValueCalc could meaningfully
> ask for dfdl:contentLength() which would return the value of this member.
> If unset, then when unparsing the value of this member can be defined to be
> the dfdl:valueLength() of the infoset value.
>
> SMH: Not keen on this. It makes the DFDL infoset deviate too much from the
> XML infoset.
>
> Idea 2:
> Per our discussion on the call, I think we are trying to get two behaviors
> into the same length kind, and we can't have it both ways.
>
> So I am fine if we say lengthKind 'explicit' evaluates the expression both
> parsing and unparsing.
>
> I would suggest new lengthKind 'explicitParseOnly' evaluates the
> expression only when parsing. When this is used, an outputValueCalc would
> be needed to compute the length value and unparse the representation of it.
> That calculation would likely need to refer to the dfdl:valueLength() of
> the element whose length needs to be stored.
>
> SMH: Not keen on the name. Let's keep thinking!
>
> Idea 3:
> An alternative would be to introduce property dfdl:unparseContentLength
> which is an integer constant, or an expression, or one of three special
> enum values. When dfdl:unparseContentLength='useLength', it would be
> the same value as the length (same constant, or computed by evaluating the
> same expression). dfdl:unparseContentLength='useValueLength' would
> mean that the length is equivalent to writing dfdl:unparseContentLength='{
> dfdl:valueLength(., LU) }', where LU is the length units of the element
> that is the first argument. Nothing would automatically save out the
> explicitly computed length, so an outputValueCalc would need to be able to
> get at the dfdl:valueLength or dfdl:contentLength (which would return the
> value of dfdl:unparseContentLength).
>
> A thought: perhaps we really only need the enum values, and not the
> ability to specify constants or expressions? In that case I'd suggest the
> property have the "policy" suffix on the name, i.e.,
> dfdl:unparseContentLengthPolicy.
>
> Other possible name choices for this property:
> dfdl:explicitLengthKindUnparsePolicy or
> dfdl:lengthKindExplicitUnparsePolicy.
>
> I guess we get our choice of whether we add new length kinds along with
> 'explicit' and we define lengthKind 'explicit' to be one of these
> behaviors, or we add this enum policy property to add additional meaning
> when the lengthKind is 'explicit'.
>
> The difference between the two is a matter of taste and style, I think.
>
> SMH: I prefer a new enum for lengthKind. That's the point of the property.
> It's much easier to say in the spec '...when dfdl:lengthKind is 'prefixed'
> or 'expression'...' than '...when dfdl:lengthKind is 'prefixed', or
> 'explicit' and dfdl:length is an expression and dfdl:unparseContentLength
> is 'xxx' ...'.
>
>
>
> On Mon, Dec 8, 2014 at 11:53 AM, Steve Hanson <*smh at uk.ibm.com*
> <smh at uk.ibm.com>> wrote:
> This is the email thread that discussed the behaviour of lengthKind
> 'explicit' where length is an expression, when unparsing.
>
> Erratum 2.100 had been raised and stated that this is an example of a
> variable length, and behaves like lengthKind 'prefixed'. The discussion
> below was around whether this captured all the use cases and whether the
> dfdl:length expression should be evaluated during unparsing in order to obtain a
> specified length to which the value could be padded (or truncated) in the
> same way as a fixed length.  The conclusion was that the expression should
> not be evaluated.  This is consistent with occursCount - the expression is
> not evaluated on unparsing.
>
> (There was a public comment on this area but it was just observing that
> the spec had not been fully updated to reflect 2.100)
>
> The use cases are numbered in the discussion below 1) to 4), with examples.
> Updated to reflect function name changes.
>
> The original proposal is also below, which was not adopted.
>
> However, an issue is that IBM DFDL already implements use case 1) by
> evaluating the expression on unparsing. I discovered this when I came to
> change the code to ensure 2.100 behaviour was followed. It is therefore
> likely that there are users who are relying on this behaviour in IBM DFDL.
>
> Coincidentally the same day I was asked by a user about the following use
> case.  User has BCD data with length prefix. He wants to parse the data, do
> some processing then unparse the data, but it is not acceptable for the
> length of the data to change. Specifically, the BCD may start with 0s. When
> this is parsed as a decimal or integer, it has the effect of losing leading
> 0s so they need to be added back during unparsing.  This is not possible
> with lengthKind 'prefixed', nor is it possible with lengthKind 'explicit'
> and a length expression under erratum 2.100. User is currently treating the element as
> a 'prefixed' hexBinary blob to preserve the leading 0s. While this works
> for his use case, I am worried that the blob solution won't always be
> appropriate.
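
One way the user's case might look if the dfdl:length expression is evaluated on unparse is sketched below. The element names and the one-byte binary length field are assumptions made for illustration; the user's actual schema is not shown in this thread. The usual xsd and dfdl prefixes and a dfdl:format in scope for all other properties are also assumed.

```xml
<!-- Sketch only: 'bcdLen' and 'bcdValue' are hypothetical names. -->
<xsd:sequence>
  <!-- Length of the BCD field in bytes, carried in the infoset from the parse. -->
  <xsd:element name="bcdLen" type="xsd:unsignedByte"
               dfdl:representation="binary" dfdl:binaryNumberRep="binary"
               dfdl:lengthKind="explicit" dfdl:length="1"
               dfdl:lengthUnits="bytes" />
  <!-- BCD number whose length is taken from 'bcdLen' on parse and,
       under the behaviour discussed here, on unparse as well. -->
  <xsd:element name="bcdValue" type="xsd:nonNegativeInteger"
               dfdl:representation="binary" dfdl:binaryNumberRep="bcd"
               dfdl:lengthKind="explicit" dfdl:length="{ ../bcdLen }"
               dfdl:lengthUnits="bytes" />
</xsd:sequence>
```

If 'bcdLen' keeps its parsed value of, say, 4 and the application changes 'bcdValue' to a number with fewer digits, the unparser would be expected to fill the 4 bytes with leading zero digits, so the length of the data does not change.
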
>
> Regards
>
> Steve Hanson
> ----- Forwarded by Steve Hanson/UK/IBM on 08/12/2014 15:53 -----
>
> From:        Steve Hanson/UK/IBM
> To:        *dfdl-wg at ogf.org* <dfdl-wg at ogf.org>
> Date:        08/01/2013 18:06
> Subject:        Fw: [DFDL-WG] Action 183 - chicken-and-egg situation with
> lengths given        by expressions
>  ------------------------------
>
>
> Did not adopt the proposal below, and errata 2.100 remains as documented.
>
> Regards
>
> Steve Hanson
> ----- Forwarded by Steve Hanson/UK/IBM on 08/01/2013 18:05 -----
>
> From:        Steve Hanson/UK/IBM
> To:        *dfdl-wg at ogf.org* <dfdl-wg at ogf.org>,
> Date:        08/01/2013 14:44
> Subject:        Re: Fw: [DFDL-WG] Action 183 - chicken-and-egg situation
> with lengths given        by expressions
>  ------------------------------
>
>
> If we had the following we would have three behaviours which I think would
> cover the identified use cases.
>
> a) dfdl:lengthKind = "explicit", dfdl:length is an integer => length is
> specified on parsing & unparsing
>
> b) dfdl:lengthKind = "explicit", dfdl:length is an expression => length is
> specified on parsing & unparsing (expression must evaluate)
> - use cases 1) & 2) below
> - Mike's scenario where length is specified externally
>
> c) dfdl:lengthKind = "expression" (*new*), dfdl:length must be an
> expression => length is specified on parsing, but variable on unparsing
> (expression not evaluated)
> - new enum
> - use case 3) below
> - gives the erratum 2.100 behaviour
> - matches occursCountKind 'expression' behaviour
>
> Alternatively we leave errata 2.100 as it is and for Mike's scenario the
> application must do the padding before it passes the infoset values to the
> unparser.
>
> Regards
>
> Steve Hanson
>
> From:        Steve Hanson/UK/IBM
> To:        *dfdl-wg at ogf.org* <dfdl-wg at ogf.org>,
> Date:        11/12/2012 10:36
> Subject:        Fw: [DFDL-WG] Action 183 - chicken-and-egg situation with
> lengths given        by expressions
>  ------------------------------
>
>
> The email thread which provided the material for the discussion which led
> to:
>
> 2.100. Section 12.3.1. State that when unparsing an element with
> lengthKind ‘explicit’ and where length is an expression, then the data in
> the Infoset is treated as variable length and not fixed length. The
> behaviour is the same as lengthKind ‘prefixed’.
>
> Regards
>
> Steve Hanson
> ----- Forwarded by Steve Hanson/UK/IBM on 16/11/2012 16:04 -----
>
> From:        Tim Kimber/UK/IBM at IBMGB
> To:        *dfdl-wg at ogf.org* <dfdl-wg at ogf.org>,
> Date:        11/09/2012 12:39
> Subject:        Re: [DFDL-WG] Action 183 - chicken-and-egg situation with
> lengths given        by expressions
> Sent by:        *dfdl-wg-bounces at ogf.org* <dfdl-wg-bounces at ogf.org>
>  ------------------------------
>
>
>
> Good point. The problem is that lengthKind 'explicit' is being used for
> two things:
> a) a length that is static
> b) a length that is calculated
> ...so the DFDL serializer must assume that the expression needs to be
> evaluated.
>
> For occursCountKind we have separate values for 'fixed' and 'expression'.
> If we did not, then occursCountKind would have the same problem except that
> it would affect defaulting rather than padding.
>
> regards,
>
> Tim Kimber, DFDL Team,
> Hursley, UK
> Internet:  *kimbert at uk.ibm.com* <kimbert at uk.ibm.com>
> Tel. 01962-816742
> Internal tel. 37246742
>
>
>
>
> From:        Steve Hanson/UK/IBM at IBMGB
> To:        *dfdl-wg at ogf.org* <dfdl-wg at ogf.org>,
> Date:        11/09/2012 12:27
> Subject:        [DFDL-WG] Action 183 - chicken-and-egg situation with
> lengths given        by expressions
> Sent by:        *dfdl-wg-bounces at ogf.org* <dfdl-wg-bounces at ogf.org>
>  ------------------------------
>
>
>
> This mail is on the expected behaviour of the DFDL unparser when writing
> out a 'data' element the length of which is held in an earlier 'len'
> element.
>
> There are several scenarios, some straightforward and some that exhibit a
> chicken-and-egg behaviour.  The principle of what happens is understood,
> the action is to make sure that the behaviour is explained in enough detail
> in the spec to enable implementations to be consistent. (Note - IBM DFDL
> does not yet support outputValueCalc so has not hit this yet).
>
> Scenarios follow. The 'data' element shown is simple, but the same
> principles apply if it is complex.
>
> 1) 'len' is set from infoset
>
> - 'len' can be set in augmented infoset
> - No issue as 'data's length expression may be evaluated
>
> <xsd:element name="message1">
>  <xsd:complexType>
>    <xsd:sequence>
>      <xsd:element name="len" type="xsd:int"
>                   dfdl:lengthKind="explicit" dfdl:length="2" />
>      <xsd:element name="data" type="xsd:string"
>                   dfdl:length="{/message1/len}" dfdl:lengthKind="explicit" />
>    </xsd:sequence>
>  </xsd:complexType>
> </xsd:element>
>
> 2) 'len' is set using outputValueCalc with fixed expression
>
> - When 'len's outputValueCalc is encountered, it can be evaluated then and
> there
> - 'len' can be set in augmented infoset
> - No issue as 'data's length expression may be evaluated
>
> <xsd:element name="message1">
>  <xsd:complexType>
>    <xsd:sequence>
>      <xsd:element name="len" type="xsd:int"
>                   dfdl:outputValueCalc="{10}" dfdl:lengthKind="explicit"
>                   dfdl:length="2" />
>      <xsd:element name="data" type="xsd:string"
>                   dfdl:length="{/message1/len}" dfdl:lengthKind="explicit" />
>    </xsd:sequence>
>  </xsd:complexType>
> </xsd:element>
>
> 3) 'len' is set using outputValueCalc with reference to 'data' (unpadded)
>
> - When 'len's outputValueCalc is encountered, it can not yet be evaluated
> as it depends on the length of 'data'
> - 'len' can not yet be set in augmented infoset
> - Problem as 'data's length expression can not be evaluated
> - But we do know the unpadded length of 'data' so 'len's outputValueCalc
> can now be evaluated
> - In turn this means that  'data's length expression can now be evaluated
>
> <xsd:element name="message1">
>  <xsd:complexType>
>    <xsd:sequence>
>      <xsd:element name="len" type="xsd:int"
>                   dfdl:outputValueCalc="{dfdl:valueLength(/message1/data)}"
>                   dfdl:lengthKind="explicit" dfdl:length="2" />
>      <xsd:element name="data" type="xsd:string"
>                   dfdl:length="{/message1/len}" dfdl:lengthKind="explicit" />
>    </xsd:sequence>
>  </xsd:complexType>
> </xsd:element>
>
> 4) 'len' is set using outputValueCalc with reference to 'data' (padded)
>
> - When 'len's outputValueCalc is encountered, it can not yet be evaluated
> as it depends on the length of 'data'
> - 'len' can not yet be set in augmented infoset
> - Problem as 'data's length expression can not be evaluated
> - We don't know the padded length of 'data' because we don't know 'len'
> - *Problem*: 'data's length expression can never be evaluated
>
> <xsd:element name="message1">
>  <xsd:complexType>
>    <xsd:sequence>
>      <xsd:element name="len" type="xsd:int"
>                   dfdl:outputValueCalc="{dfdl:contentLength(/message1/data)}"
>                   dfdl:lengthKind="explicit" dfdl:length="2" />
>      <xsd:element name="data" type="xsd:string"
>                   dfdl:length="{/message1/len}" dfdl:lengthKind="explicit" />
>    </xsd:sequence>
>  </xsd:complexType>
> </xsd:element>
>
>
>
> Regards
>
> Steve Hanson
>
> From:        Steve Hanson/UK/IBM
> To:        *dfdl-wg at ogf.org* <dfdl-wg at ogf.org>
> Date:        04/09/2012 17:31
> Subject:        Fw: Behaviour for lengthKind 'endOfparent' is still not
> fully specified
>  ------------------------------
>
>
> DFDL WG call 4th Sept 2012:
>
> 1) Agreed that for binary data, only xs:hexBinary and packed/BCD allowed
> to have endOfParent
>
> 2) Agreed this is the correct behaviour when filling to a known length
>
> 3) Agreed this is the correct behaviour when filling to a known length
>
> 4) Agreed this is the correct behaviour when filling to a known length
>
> It was noted that lengthKind 'explicit' on the parent may not result in a
> known length if the length is an expression. This is an example of a more
> general chicken-and-egg situation with lengths given by expressions, for
> which outputValueCalc and the DFDL function unpaddedLength() were added and
> can be used. Action raised to ensure that the behaviour of an implementation is
> fully defined by the spec.
>
> Regards
>
> Steve Hanson
> ----- Forwarded by Steve Hanson/UK/IBM on 04/09/2012 17:24 -----
>
> From:        Steve Hanson/UK/IBM
> To:        dfdl-wg at ogf.org
> Date:        04/09/2012 14:08
> Subject:        Behaviour for lengthKind 'endOfparent' is still not fully
> specified
>  ------------------------------
>
>
> Noted when I reviewed latest spec - endOfParent and unparsing is not fully
> thought through.
>
> The spec today says that I can use endOfParent with binary data. There is
> a restriction in section 12.3.8, but it only applies when an element is
> endOfParent *and* its parent is lengthKind delimited.
>
> There are a couple of cases to consider:
>
> 1) Binary data of restricted length (see list in other email "proposed
> clarification/narrowing - delimited binary data should decimal"). I don't
> think it makes sense to allow these. We don't allow these binary reps for
> delimited.
>
> 2) Text data of variable length when unparsing. Box scenario. If the data
> in the infoset is shorter than the space in the box, what do we do?  I think
> we should pad to box length with appropriate padChar, according to
> justification, as that is effectively a 'specified length'. Error if
> textPadKind is 'none'. Use parent's lengthUnits.
>
> 3) HexBinary data of variable length when unparsing. Box scenario. If the
> data in the infoset is shorter than the space in the box, what do we do?  I
> think we should right-pad to box length with fill byte, as that is
> effectively a 'specified length'.
>
> 4) Packed/BCD binary data of variable length when unparsing. Box scenario.
> If the data in the infoset is shorter than the space in the box, what do
> we do?  I think we should pad to box length with zero bytes, according to
> justification, as that is effectively a 'specified length'.  (Must be zero
> bytes and not fill byte as must be numeric in order to be parsed).
>
> In relation to 2 - 4, note that lengthKind 'endOfParent' can only be used
> with a parent lengthKind of 'explicit', 'pattern', 'prefixed' or
> 'endOfParent' or a choice with choiceLengthKind 'explicit', so the box
> scenario when unparsing therefore occurs only when lengthKind is 'explicit'
> or choiceLengthKind is 'explicit' - these are the cases when the length is
> known.  Also note that when there are nested 'endOfParent' elements (which
> is allowed) then all padding must be done on the simple element (ie, the
> innermost element), to ensure that what is output can be parsed.
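
To make the box scenario in 2) concrete, here is a small sketch. It assumes a single-byte encoding (so bytes and characters coincide), the usual xsd and dfdl prefixes, a dfdl:format in scope for all other properties, and hypothetical element names 'box', 'code' and 'text'.

```xml
<!-- Sketch only: 'box', 'code' and 'text' are hypothetical names. -->
<xsd:element name="box"
             dfdl:lengthKind="explicit" dfdl:length="10"
             dfdl:lengthUnits="bytes">
  <xsd:complexType>
    <xsd:sequence>
      <!-- Fixed 3-byte field at the start of the box. -->
      <xsd:element name="code" type="xsd:string"
                   dfdl:lengthKind="explicit" dfdl:length="3"
                   dfdl:lengthUnits="bytes" />
      <!-- Takes whatever space is left in the box. -->
      <xsd:element name="text" type="xsd:string"
                   dfdl:lengthKind="endOfParent"
                   dfdl:textPadKind="padChar"
                   dfdl:textStringPadCharacter="%SP;"
                   dfdl:textStringJustification="left" />
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>
```

With 'code' = 'ABC' and 'text' = 'hi', the 10-byte box leaves 7 bytes for 'text', so under the behaviour suggested in 2) it would be written as 'hi' followed by 5 pad spaces. With dfdl:textPadKind 'none' the same infoset would be an error.
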
>
> Regards
>
> Steve Hanson
>
> Unless stated otherwise above:
> IBM United Kingdom Limited - Registered in England and Wales with number
> 741598.
> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
> --
> dfdl-wg mailing list
> *dfdl-wg at ogf.org* <dfdl-wg at ogf.org>
> *https://www.ogf.org/mailman/listinfo/dfdl-wg*
> <https://www.ogf.org/mailman/listinfo/dfdl-wg>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: gwdrp-dfdl-v1.0.4-action276(1).docx
Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document
Size: 1256212 bytes
Desc: not available
URL: <http://www.ogf.org/pipermail/dfdl-wg/attachments/20150129/fc1c0ecf/attachment-0001.docx>