[DFDL-WG] Action 276: Erratum 2.100 (was Action 183 - chicken-and-egg situation with lengths given by expressions)

Thu Feb 5 11:11:33 EST 2015

RE your points 8 and 9.

dfdl:lengthKind 'explicit' really isn't the same as 'prefixed' in general.
From the spec, it appears that you have revised or deleted the older,
misleading language that suggested they work the same way, and all mentions
of padding to minLength/textOutputMinLength are now qualified by the other
lengthKinds. So point 9 is handled.

In the case of 'prefixed', the value is padded to the minimum length, and
that padded length is what is stored. In the case of 'explicit', an
expression gets evaluated. That expression can round up to a multiple of 32
bytes, or apply a scale factor so that it multiplies the stored value by 8,
or all sorts of other things. There might even be multiple fields holding
the length: one with a scale factor and the other with the number of such
scale units, so that you have to multiply them to get the length.
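
To make the contrast concrete, here is a minimal Python sketch (not DFDL, and
not taken from the spec) of the kinds of length computation described above.
All function names, and the default multiple and scale factor, are
illustrative assumptions only.

```python
def prefixed_length(value_length: int, min_length: int) -> int:
    """'prefixed' model: the stored length is the value length padded up to the minimum."""
    return max(value_length, min_length)


def round_up_to_multiple(stored: int, multiple: int = 32) -> int:
    """'explicit' model: an expression that rounds up to a multiple (e.g. 32 bytes)."""
    # Ceiling division using integer arithmetic only.
    return -(-stored // multiple) * multiple


def scaled(stored: int, factor: int = 8) -> int:
    """'explicit' model: an expression that multiplies the stored value by a scale factor."""
    return stored * factor


def unit_times_count(unit_size: int, unit_count: int) -> int:
    """'explicit' model: two fields (scale unit and number of units) multiplied together."""
    return unit_size * unit_count
```

The point of the sketch is only that an 'explicit' length can be an arbitrary
function of one or more stored fields, whereas the 'prefixed' computation is
fixed.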

The only downside is if a schema with dfdl:lengthKind explicit wants to use
the minimum length in its computation of the explicit length. If we do
nothing, then a schema author will have to repeat the constant they have in
the minLength facet or textOutputMinLength property within the dfdl:length
expression. To me this is minor. In the future I suspect it will be helpful
to have a small library of functions that allow one to access the values of
facets, the min/maxOccurs, and perhaps the values of some dfdl properties,
but for now I don't think we need to add this.
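
As a rough illustration of the "repeat the constant" workaround, here is a
small Python sketch (again not DFDL). The facet value of 10, the helper names
and the pad character are all hypothetical; the only point is that the same
minimum appears twice and has to be kept in sync by hand.

```python
# Hypothetical minimum, standing in for a minLength facet or
# a textOutputMinLength property on the element.
MIN_LENGTH_FACET = 10


def length_expression(value_length: int) -> int:
    """Stand-in for a length expression that wants to honour the minimum.

    The literal 10 is repeated on purpose: with no function to read the
    facet, the schema author has to restate the constant here.
    """
    return max(value_length, 10)


def pad_to(value: str, target: int, pad_char: str = " ") -> str:
    """Right-pad the value to the target length (no truncation in this sketch)."""
    return value.ljust(target, pad_char)


padded = pad_to("abc", length_expression(len("abc")))
```

If the facet were later changed without updating the expression, the two
would silently disagree, which is the minor cost being accepted here.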

I looked at the comments on the spec you sent. I'm ok with it in this form.
You pointed out that "Infoset element" is ambiguous in the paragraph about
how valueLength is to be used, so I'm ok going back to "original element"
vs. "length element".

Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology |
www.tresys.com
Please note: Contributions to the DFDL Workgroup's email discussions are
subject to the OGF Intellectual Property Policy
<http://www.ogf.org/About/abt_policies.php>

On Tue, Feb 3, 2015 at 9:13 AM, Steve Hanson <smh at uk.ibm.com> wrote:

> Mike
>
> Thanks for your good suggestions. I have added a further comment under
> each one. If I have missed any of your comments please let me know.
>
> Do you have an opinion on the last bullet in my list below (9)?  Do we
> need a function that returns minimum length?
>
>
>
> Regards
>
> Steve Hanson
> Architect, *IBM DFDL*
> <http://www.ibm.com/developerworks/library/se-dfdl/index.html>
> Co-Chair, *OGF DFDL Working Group* <http://www.ogf.org/dfdl/>
> IBM SWG, Hursley, UK
> *smh at uk.ibm.com* <smh at uk.ibm.com>
> tel:+44-1962-815848
>
>
>
> From:        Mike Beckerle <mbeckerle.dfdl at gmail.com>
> To:        Steve Hanson/UK/IBM at IBMGB
> Cc:        DFDL-WG <dfdl-wg at ogf.org>
> Date:        29/01/2015 21:32
> Subject:        Re: [DFDL-WG] Action 276: Erratum 2.100 (was Action 183 -
> chicken-and-egg situation with lengths given by expressions)
> ------------------------------
>
>
>
> In general I support the notion here that we evaluate the expressions when
> unparsing.
>
> I added a couple small wording changes and suggestions. Search for my
> comment bubbles.
>
>
>
> On Mon, Jan 19, 2015 at 1:06 PM, Steve Hanson <*smh at uk.ibm.com*
> <smh at uk.ibm.com>> wrote:
> Please have a position on the below proposal from IBM for this week's WG
> call.
>
> Regards
>
> Steve Hanson
>
>
>
> From:        Steve Hanson/UK/IBM
> To:        DFDL-WG <*dfdl-wg at ogf.org* <dfdl-wg at ogf.org>>
> Date:        12/01/2015 18:14
> Subject:        Fw: [DFDL-WG] Erratum 2.100 (was Action 183 -
> chicken-and-egg situation with lengths given by expressions)
>  ------------------------------
>
>
> After internal discussion, IBM proposes the following:
>
>    1. No new dfdl:lengthKind enums
>    2. dfdl:lengthKind 'explicit' is always considered to be 'fixed length'
>    3. When dfdl:length is an expression, it is always evaluated when
>    unparsing
>    4. If the intent is for the infoset data of the element to provide the
>    length, the integer element that is the target of the expression should use
>    dfdl:outputValueCalc with function dfdl:valueLength() referring forward to
>    the infoset element. (Use of dfdl:contentLength() is a schema definition
>    error as it can cause deadlock).
>    5. It should be noted that dfdl:valueLength() returns an unpadded
>    length. This does *not* take into account the XSD minLength or
>    dfdl:textOutputMinLength properties.
>    6. The net is that, for simple text elements with dfdl:textPadKind
>    'padChar', lengthKinds 'delimited', 'pattern' and 'prefixed' are
>    considered variable length when unparsing and are padded to a minimum
>    length, while 'implicit', 'explicit' and 'endOfParent' have a target
>    length and are padded to that length.
>    7. Similarly for xs:hexBinary elements: 'delimited', 'pattern' and
>    'prefixed' are always padded to a minimum length, and 'implicit',
>    'explicit' and 'endOfParent' are always padded to target length.
>    8. So for a text element with dfdl:textPadKind 'padChar', or an
>    xs:hexBinary element, whose unpadded length is less than the minimum,
>    'explicit' (expression) is not identical in behaviour to 'prefixed'.
>    9. If the last point is thought to be a problem, options are to allow
>    dfdl:valueLength() to return minimum length or actual length depending on a
>    new argument, or create a new function to return minimum length. This can
>    be done under existing deferred action 242.
>
> To verify that these changes fit into the spec ok, I have updated it -
> change tracking on and comment for each change.
>
>
>
> Regards
>
> Steve Hanson
> ----- Forwarded by Steve Hanson/UK/IBM on 12/01/2015 10:20 -----
>
> From:        Steve Hanson/UK/IBM
> To:        Tim Kimber/UK/IBM at IBMGB
> Cc:        "Mike Beckerle" <*mbeckerle.dfdl at gmail.com*
> <mbeckerle.dfdl at gmail.com>>, DFDL-WG <*dfdl-wg at ogf.org* <dfdl-wg at ogf.org>>
> Date:        06/01/2015 15:54
> Subject:        Re: [DFDL-WG] Erratum 2.100 (was Action 183 -
> chicken-and-egg situation with lengths given by expressions)
>  ------------------------------
>
>
> A thought.
>
> We could add dfdl:lengthKind 'expression', use dfdl:length for the
> expression, and add a new enum property that is the policy for the
> expression - that is, evaluate on parse only, or evaluate on both parse and
> unparse.  The enum for the latter is the equivalent of the pre-erratum
> 2.100 'explicit' and expression behaviour as implemented by IBM DFDL today
> - which we would then deprecate in the spec (or even remove) and deprecate
> in IBM DFDL.
>
> Regards
>
> Steve Hanson
>
>
>
>
> From:        Steve Hanson/UK/IBM
> To:        Tim Kimber/UK/IBM at IBMGB, "Mike Beckerle" <
> *mbeckerle.dfdl at gmail.com* <mbeckerle.dfdl at gmail.com>>
> Cc:        DFDL-WG <*dfdl-wg at ogf.org* <dfdl-wg at ogf.org>>
> Date:        10/12/2014 09:08
> Subject:        Re: [DFDL-WG] Erratum 2.100 (was Action 183 -
> chicken-and-egg situation with lengths given by expressions)
>  ------------------------------
>
>
> Tim & Mike - replies in-line.
>
> Consensus is heading towards 'explicit' working as per IBM DFDL today, so
> we need to decide on the best way to express the 2.100 behaviour.
>
> Regards
>
> Steve Hanson
>
>
>
>
> From:        Tim Kimber/UK/IBM at IBMGB
> To:        *dfdl-wg at ogf.org* <dfdl-wg at ogf.org>
> Date:        09/12/2014 23:25
> Subject:        Re: [DFDL-WG] Erratum 2.100 (was Action 183 -
> chicken-and-egg situation with lengths given by expressions)
> Sent by:        *dfdl-wg-bounces at ogf.org* <dfdl-wg-bounces at ogf.org>
>  ------------------------------
>
>
>
> My 2 cents:
>
> lengthKind 'explicit' should continue to work as it currently does in the
> IBM implementation. That is, it should use the static or calculated value
> of the length, and pad to that length when unparsing. Otherwise we have
> different rules for lengthKind='explicit' depending on whether it is an
> expression or a static value.
>
> If somebody wants to avoid padding, then they can put an outputValueCalc
> on the length field and calculate a value that requires no padding. I think
> the required expression would be
> {dfdl:valueLength(../variableLengthField)}
>
> With these rules there is a potential for a mutual dependency ( deadlock )
> when outputValueCalc is used with a calculated length. If the
> outputValueCalc is
> {dfdl:contentLength(../variableLengthField)}
> then the entire representation length ( including the padding characters )
> is being requested. This obviously cannot be satisfied until the padding
> has been applied, but the padding cannot be applied until the length is
> known...etc. I don't think this should affect the decision about lengthKind
> though, for two reasons:
> - it's already possible to create deadlocks using other calculated fields
> in DFDL. Any robust implementation of a serializer must include deadlock
> detection in the evaluation of DFDL expressions.
> - it would not be difficult to detect the simple case statically and
> report a schema definition error.
>
> SMH: Corrected function names to match spec.
>
> The only downside of this is that users who want to calculate the output
> length for themselves have to jump through the non-obvious hoop of adding
> outputValueCalc to the length field and selecting the correct
> (non-deadlocking) DFDL function.
>
> But for fields that have representation='text' and lengthUnits='characters'
> there is a simpler alternative: set the length expression to
> {fn:length(.)}. Other cases will be derivable from the infoset value by a
> simple calculation, so the outputValueCalc solution should only be required
> for the hard cases.
>
> SMH: Your fn:length(.) does not work unfortunately. The expression does
> not give the desired length when parsing.
>
> regards,
>
> Tim Kimber
>
>
>
>
> From:        Mike Beckerle <*mbeckerle.dfdl at gmail.com*
> <mbeckerle.dfdl at gmail.com>>
> To:        Steve Hanson/UK/IBM at IBMGB
> Cc:        DFDL-WG <*dfdl-wg at ogf.org* <dfdl-wg at ogf.org>>
> Date:        09/12/2014 18:06
> Subject:        Re: [DFDL-WG] Erratum 2.100 (was Action 183 -
> chicken-and-egg situation with lengths given by expressions)
> Sent by:        *dfdl-wg-bounces at ogf.org* <dfdl-wg-bounces at ogf.org>
>  ------------------------------
>
>
>
> Idea 1:
>
> Just provide an infoset member [contentLength] which would allow one to
> get and (when unparsing) set the content length. This would provide the
> "sticky" content length memory that people are seeking when they say they
> want data to round-trip. If the same infoset object is parsed and then
> unparsed, the existence of this [contentLength] member provides the memory
> of the content length from the parse. An outputValueCalc could meaningfully
> ask for dfdl:contentLength() which would return the value of this member.
> If unset, then when unparsing the value of this member can be defined to be
> the dfdl:valueLength() of the infoset value.
>
> SMH: Not keen on this. It makes the DFDL infoset deviate too much from the
> XML infoset.
>
> Idea 2:
> Per our discussion on the call, I think we are trying to get two behaviors
> into the same length kind, and we can't have it both ways.
>
> So I am fine if we say lengthKind 'explicit' evaluates the expression both
> parsing and unparsing.
>
> I would suggest new lengthKind 'explicitParseOnly' evaluates the
> expression only when parsing. When this is used, an outputValueCalc would
> be needed to compute the length value and unparse the representation of it.
> That calculation would likely need to refer to the dfdl:valueLength() of
> the element whose length needs to be stored.
>
> SMH: Not keen on the name. Let's keep thinking!
>
> Idea 3:
> An alternative would be to introduce property dfdl:unparseContentLength
> which is an integer constant, or an expression, or one of three special
> enum values. When dfdl:unparseContentLength='useLength' then it would be
> the same value as the length (same constant, or computed by evaluating the
> same expression). Or dfdl:unparseContentLength='useValueLength' would
> mean that the length is equivalent to writing dfdl:unparseContentLength='{
> dfdl:valueLength(., LU) }' (where LU is the length units of the element
> that is the first argument). Nothing would be automatically saving out the
> explicitly computed length, so an outputValueCalc would need to be able to
> get at the dfdl:valueLength or dfdl:contentLength, which would return the
> value of the dfdl:unparseContentLength.
>
> A thought: perhaps we really only need the enum values, and not the
> ability to specify constants or expressions? In that case I'd suggest the
> property have the "policy" suffix on the name, i.e.,
> dfdl:unparseContentLengthPolicy.
>
> Other possible name choices for this property:
> dfdl:explicitLengthKindUnparsePolicy or
> dfdl:lengthKindExplicitUnparsePolicy.
>
> I guess we get our choice of whether we add new length kinds along with
> 'explicit' and we define lengthKind 'explicit' to be one of these
> behaviors, or we add this enum policy property to add additional meaning
> when the lengthKind is 'explicit'.
>
> The difference of the two is taste and style I think.
>
> SMH: I prefer a new enum for lengthKind. That's the point of the property.
> It's much easier to say in the spec '...when dfdl:lengthKind is 'prefixed'
> or 'expression'...' than '...when dfdl:lengthKind is 'prefixed', or
> 'explicit' and dfdl:length is an expression and dfdl:unparseContentLength
> is 'xxx' ...'.
>
>
>
>
> On Mon, Dec 8, 2014 at 11:53 AM, Steve Hanson <*smh at uk.ibm.com*
> <smh at uk.ibm.com>> wrote:
> This is the email thread that discussed the behaviour of lengthKind
> 'explicit' where length is an expression, when unparsing.
>
> Erratum 2.100 had been raised and stated that this is an example of a
> variable length, and behaves like lengthKind 'prefixed'. The discussion
> below was around whether this captured all the use cases and should the
> dfdl:length expression be evaluated during unparsing in order to obtain a
> specified length to which the value could be padded (or truncated) in the
> same way as a fixed length.  The conclusion was that the expression should
> not be evaluated.  This is consistent with occursCount - the expression is
> not evaluated on unparsing.
>
> (There was a public comment on this area but it was just observing that
> the spec had not been fully updated to reflect 2.100)
>
> The use cases are numbered in the discussion below 1) to 4), with examples.
> Updated to reflect function name changes.
>
> The original proposal is also below, which was not adopted.
>
> However, an issue is that IBM DFDL already implements use case 1) by
> evaluating the expression on unparsing. I discovered this when I came to
> change the code to ensure 2.100 behaviour was followed. It is therefore
> likely that there are users who are relying on this behaviour in IBM DFDL.
>
> Coincidentally the same day I was asked by a user about the following use
> case.  User has BCD data with length prefix. He wants to parse the data, do
> some processing then unparse the data, but it is not acceptable for the
> length of the data to change. Specifically, the BCD may start with 0s. When
> this is parsed as a decimal or integer, it has the effect of losing leading
> 0s so they need to be added back during unparsing.  This is not possible
> with lengthKind 'prefixed' and nor is it possible with lengthKind
> 'expression' with erratum 2.100. User is currently treating the element as
> a 'prefixed' hexBinary blob to preserve the leading 0s. While this works
> for his use case, I am worried that the blob solution won't always be
> appropriate.
>
> Regards
>
> Steve Hanson
> ----- Forwarded by Steve Hanson/UK/IBM on 08/12/2014 15:53 -----
>
> From:        Steve Hanson/UK/IBM
> To:        *dfdl-wg at ogf.org* <dfdl-wg at ogf.org>
> Date:        08/01/2013 18:06
> Subject:        Fw: [DFDL-WG] Action 183 - chicken-and-egg situation with
> lengths given        by expressions
>  ------------------------------
>
>
> Did not adopt the proposal below, and erratum 2.100 remains as documented.
>
> Regards
>
> Steve Hanson
> Architect, Data Format Description Language (DFDL)
> Co-Chair, *OGF DFDL Working Group* <http://www.ogf.org/dfdl/>
> IBM SWG, Hursley, UK
> *smh at uk.ibm.com* <smh at uk.ibm.com>
> tel:*+44-1962-815848* <%2B44-1962-815848>
> ----- Forwarded by Steve Hanson/UK/IBM on 08/01/2013 18:05 -----
>
> From:        Steve Hanson/UK/IBM
> To:        *dfdl-wg at ogf.org* <dfdl-wg at ogf.org>,
> Date:        08/01/2013 14:44
> Subject:        Re: Fw: [DFDL-WG] Action 183 - chicken-and-egg situation
> with lengths given        by expressions
>  ------------------------------
>
>
> If we had the following we would have three behaviours which I think would
> cover the identified use cases.
>
> a) dfdl:lengthKind = "explicit", dfdl:length is an integer => length is
> specified on parsing & unparsing
>
> b) dfdl:lengthKind = "explicit", dfdl:length is an expression => length is
> specified on parsing & unparsing (expression must evaluate)
> - use cases 1) & 2) below
> - Mike's scenario where length is specified externally
>
> c) dfdl:lengthKind = "expression" (*new*), dfdl:length must be an
> expression => length is specified on parsing, but variable on unparsing
> (expression not evaluated)
> - new enum
> - use case 3) below
> - gives the erratum 2.100 behaviour
> - matches occursCountKind 'expression' behaviour
>
> Alternatively we leave erratum 2.100 as it is and for Mike's scenario the
> application must do the padding before it passes the infoset values to the
> unparser.
>
> Regards
>
> Steve Hanson
>
>
>
> From:        Steve Hanson/UK/IBM
> To:        *dfdl-wg at ogf.org* <dfdl-wg at ogf.org>,
> Date:        11/12/2012 10:36
> Subject:        Fw: [DFDL-WG] Action 183 - chicken-and-egg situation with
> lengths given        by expressions
>  ------------------------------
>
>
> The email thread which provided the material for the discussion which led
> to:
>
> 2.100. Section 12.3.1. State that when unparsing an element with
> lengthKind ‘explicit’ and where length is an expression, then the data in
> the Infoset is treated as variable length and not fixed length. The
> behaviour is the same as lengthKind ‘prefixed’.
>
> Regards
>
> Steve Hanson
> ----- Forwarded by Steve Hanson/UK/IBM on 16/11/2012 16:04 -----
>
> From:        Tim Kimber/UK/IBM at IBMGB
> To:        *dfdl-wg at ogf.org* <dfdl-wg at ogf.org>,
> Date:        11/09/2012 12:39
> Subject:        Re: [DFDL-WG] Action 183 - chicken-and-egg situation with
> lengths given        by expressions
> Sent by:        *dfdl-wg-bounces at ogf.org* <dfdl-wg-bounces at ogf.org>
>  ------------------------------
>
>
>
> Good point. The problem is that lengthKind 'explicit' is being used for
> two things:
> a) a length that is static
> b) a length that is calculated
> ...so the DFDL serializer must assume that the expression needs to be
> evaluated.
>
> For occursCountKind we have separate values for 'fixed' and 'expression'.
> If we did not, then occursCountKind would have the same problem except that
> it would affect defaulting rather than padding.
>
> regards,
>
> Tim Kimber, DFDL Team,
> Hursley, UK
> Internet:  *kimbert at uk.ibm.com* <kimbert at uk.ibm.com>
> Tel. 01962-816742
> Internal tel. 37246742
>
>
>
>
> From:        Steve Hanson/UK/IBM at IBMGB
> To:        *dfdl-wg at ogf.org* <dfdl-wg at ogf.org>,
> Date:        11/09/2012 12:27
> Subject:        [DFDL-WG] Action 183 - chicken-and-egg situation with
> lengths given        by expressions
> Sent by:        *dfdl-wg-bounces at ogf.org* <dfdl-wg-bounces at ogf.org>
>  ------------------------------
>
>
>
> This mail is on the expected behaviour of the DFDL unparser when writing
> out a 'data' element the length of which is held in an earlier 'len'
> element.
>
> There are several scenarios, some straightforward and some that exhibit a
> chicken-and-egg behaviour.  The principle of what happens is understood,
> the action is to make sure that the behaviour is explained in enough detail
> in the spec to enable implementations to be consistent. (Note - IBM DFDL
> does not yet support outputValueCalc so has not hit this yet).
>
> Scenarios follow. The 'data' element shown is simple, but the same
> principles apply if it is complex.
>
> 1) 'len' is set from infoset
>
> - 'len' can be set in augmented infoset
> - No issue as 'data's length expression may be evaluated
>
> <xsd:element name="message1">
>  <xsd:complexType>
>    <xsd:sequence>
>      <xsd:element name="len" type="xsd:int"
>                   dfdl:lengthKind="explicit" dfdl:length="2" />
>      <xsd:element name="data" type="xsd:string"
>                   dfdl:length="{/message1/len}" dfdl:lengthKind="explicit" />
>    </xsd:sequence>
>  </xsd:complexType>
> </xsd:element>
>
> 2) 'len' is set using outputValueCalc with fixed expression
>
> - When 'len's outputValueCalc is encountered, it can be evaluated then and
> there
> - 'len' can be set in augmented infoset
> - No issue as 'data's length expression may be evaluated
>
> <xsd:element name="message1">
>  <xsd:complexType>
>    <xsd:sequence>
>      <xsd:element name="len" type="xsd:int"
>                   dfdl:outputValueCalc="{10}" dfdl:lengthKind="explicit"
>                   dfdl:length="2" />
>      <xsd:element name="data" type="xsd:string"
>                   dfdl:length="{/message1/len}" dfdl:lengthKind="explicit" />
>    </xsd:sequence>
>  </xsd:complexType>
> </xsd:element>
>
> 3) 'len' is set using outputValueCalc with reference to 'data' (unpadded)
>
> - When 'len's outputValueCalc is encountered, it can not yet be evaluated
> as it depends on the length of 'data'
> - 'len' can not yet be set in augmented infoset
> - Problem as 'data's length expression can not be evaluated
> - But we do know the unpadded length of 'data' so 'len's outputValueCalc
> can now be evaluated
> - In turn this means that  'data's length expression can now be evaluated
>
> <xsd:element name="message1">
>  <xsd:complexType>
>    <xsd:sequence>
>      <xsd:element name="len" type="xsd:int"
>                   dfdl:outputValueCalc="{dfdl:valueLength(/message1/data)}"
>                   dfdl:lengthKind="explicit" dfdl:length="2" />
>      <xsd:element name="data" type="xsd:string"
>                   dfdl:length="{/message1/len}" dfdl:lengthKind="explicit" />
>    </xsd:sequence>
>  </xsd:complexType>
> </xsd:element>
>
> 4) 'len' is set using outputValueCalc with reference to 'data' (padded)
>
> - When 'len's outputValueCalc is encountered, it can not yet be evaluated
> as it depends on the length of 'data'
> - 'len' can not yet be set in augmented infoset
> - Problem as 'data's length expression can not be evaluated
> - We don't know the padded length of 'data' because we don't know 'len'
> - *Problem*: 'data's length expression can never be evaluated
>
> <xsd:element name="message1">
>  <xsd:complexType>
>    <xsd:sequence>
>      <xsd:element name="len" type="xsd:int"
>                   dfdl:outputValueCalc="{dfdl:contentLength(/message1/data)}"
>                   dfdl:lengthKind="explicit" dfdl:length="2" />
>      <xsd:element name="data" type="xsd:string"
>                   dfdl:length="{/message1/len}" dfdl:lengthKind="explicit" />
>    </xsd:sequence>
>  </xsd:complexType>
> </xsd:element>
>
>
>
> Regards
>
> Steve Hanson
>
>
>
> From:        Steve Hanson/UK/IBM
> To:        *dfdl-wg at ogf.org* <dfdl-wg at ogf.org>
> Date:        04/09/2012 17:31
> Subject:        Fw: Behaviour for lengthKind 'endOfparent' is still not
> fully specified
>  ------------------------------
>
>
> DFDL WG call 4th Sept 2012:
>
> 1) Agreed that for binary data, only xs:hexBinary and packed/BCD are allowed
> to have endOfParent
>
> 2) Agreed this is the correct behaviour when filling to a known length
>
> 3) Agreed this is the correct behaviour when filling to a known length
>
> 4) Agreed this is the correct behaviour when filling to a known length
>
> It was noted that lengthKind 'explicit' on the parent may not result in a
> known length if the length is an expression. This is an example of a more
> general chicken-and-egg situation with lengths given by expressions, for
> which outputValueCalc and DFDL functions such as unpaddedLength() were
> added and can be used. Action raised to ensure that the behaviour of an
> implementation is fully defined by the spec.
>
> Regards
>
> Steve Hanson
> ----- Forwarded by Steve Hanson/UK/IBM on 04/09/2012 17:24 -----
>
> From:        Steve Hanson/UK/IBM
> To:        *dfdl-wg at ogf.org* <dfdl-wg at ogf.org> <*dfdl-wg at ogf.org*
> <dfdl-wg at ogf.org>>
> Date:        04/09/2012 14:08
> Subject:        Behaviour for lengthKind 'endOfparent' is still not fully
> specified
>  ------------------------------
>
>
> Noted when I reviewed latest spec - endOfParent and unparsing is not fully
> thought through.
>
> The spec today says that I can use endOfParent with binary data. There is
> a restriction in section 12.3.8, but it only applies when an element is
> endOfParent *and* its parent is lengthKind delimited.
>
> There are a couple of cases to consider:
>
> 1) Binary data of restricted length (see list in other email "proposed
> clarification/narrowing - delimited binary data should decimal"). I don't
> think it makes sense to allow these. We don't allow these binary reps for
> delimited.
>
> 2) Text data of variable length when unparsing. Box scenario. If the data
> in the infoset is shorter than the space in the box, what do we do?  I think
> we should pad to box length with appropriate padChar, according to
> justification, as that is effectively a 'specified length'. Error if
> textPadKind is 'none'. Use parent's lengthUnits.
>
> 3) HexBinary data of variable length when unparsing. Box scenario. If the
> data in the infoset is shorter than the space in the box, what do we do?  I
> think we should right-pad to box length with fill byte, as that is
> effectively a 'specified length'.
>
> 4) Packed/BCD binary data of variable length when unparsing. Box scenario.
> If the data in the infoset is shorter than the space in the box, what do we
> do?  I think we should pad to box length with zero bytes, according to
> justification, as that is effectively a 'specified length'.  (Must be zero
> bytes and not fill byte as must be numeric in order to be parsed).
>
> In relation to 2 - 4, note that lengthKind 'endOfParent' can only be used
> with a parent lengthKind of 'explicit', 'pattern', 'prefixed' or
> 'endOfParent' or a choice with choiceLengthKind 'explicit', so the box
> scenario when unparsing therefore occurs only when lengthKind is 'explicit'
> or choiceLengthKind is 'explicit' - these are the cases when the length is
> known.  Also note that when there are nested 'endOfParent' elements (which
> is allowed) then all padding must be done on the simple element (ie, the
> innermost element), to ensure that what is output can be parsed.
>
> Regards
>
> Steve Hanson
>
> Unless stated otherwise above:
> IBM United Kingdom Limited - Registered in England and Wales with number
> 741598.
> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
>
>
> --
> dfdl-wg mailing list
> *dfdl-wg at ogf.org* <dfdl-wg at ogf.org>
> *https://www.ogf.org/mailman/listinfo/dfdl-wg*
> <https://www.ogf.org/mailman/listinfo/dfdl-wg>
>
> [attachment "gwdrp-dfdl-v1.0.4-action276(1).docx" deleted by Steve
> Hanson/UK/IBM]
>
>