[DFDL-WG] need clarification on unparsing with textPadKind='padChar'

Steve Hanson smh at uk.ibm.com
Tue Aug 23 08:27:46 EDT 2016


Erratum 5.18 is not yet incorporated into the spec. Once it is, 
dfdl:lengthKind 'explicit' with an expression will behave the same as 
dfdl:lengthKind 'explicit' with a literal.  I believe that means the 
expression result is the fixed length for unparsing, and minLength is used 
only for validation.  However, I would like to run a test with IBM DFDL to 
make sure everything behaves as expected.

fn:string-length() always returns the length of the value in the infoset 
item. It has no schema awareness.
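
As a rough illustration of the reading described above (not IBM DFDL's 
actual implementation), the following Python sketch treats the length 
expression result as the fixed target length and uses minLength only to 
produce a validation flag. The function name unparse_fixed_length, the 
space pad character, right-hand padding and the error on an over-long 
value are all assumptions made for the example; none of them comes from 
the spec or the erratum.

```python
def unparse_fixed_length(value, length_expr, min_length,
                         truncate=False, pad_char=" "):
    """Hypothetical model: the length expression alone fixes the output length."""
    # minLength is checked for validation only; it does not change the length.
    is_valid = len(value) >= min_length
    if len(value) > length_expr:
        if not truncate:
            # Assumed behaviour when truncation is not enabled.
            raise ValueError("value is longer than the fixed length")
        value = value[:length_expr]
    # Assumed justification: value on the left, pad characters on the right.
    return value.ljust(length_expr, pad_char), is_valid


# Values from Example 1 in the quoted message below.
text, is_valid = unparse_fixed_length("1234567890", length_expr=20, min_length=100)
print(len(text), is_valid)  # prints: 20 False
```

Under this reading the Example 1 value is padded to 20 characters, and 
the minLength of 100 shows up only as a validation failure.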

Regards
 
Steve Hanson
IBM Integration Bus, Hursley, UK
Architect, IBM DFDL
Co-Chair, OGF DFDL Working Group
smh at uk.ibm.com
tel:+44-1962-815848
mob:+44-7717-378890



From:   Mike Beckerle <mbeckerle.dfdl at gmail.com>
To:     "dfdl-wg at ogf.org" <dfdl-wg at ogf.org>
Date:   10/08/2016 17:27
Subject:        [DFDL-WG] need clarification on unparsing with 
textPadKind='padChar'
Sent by:        "dfdl-wg" <dfdl-wg-bounces at ogf.org>




The spec for textPadKind has this sentence about unparsing:

"When dfdl:lengthKind is 'explicit' (and dfdl:length is an expression), 
'delimited', 'prefixed', 'pattern' the data value is padded to the length 
given by the XSD minLength facet for type 'xs:string' or 
dfdl:textOutputMinLength  property for other types."

My concern is the case where dfdl:lengthKind is 'explicit' and dfdl:length 
is an expression. 

Example 1:

Suppose my infoset contains an element at path '../x' of type xs:string 
containing the 10 characters "1234567890".

Suppose the length expression evaluates to 20, and lengthUnits is 
'characters'

Suppose the XSD minLength facet is 100.

Suppose the textPadKind is padChar.

The above sentence says the target length would be 100 and we would pad to 
length 100, not 20, and there is no error. 

I want to be sure this is correct and what we intended.

To me this seems inconsistent with what we have stated in Erratum 5.18 
which says that whether or not the length is a constant or an expression, 
this element is treated as fixed length, and when unparsing, the length 
expression is evaluated, not ignored. 

Erratum 5.18 is not technically incorrect, because it doesn't deal with 
this minLength complexity.  It's just not terribly clear. We may want to 
update both this sentence of the spec and the erratum to use the term 
'target length', which is in our glossary. 

Furthermore, in this case the dfdl:contentLength for this element should 
return 100 for units 'characters'.

In this case the dfdl:valueLength for this element should return 10 for 
units 'characters'.

Expression fn:string-length(../x) should return 10. 

The length expression value of 20 is effectively being ignored because it 
is less than the minLength. The length expression *would* be used if it 
was greater than the minLength.
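
The arithmetic of Example 1 under this interpretation can be sketched in 
Python as follows. This is only a model of the reading being questioned 
here, not a statement of what the spec requires: the function name 
pad_to_target, the dictionary keys, the space pad character and 
right-hand padding are all assumptions made for the illustration.

```python
def pad_to_target(value, length_expr, min_length, pad_char=" "):
    """Hypothetical model: target length is the larger of the two lengths."""
    # The length expression is only used when it exceeds minLength.
    target = max(length_expr, min_length)
    # Assumed justification: value on the left, pad characters on the right.
    padded = value.ljust(target, pad_char)
    return {
        "target_length": target,
        "content_length": len(padded),  # stands in for dfdl:contentLength
        "value_length": len(value),     # stands in for dfdl:valueLength
        "string_length": len(value),    # stands in for fn:string-length(../x)
    }


result = pad_to_target("1234567890", length_expr=20, min_length=100)
print(result["content_length"], result["value_length"], result["string_length"])
# prints: 100 10 10
```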

Example 2:

My infoset contains an element at path "../x" of type xs:string of length 
200.

The length expression evaluates to 20.

dfdl:truncateSpecifiedLengthString is true.

XSD minLength is 40.

In this case I believe we would have a target length of 40, and we would 
truncate the string to length 40. 

dfdl:contentLength of this element is 40

dfdl:valueLength of this element is 40

Expression fn:string-length(../x) should return 200. 

Is this the correct interpretation?

Example 3:

Same as example 2, except XSD minLength is 0.

Now the target length would be 20 and we would truncate the string to 
length 20.

dfdl:contentLength of this element is 20 characters

dfdl:valueLength of this element is 20 characters.

Expression fn:string-length(../x) should return 200. 

Is this the correct interpretation?
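
The truncation cases in Examples 2 and 3 can be modelled the same way. 
The Python sketch below is a hedged illustration of the interpretation 
being asked about, with truncation assumed to drop characters from the 
right; the function name truncate_to_target and the dictionary keys are 
hypothetical and do not come from the spec.

```python
def truncate_to_target(value, length_expr, min_length):
    """Hypothetical model of Examples 2 and 3 (truncation enabled)."""
    # Target length is the larger of the length expression and minLength.
    target = max(length_expr, min_length)
    # Assumed: truncation keeps the leftmost characters.
    written = value[:target]
    return {
        "target_length": target,
        "content_length": len(written),  # stands in for dfdl:contentLength
        "value_length": len(written),    # stands in for dfdl:valueLength
        "string_length": len(value),     # infoset value is left unchanged
    }


infoset_value = "x" * 200
print(truncate_to_target(infoset_value, length_expr=20, min_length=40))
print(truncate_to_target(infoset_value, length_expr=20, min_length=0))
```

With these inputs the first call models Example 2 (target length 40) and 
the second models Example 3 (target length 20); in both the infoset 
string length stays at 200.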



Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology | 
www.tresys.com
Please note: Contributions to the DFDL Workgroup's email discussions are 
subject to the OGF Intellectual Property Policy

