[DFDL-WG] need clarification on unparsing with textPadKind='padChar'

Mike Beckerle mbeckerle.dfdl at gmail.com
Wed Aug 10 12:26:59 EDT 2016


The spec for textPadKind has this sentence about unparsing:

"When dfdl:lengthKind is 'explicit' (and dfdl:length is an expression),
'delimited', 'prefixed', 'pattern' the data value is padded to the length
given by the XSD minLength facet for type 'xs:string' or
dfdl:textOutputMinLength property for other types."


My concern is the case where dfdl:lengthKind is 'explicit' and dfdl:length
is an expression.


Example 1:


Suppose my infoset contains an element at path '../x' of type xs:string
containing the 10 characters "1234567890".


Suppose the length expression evaluates to 20, and lengthUnits is
'characters'


Suppose the XSD minLength facet is 100.


Suppose the textPadKind is padChar.


The above sentence says the target length would be 100 and we would pad to
length 100, not 20, and there is no error.


I want to be sure this is correct and what we intended.


To me this seems inconsistent with what we have stated in Erratum 5.18,
which says that whether the length is a constant or an expression, this
element is treated as fixed length, and when unparsing, the length
expression is evaluated, not ignored.


Erratum 5.18 is not technically incorrect, because it doesn't deal with
this minLength complexity. It's just not terribly clear. We may want to
update both this sentence of the spec and the erratum to use the term
"target length", which is in our glossary.


Furthermore, in this case dfdl:contentLength for this element should return
100 for units 'characters'.


In this case the dfdl:valueLength for this element should return 10 for
units 'characters'


Expression fn:string-length(../x) should return 10.


The length expression value of 20 is effectively being ignored because it
is less than the minLength. The length expression *would* be used if it were
greater than the minLength.
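
To make the numbers in Example 1 concrete, here is a minimal Python sketch of
the reading described above, in which the target length is the larger of the
length expression value and minLength. The function names and the choice to pad
on the right are illustrative assumptions; they are not taken from the spec or
from any DFDL implementation.

```python
def target_length(length_expr_value: int, min_length: int) -> int:
    """Target length under the reading in this message (an assumption):
    the length expression only matters when it exceeds minLength."""
    return max(length_expr_value, min_length)


def pad_to_target(value: str, target: int, pad_char: str = " ") -> str:
    """Pad on the right, i.e. assuming left justification (hypothetical choice)."""
    return value.ljust(target, pad_char)


value = "1234567890"               # infoset value at ../x, 10 characters
target = target_length(20, 100)    # length expression 20, minLength 100
content = pad_to_target(value, target)

value_length = len(value)          # what dfdl:valueLength would report here
content_length = len(content)      # what dfdl:contentLength would report here
```

Under these assumptions the target length is 100, the value length stays 10,
and the content length is 100, which matches the figures in Example 1.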


Example 2:


My infoset contains an element at path "../x" of type xs:string of length
200 characters.


The length expression evaluates to 20


dfdl:truncateSpecifiedLengthString is true


XSD minLength is 40.


In this case I believe we would have a target length of 40, and we would
truncate the string to length 40.


dfdl:contentLength of this element is 40


dfdl:valueLength of this element is 40


Expression fn:string-length(../x) should return 200.


Is this the correct interpretation?


Example 3:


Same as example 2, except XSD minLength is 0.


Now the target length would be 20 and we would truncate the string to length 20.


dfdl:contentLength of this element is 20 characters


dfdl:valueLength of this element is 20 characters.


Expression fn:string-length(../x) should return 200.


Is this the correct interpretation?
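
For reference, Examples 2 and 3 can be sketched in Python as follows. This
assumes the interpretation being asked about (target length is the larger of
the length expression and minLength), assumes truncation drops trailing
characters, and uses a hypothetical function name; whether a DFDL processor
should actually behave this way is exactly the open question.

```python
def unparse_string(value, length_expr_value, min_length, truncate, pad_char=" "):
    """Return (content_length, value_length, infoset_string_length) in characters.

    Assumes target length = max(length expression, minLength), which is the
    interpretation under discussion, not settled spec behaviour.
    """
    target = max(length_expr_value, min_length)
    if len(value) > target:
        if not truncate:
            # Stand-in for a processing error when truncation is not enabled.
            raise ValueError("value longer than target length and truncation is off")
        represented = value[:target]   # assumes trailing characters are dropped
    else:
        represented = value
    content = represented.ljust(target, pad_char)  # assumes padding on the right
    # The infoset string itself is untouched, so its length is unchanged.
    return len(content), len(represented), len(value)


long_value = "x" * 200
example_2 = unparse_string(long_value, 20, 40, truncate=True)  # minLength 40
example_3 = unparse_string(long_value, 20, 0, truncate=True)   # minLength 0
```

With these inputs, example_2 is (40, 40, 200) and example_3 is (20, 20, 200),
matching the content length, value length, and fn:string-length figures given
in the two examples.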




Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology |
www.tresys.com
Please note: Contributions to the DFDL Workgroup's email discussions are
subject to the OGF Intellectual Property Policy
<http://www.ogf.org/About/abt_policies.php>