[DFDL-WG] clarification needed - unparser - do we use mandatory text alignment for zero-length strings?

Mike Beckerle mbeckerle.dfdl at gmail.com
Thu Jan 21 17:29:51 EST 2016


Suppose field 1 is 3 bits long.

Suppose field 2 is a UTF-8 string with dfdl:emptyValueDelimiterPolicy='none'.

If field2 has any contents, then those characters must begin on a byte
boundary, so before the first character we will skip 5 bits.
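
The 5 comes from rounding the current bit position up to the next byte
boundary. A minimal Python sketch of that arithmetic follows. The function
name alignment_skip_bits is hypothetical and is not taken from the DFDL spec
or from any implementation; it assumes bit positions are counted from 0.

```python
def alignment_skip_bits(bit_position: int, alignment_bits: int = 8) -> int:
    """Return how many bits must be skipped to reach the next boundary.

    bit_position   -- number of bits already consumed or written (0-based)
    alignment_bits -- required alignment in bits (8 for UTF-8 text)
    """
    # Python's % always yields a non-negative result for a positive modulus,
    # so negating the position gives the distance up to the next boundary.
    return (-bit_position) % alignment_bits


# After the 3-bit field 1, UTF-8 text needs 5 bits of skip.
print(alignment_skip_bits(3))   # 5
# Already on a byte boundary: nothing to skip.
print(alignment_skip_bits(8))   # 0
```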

What if field2 has length 0? Do we skip the 5 bits anyway or not?

Note that if dfdl:emptyValueDelimiterPolicy is "both", and an initiator or
terminator is defined, then even if the length is 0 we will definitely
unparse at least one character, so the 5 bits of alignment are needed in
that case.

But if there is no framing, and no content, do we align to the text
mandatory alignment or not?

Now, I believe the answer should be "yes": we skip the 5 bits anyway. If
field2 is lengthKind='pattern', then when parsing we have to scan for a
match to the pattern, so we have to know where to start the scan, and that
has to be at a byte boundary. So even if there is no match, and the length
at parse time is therefore zero, we still had to skip 5 bits before we
could start scanning.

So to be consistent with this, I believe unparsing must also output the 5
bits regardless of whether the string is length 0 or not.
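
The behavior proposed here can be sketched in Python as below. This is only
an illustration under stated assumptions, not how IBM DFDL or any other
implementation works: BitWriter, unparse_text and unparse_record are
hypothetical names, and filling the skipped bits with zeros is an assumption
made for the example.

```python
from typing import List

BYTE_BITS = 8  # mandatory text alignment for UTF-8, in bits


class BitWriter:
    """Toy output stream that records individual bits (hypothetical helper)."""

    def __init__(self) -> None:
        self.bits: List[int] = []

    def write_bits(self, value: int, width: int) -> None:
        # Emit `width` bits of `value`, most significant bit first.
        for shift in range(width - 1, -1, -1):
            self.bits.append((value >> shift) & 1)

    def align(self, alignment_bits: int = BYTE_BITS) -> int:
        # Pad with zero bits (an assumption) up to the next boundary and
        # return the number of fill bits written.
        fill = (-len(self.bits)) % alignment_bits
        self.bits.extend([0] * fill)
        return fill


def unparse_text(writer: BitWriter, text: str) -> int:
    # Align unconditionally, before looking at the content length. This
    # mirrors a parser that must align before it can start scanning.
    fill = writer.align()
    for byte in text.encode("utf-8"):
        writer.write_bits(byte, 8)
    return fill


def unparse_record(field1: int, field2: str) -> BitWriter:
    writer = BitWriter()
    writer.write_bits(field1, 3)   # field 1: 3 bits
    unparse_text(writer, field2)   # field 2: UTF-8 text, no framing
    return writer


# A zero-length field 2 still leaves the stream byte aligned: 3 + 5 bits.
print(len(unparse_record(0b101, "").bits))    # 8
# With one ASCII character: 3 + 5 + 8 bits.
print(len(unparse_record(0b101, "A").bits))   # 16
```

With this choice the output position after field 2 is the same as the
position a parser reaches after aligning and matching zero characters, so
whatever follows field 2 starts at the same bit in both directions.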

So that is what I think it should do. But if all the constructs needed for
this work in IBM DFDL, I'm curious what it actually does.

Thanks

Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology |
www.tresys.com
Please note: Contributions to the DFDL Workgroup's email discussions are
subject to the OGF Intellectual Property Policy
<http://www.ogf.org/About/abt_policies.php>