[DFDL-WG] clarification needed - unparser - do we use mandatory text alignment for zero-length strings?

Mike Beckerle mbeckerle.dfdl at gmail.com
Fri Jan 22 09:42:40 EST 2016


Thanks. I think the simplest way to say this is that the element is aligned
because its representation is text and the charset has a mandatory
alignment, even though the element content is of length 0.

The alignment of the element and the content length are orthogonal.
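
To make that concrete, here is a minimal sketch in Python. It is not taken
from IBM DFDL or any other implementation; the function names are
hypothetical, and it assumes the utf-8 mandatory alignment is 8 bits, as in
the example quoted below. None stands for "no occurrence in the infoset",
and "" for an occurrence with zero-length content.

```python
from typing import Optional


def alignment_fill_bits(bit_position: int, alignment_bits: int = 8) -> int:
    """Bits to skip so that bit_position reaches the next alignment boundary."""
    return (-bit_position) % alignment_bits


def unparse_text_element(bit_position: int, value: Optional[str],
                         alignment_bits: int = 8) -> int:
    """Return the bit position after unparsing one utf-8 text element.

    Assumes no initiator or terminator (no framing) on the element.
    """
    if value is None:
        # Optional element with no occurrence: nothing is unparsed,
        # so alignment is not relevant and not applied.
        return bit_position
    # An occurrence exists: align first, independent of content length.
    bit_position += alignment_fill_bits(bit_position, alignment_bits)
    # Then write the content, which may be zero bits long.
    return bit_position + 8 * len(value.encode("utf-8"))
```

With field 1 ending at bit 3, unparse_text_element(3, "") returns 8 (5 fill
bits, no content), while unparse_text_element(3, None) stays at 3.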

Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology |
www.tresys.com
Please note: Contributions to the DFDL Workgroup's email discussions are
subject to the OGF Intellectual Property Policy
<http://www.ogf.org/About/abt_policies.php>


On Fri, Jan 22, 2016 at 4:35 AM, Steve Hanson <smh at uk.ibm.com> wrote:

> Hi Mike
>
> If we are unparsing an occurrence then yes we apply alignment properties
> before we output the content, even if the content is zero length. I
> emphasise "if we are unparsing an occurrence" because if an element is
> optional and there is no occurrence in the infoset we will not unparse
> anything, so the alignment is not relevant and not used.
>
> Regards
>
> Steve Hanson
> Architect, *IBM DFDL*
> <http://www.ibm.com/developerworks/library/se-dfdl/index.html>
> Co-Chair, *OGF DFDL Working Group* <http://www.ogf.org/dfdl/>
> *IBM Integration Bus*
> <http://www-03.ibm.com/software/products/en/ibm-integration-bus>,
> Hursley, UK
> *smh at uk.ibm.com* <smh at uk.ibm.com>
> tel:+44-1962-815848
> mob:+44-7717-378890
>
>
>
> From:        Mike Beckerle <mbeckerle.dfdl at gmail.com>
> To:        "dfdl-wg at ogf.org" <dfdl-wg at ogf.org>
> Date:        21/01/2016 22:37
> Subject:        [DFDL-WG] clarification needed - unparser - do we use
> mandatory text alignment for zero-length strings?
> Sent by:        "dfdl-wg" <dfdl-wg-bounces at ogf.org>
> ------------------------------
>
>
>
>
> Suppose field 1 is 3 bits long.
>
> Suppose field 2 is a string in utf-8. dfdl:emptyValueDelimiterPolicy='none'
>
> If field2 has any contents, then those characters must begin on a byte
> boundary, so before the first character we will skip 5 bits.
>
> What if field2 is of length 0 ? Do we skip 5 bits anyway or not?
>
> Note that if dfdl:emptyValueDelimiterPolicy is "both", and an initiator or
> terminator is defined, then even if the length is 0 we will definitely
> unparse at least one character, so the 5 bits of alignment are needed in
> that case.
>
> But if there is no framing, and no content, do we align to the text
> mandatory alignment or not?
>
> Now, I believe the answer to this should be "yes" we skip the 5 bits
> anyway. Because if field 2 is lengthKind='pattern', then when parsing, we
> have to scan for a match to the pattern, and so we have to know where to
> start the scan, which has to be at a byte boundary. So even if there is no
> match, so the length at parse time is zero, we still had to skip 5 bits
> before we could start scanning.
>
> So to be consistent with this, I believe unparsing must also output the 5
> bits regardless of whether the string is length 0 or not.
>
> So that's what I think it should do, but if all the constructs for this
> work in IBM DFDL, I'm also curious what it actually does.
>
> Thanks
>
> Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology |
> *www.tresys.com* <http://www.tresys.com/>
> Please note: Contributions to the DFDL Workgroup's email discussions are
> subject to the *OGF Intellectual Property Policy*
> <http://www.ogf.org/About/abt_policies.php>