[DFDL-WG] clarification needed - unparser - do we use mandatory text alignment for zero-length strings?

Steve Hanson smh at uk.ibm.com
Fri Jan 22 04:35:32 EST 2016


Hi Mike

If we are unparsing an occurrence then yes we apply alignment properties 
before we output the content, even if the content is zero length. I 
emphasise "if we are unparsing an occurrence" because if an element is 
optional and there is no occurrence in the infoset we will not unparse 
anything, so the alignment is not relevant and not used.
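
To make the rule concrete, here is a minimal sketch of it in Python. It is 
not taken from IBM DFDL or any other implementation: the function name, the 
use of None to mean "no occurrence in the infoset", and the default 8-bit 
text alignment are all assumptions made for illustration.

```python
from typing import Optional, Tuple


def unparse_text_element(bit_position: int,
                         value: Optional[str],
                         alignment_bits: int = 8) -> Tuple[int, int]:
    """Hypothetical helper: return (fill_bits, new_bit_position).

    value=None models an optional element with no occurrence in the
    infoset; value="" models an occurrence whose content is zero length.
    """
    if value is None:
        # No occurrence: nothing is unparsed, so alignment is not applied.
        return 0, bit_position
    # An occurrence exists: align first, even if the content is empty.
    fill = (-bit_position) % alignment_bits
    content_bits = len(value.encode("utf-8")) * 8
    return fill, bit_position + fill + content_bits
```

With a 3-bit field already written, unparse_text_element(3, "") gives 
(5, 8), while unparse_text_element(3, None) gives (0, 3).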

Regards
 
Steve Hanson
Architect, IBM DFDL
Co-Chair, OGF DFDL Working Group
IBM Integration Bus, Hursley, UK
smh at uk.ibm.com
tel:+44-1962-815848
mob:+44-7717-378890



From:   Mike Beckerle <mbeckerle.dfdl at gmail.com>
To:     "dfdl-wg at ogf.org" <dfdl-wg at ogf.org>
Date:   21/01/2016 22:37
Subject:        [DFDL-WG] clarification needed - unparser - do we use 
mandatory text alignment for zero-length strings?
Sent by:        "dfdl-wg" <dfdl-wg-bounces at ogf.org>




Suppose field 1 is 3 bits long. 

Suppose field 2 is a string in utf-8. 
dfdl:emptyValueDelimiterPolicy='none'

If field2 has any contents, then those characters must begin on a byte 
boundary, so before the first character we will skip 5 bits.
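
The 5-bit figure is just the distance from bit position 3 to the next 
multiple of 8. A small Python sketch of that arithmetic, with a hypothetical 
function name and an assumed 8-bit mandatory text alignment for utf-8:

```python
def alignment_fill_bits(bit_position: int, alignment_bits: int = 8) -> int:
    """Bits to skip so that bit_position lands on an alignment boundary."""
    # Negating before the modulo yields 0 when already aligned,
    # rather than a full alignment unit.
    return (-bit_position) % alignment_bits


# After a 3-bit field, the next byte boundary is 5 bits away.
print(alignment_fill_bits(3))
```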

What if field2 is of length 0 ? Do we skip 5 bits anyway or not?

Note that if dfdl:emptyValueDelimiterPolicy is "both", and an initiator or 
terminator is defined, then even if the length is 0 we will definitely 
unparse at least one character, so the 5 bits of alignment are needed in 
that case.

But if there is no framing, and no content, do we align to the text 
mandatory alignment or not?

Now, I believe the answer to this should be "yes": we skip the 5 bits 
anyway. If field 2 is lengthKind='pattern', then when parsing we have to 
scan for a match to the pattern, so we have to know where to start the 
scan, and that has to be at a byte boundary. So even if there is no match, 
and the length at parse time is therefore zero, we still had to skip 5 
bits before we could start scanning.

So to be consistent with this, I believe unparsing must also output the 5 
bits regardless of whether the string is length 0 or not. 

So that is the question of what it should do. Separately, if all the 
constructs for this work in IBM DFDL, I'm curious what it actually does.

Thanks

Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology | 
www.tresys.com
Please note: Contributions to the DFDL Workgroup's email discussions are 
subject to the OGF Intellectual Property Policy
--
  dfdl-wg mailing list
  dfdl-wg at ogf.org
  https://www.ogf.org/mailman/listinfo/dfdl-wg

Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 
741598. 
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU