[DFDL-WG] Action 242 - propose simplification for length units in dfdl:valueLength and dfdl:contentLength functions

Steve Hanson smh at uk.ibm.com
Mon Apr 28 06:29:40 EDT 2014


Mike

I don't think you can always ensure the result is well-defined. 

The element (first argument) is silent about its lengthUnits when that 
property is not applicable to its lengthKind, specifically 'pattern', 
'delimited', 'implicit' (complex) and 'endOfParent'.  For the first three 
we need to allow lengthUnits 'characters' to be provided as the second 
argument. 

Note that for 'endOfParent' the length can't be calculated, so should it 
be an SDE if the function is called against that element?

It should be an SDE if lengthUnits 'characters' is provided as the second 
argument and the representation is 'binary' or the type is hexBinary.
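
A minimal Python sketch of how the two SDE checks above might combine. 
The function name, its parameters and the exception class are 
hypothetical, chosen only for illustration; they are not taken from the 
DFDL specification or from any implementation, and the 'endOfParent' rule 
is still an open question.

```python
class SchemaDefinitionError(Exception):
    """Stand-in for a DFDL schema definition error (SDE)."""


def check_requested_units(length_kind, representation, xsd_type, requested_units):
    """Validate the second argument of valueLength/contentLength (hypothetical)."""
    if length_kind == "endOfParent":
        # The length cannot be calculated for this lengthKind.
        raise SchemaDefinitionError("length not calculable for endOfParent")
    if requested_units == "characters" and (
        representation == "binary" or xsd_type == "hexBinary"
    ):
        # Character units make no sense for binary data.
        raise SchemaDefinitionError("'characters' requested for binary data")
    # For 'pattern', 'delimited' and 'implicit' (complex) the element has no
    # applicable lengthUnits, so 'characters' is accepted here.
    return requested_units
```

For example, check_requested_units("delimited", "text", "string", 
"characters") is accepted, while the same call with representation 
"binary" raises the stand-in SDE.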

Regards
 
Steve Hanson
Architect, IBM DFDL
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh at uk.ibm.com
tel:+44-1962-815848



From:   Mike Beckerle <mbeckerle.dfdl at gmail.com>
To:     "dfdl-wg at ogf.org" <dfdl-wg at ogf.org>, 
Date:   24/04/2014 23:41
Subject:        [DFDL-WG] Action 242 - propose simplification for length 
units in dfdl:valueLength and dfdl:contentLength functions
Sent by:        dfdl-wg-bounces at ogf.org



These functions currently take the desired result units as the second 
argument.

If the length units of the element are bytes, it is very complicated to 
ask for this length to be recast in characters, because of the 
fragment-of-a-character-on-the-end problem. In some sense there is a 
quotient and a remainder here, which means there are multiple operations, 
not just one: the equivalent of floor (as many characters as fit, then 
fillByte), ceiling (every byte filled from some character), and remainder 
(how many bytes are left over). I think this is silly unless we have use 
cases driving it.

I believe this restriction matches all use-cases I can think of:

Restriction: If the element (first argument) to either valueLength or 
contentLength has length units of

characters -> can ask for valueLength or contentLength in any of 
characters, bytes, or bits
bytes -> can ask for valueLength or contentLength in bytes or bits
bits -> can ask for valueLength or contentLength only in bits.

and it is an SDE otherwise. 

This ensures that the result of these functions is always well-defined.
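
The restriction above can be sketched as a lookup table in Python. The 
names ALLOWED_RESULT_UNITS, check_result_units and SchemaDefinitionError 
are hypothetical, made up for this illustration and not part of any DFDL 
implementation.

```python
class SchemaDefinitionError(Exception):
    """Stand-in for a DFDL schema definition error (SDE)."""


# Element's lengthUnits -> units that may be requested as the second argument.
ALLOWED_RESULT_UNITS = {
    "characters": {"characters", "bytes", "bits"},
    "bytes": {"bytes", "bits"},
    "bits": {"bits"},
}


def check_result_units(element_units, requested_units):
    """Return requested_units if the restriction permits it, else raise an SDE."""
    if requested_units not in ALLOWED_RESULT_UNITS[element_units]:
        raise SchemaDefinitionError(
            f"cannot express a length in {element_units} as {requested_units}"
        )
    return requested_units
```

Under this table a request only ever moves toward a finer unit 
(characters to bytes to bits), so no rounding decision is needed.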

I understand the case where a box is sized in characters and I need to 
know its size in bytes, but I can't see a use case for the opposite, and 
it isn't even crisply defined. 
Ditto for length in bits. If the box is sized in bytes I might want that 
in bits (so * 8), but if the box is sized in bits and I ask for bytes, 
what do I want if there is a fragment of a byte? Ceiling? Floor? 
Remainder?
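
To make the ambiguity concrete, here is a small Python sketch that 
computes all three candidate answers for a bit length recast in bytes. 
The function name bits_as_bytes and the result keys are made up for 
illustration only.

```python
def bits_as_bytes(length_in_bits):
    """Return the three candidate answers for a bit length recast in bytes."""
    whole_bytes, leftover_bits = divmod(length_in_bits, 8)
    return {
        "floor": whole_bytes,  # complete bytes only
        "ceiling": whole_bytes + (1 if leftover_bits else 0),  # bytes touched
        "remainder_bits": leftover_bits,  # bits in the trailing fragment
    }
```

A 13-bit box gives floor 1, ceiling 2 and 5 leftover bits, so a single 
"length in bytes" has no obvious value. Going the other way, a 2-byte box 
is always exactly 16 bits.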

Thoughts?


Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology | 
www.tresys.com
Please note: Contributions to the DFDL Workgroup's email discussions are 
subject to the OGF Intellectual Property Policy