[DFDL-WG] How to determine the length of an element which has text representation

DFDL mbeckerle.dfdl at gmail.com
Wed Nov 18 19:52:23 CST 2009


I support Tim's view here. There needs to be an idiomatic way to switch
off scanning. Setting dfdl:representation='binary' for that purpose is
much too obscure.

Question: which other lengthKinds should switch off scanning? Prefixed?
Implicit? None of these?

...mikeb


On Nov 18, 2009, at 12:05 PM, Tim Kimber <KIMBERT at uk.ibm.com> wrote:

>
> I'd like to record what was discussed and raise another point, which
> Alan pointed out after the meeting.
>
> Discussions in the meeting
> - dfdl:lengthKind applies only to the element on which it is  
> specified. It has no effect whatever on the parsing of child  
> elements/groups.
> - there may be some value in tolerating simple elements of type  
> xs:string with dfdl:representation="binary". Might be useful for  
> schemas where dfdl:representation="binary" throughout.
> - Currently, the position of the WG is that parsers should *always*
> scan to extract the text representation if there is any terminating
> markup in scope, even if lengthKind='explicit'.
> - TK proposed the scheme outlined in his previous email, in which  
> dfdl:lengthKind alone specifies how the parser should extract the  
> text representation.
> If lengthKind="explicit", scanning is switched off and dfdl:length  
> is used. If lengthKind="delimited" the text rep is extracted by  
> scanning and length is ignored.
> - A refinement was discussed whereby dfdl:length would be checked  
> after a scan has been performed if dfdl:lengthKind="delimited". This  
> would make the modeling of some common formats simpler, and avoid  
> the need for a dfdl:assert to enforce the length constraint.
> - MB raised the possibility that we could actually disallow  
> dfdl:length if lengthKind='delimited'. This is the most conservative  
> position, but general opinion was that it would be too restrictive.  
> There still might be some value in disallowing dfdl:length for other  
> lengthKinds.
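
A minimal sketch of the scheme TK proposed above, written in Python
rather than DFDL. The function name extract_text, its parameters, and
the check_length flag are hypothetical and come from neither the spec
nor any implementation. The flag models the refinement that checks
dfdl:length after a scan. Treating that check as an exact equality test
is an assumption.

```python
from typing import Optional


def extract_text(data: str, pos: int, length_kind: str,
                 length: Optional[int] = None,
                 terminator: Optional[str] = None,
                 check_length: bool = False) -> str:
    """Hypothetical helper: extract one text element starting at pos."""
    if length_kind == "explicit":
        # Scanning is switched off: only the length decides the extent,
        # even when a terminator is in scope.
        if length is None:
            raise ValueError("lengthKind='explicit' requires a length")
        if pos + length > len(data):
            raise ValueError("not enough data for the given length")
        return data[pos:pos + length]

    if length_kind == "delimited":
        # The text is found by scanning for the terminator.
        if not terminator:
            raise ValueError("lengthKind='delimited' requires a terminator")
        end = data.find(terminator, pos)
        if end == -1:
            raise ValueError("terminator not found")
        text = data[pos:end]
        # Refinement (assumed to be an equality check): validate the
        # scanned text against the length instead of ignoring it.
        if check_length and length is not None and len(text) != length:
            raise ValueError("scanned text does not match the length")
        return text

    raise ValueError("lengthKind not covered by this sketch: " + length_kind)
```

With the data "AB,CD;", an explicit length of 5 yields "AB,CD" even
though "," is passed as a terminator, while a delimited element
terminated by "," yields "AB".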
>
> Discussions after the meeting
> - Alan pointed out that lengthKind="explicit" does not necessarily
> mean that the length of the field is fixed. dfdl:length might be
> specified as a DFDL expression. A common reason for doing that would
> be to obtain the element's length from an earlier integer field. As
> currently specified, if there is any markup in scope, the text
> representation would still be extracted by scanning.
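
To illustrate Alan's point, here is a small Python sketch. It assumes a
made-up layout: a two-digit text length field followed by the element
it describes, with a terminator in scope. The function name and the
scan_when_markup_in_scope flag are hypothetical. The flag only
contrasts the current WG position (always scan) with the proposed
behaviour (use the length).

```python
from typing import Tuple


def parse_with_length_field(data: str, terminator: str,
                            scan_when_markup_in_scope: bool) -> Tuple[int, str]:
    """Hypothetical helper: parse a 2-digit length field, then the text."""
    length = int(data[:2])  # assumed two-digit decimal length field
    body_start = 2

    if scan_when_markup_in_scope:
        # Current position: a terminator is in scope, so the text is
        # extracted by scanning regardless of the length field.
        end = data.find(terminator, body_start)
        if end == -1:
            raise ValueError("terminator not found")
        return length, data[body_start:end]

    # Proposed behaviour: lengthKind='explicit' switches scanning off
    # and the length obtained from the earlier field is used.
    if body_start + length > len(data):
        raise ValueError("not enough data for the given length")
    return length, data[body_start:body_start + length]
```

For the data "05AB;CD;" with ";" as terminator, scanning returns "AB",
whereas using the length field returns "AB;CD". The two approaches
diverge whenever the element's content contains the terminator.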
>
> Restatement of my position after today's meeting:
> I'm now even more convinced that dfdl:lengthKind="explicit" should  
> switch off scanning. Here's why:
> a) The enumerations of lengthKind are explicit, implicit, prefixed,
> delimited, pattern, endOfParent. The presence of 'delimited' in
> that list means that in some users' minds, the other enumerations
> are going to be interpreted as *alternatives* to 'delimited'.
> b) As currently specified, if there's markup in scope, scanning
> cannot be switched off by any means, not even by setting
> lengthKind='explicit' AND obtaining dfdl:length from a previous
> integer field. I think that's very counter-intuitive.
>
> regards,
>
> Tim Kimber, Common Transformation Team,
> Hursley, UK
> Internet:  kimbert at uk.ibm.com
> Tel. 01962-816742
> Internal tel. 246742