[DFDL-WG] how to do mix of case sensitive and insensitive delimiters

Steve Hanson smh at uk.ibm.com
Thu Jun 20 08:50:58 EDT 2013


James

It's a balance. One of the design goals for DFDL was to keep the simple 
cases simple.  So forcing all users to understand and specify a host of 
properties that only exist to handle esoteric use cases when they have a 
straightforward use case is counter to that.  For example, in many years 
of modeling data I have only encountered one format where the encoding for 
an element value was different to the encoding of its initiator, and until 
your case below I had not encountered a use case where the case 
sensitivity of an element's initiator differed from that of its terminator. 
Such cases clearly exist, but they are not commonplace. That is why the WG 
adopted the design it did, with encoding-related properties applied on a 
per-object basis. 
 

Another example is that DFDL has a single separator property, and not a 
separator for sequences and a separator for arrays. This often forces you 
to wrap an array element in a sequence in order to carry the separator, 
but it reduces the overall number of properties, behaviours and 
interactions that need to be understood.
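
As a rough illustration of that wrapping, here is a sketch that assumes a 
record whose fields are comma-separated while the occurrences of a repeating 
item are semicolon-separated. The element names, separators and sample data 
are invented for this example, and the fragment assumes a base format 
supplies the remaining required properties (encoding and so on).

```xml
<!-- Hypothetical data: A1,x;y;z -->
<xs:sequence dfdl:separator="," dfdl:separatorPosition="infix">
  <xs:element name="id" type="xs:string" dfdl:lengthKind="delimited"/>
  <!-- The inner sequence exists only to carry the separator for the array -->
  <xs:sequence dfdl:separator=";" dfdl:separatorPosition="infix">
    <xs:element name="item" type="xs:string" maxOccurs="unbounded"
                dfdl:lengthKind="delimited"
                dfdl:occursCountKind="implicit"/>
  </xs:sequence>
</xs:sequence>
```

Without the inner sequence, the occurrences of item would be separated by 
the same "," as the other children of the outer sequence.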

For common use cases, we have allowed extra properties. For example, there 
are separate justification and padCharacter properties for each simple 
type category, because it is very common to have a mix of such types in 
data, and the justification and pad character invariably differ from one 
type to the next. 
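
A minimal sketch of what that looks like in a format definition, assuming 
left-justified, space-padded strings and right-justified, zero-padded 
numbers (the particular values are assumptions chosen for illustration):

```xml
<dfdl:format textPadKind="padChar" textTrimKind="padChar"
             textStringJustification="left"
             textStringPadCharacter="%SP;"
             textNumberJustification="right"
             textNumberPadCharacter="0"/>
<!-- textBooleanJustification / textBooleanPadCharacter and
     textCalendarJustification / textCalendarPadCharacter follow
     the same pattern for the other simple type categories -->
```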

Regards

Steve Hanson
Architect, IBM Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh at uk.ibm.com
tel:+44-1962-815848



From:   "Garriss Jr., James P." <jgarriss at mitre.org>
To:     "dfdl-wg at ogf.org" <dfdl-wg at ogf.org>, 
Date:   20/06/2013 12:50
Subject:        Re: [DFDL-WG] how to do mix of case sensitive and 
insensitive delimiters
Sent by:        dfdl-wg-bounces at ogf.org



So which is worse, Steve, a bunch of extra properties in 
GeneralPurposeFormat, or making your schema designers resort to 
workarounds like this:
 
        <xs:sequence dfdl:separator="" dfdl:terminator="end"
                     dfdl:ignoreCase="no">
                <xs:element name="weird" type="xs:string"
                            dfdl:initiator="start" dfdl:ignoreCase="yes"
                            dfdl:lengthKind="delimited"/>
        </xs:sequence>
 
Or this:
 
- wrap the element in a group 
- set dfdl:ignoreCase to 'no' on the group. 
- set dfdl:ignoreCase to 'yes' on the element. 
- put the terminator on the group and the initiator on the element.
 
Do you really want your schema designers to have to do things like this? 
It seems to me that a good design principle is to make life simpler for 
your users, not harder.
 
 
From: Steve Hanson [mailto:smh at uk.ibm.com] 
Sent: Thursday, June 20, 2013 7:33 AM
To: Garriss Jr., James P.
Cc: dfdl-wg at ogf.org; dfdl-wg-bounces at ogf.org; Tim Kimber
Subject: Re: [DFDL-WG] how to do mix of case sensitive and insensitive 
delimiters
 
If we adopt that approach we end up with an explosion of properties. It's 
not just ignoreCase. There's encoding and its related properties too. We 
decided that encoding-related properties apply per object, not per 
delimiter. 

Regards

Steve Hanson
Architect, IBM Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh at uk.ibm.com
tel:+44-1962-815848 



From:        "Garriss Jr., James P." <jgarriss at mitre.org> 
To:        Tim Kimber/UK/IBM at IBMGB, "dfdl-wg at ogf.org" <dfdl-wg at ogf.org>, 
Date:        20/06/2013 12:21 
Subject:        Re: [DFDL-WG] how to do mix of case sensitive and 
insensitive delimiters 
Sent by:        dfdl-wg-bounces at ogf.org 




All the various solutions proposed seem so convoluted.  Why not do the 
simple and obvious?  Something like: 
  
dfdl:ignoreInitiatorCase="yes"  dfdl:ignoreTerminatorCase="no" 
dfdl:ignoreSeparatorCase="no"  dfdl:ignoreElementCase="yes" 
  
From: dfdl-wg-bounces at ogf.org [mailto:dfdl-wg-bounces at ogf.org] On Behalf 
Of Tim Kimber
Sent: Thursday, June 20, 2013 5:34 AM
To: dfdl-wg at ogf.org
Subject: Re: [DFDL-WG] how to do mix of case sensitive and insensitive 
delimiters 
  
The other way to do this is 
- wrap the element in a group 
- set dfdl:ignoreCase to 'no' on the group. 
- set dfdl:ignoreCase to 'yes' on the element. 
- put the terminator on the group and the initiator on the element. 

Or the other way round if it works better that way. 
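
A sketch of the "other way round" variant, with the case-insensitive 
initiator on the group and the case-sensitive terminator on the element. 
The element name and the delimiter values "start" and "end" are 
placeholders assumed for illustration.

```xml
<!-- Group carries the initiator, matched ignoring case -->
<xs:sequence dfdl:separator="" dfdl:initiator="start"
             dfdl:ignoreCase="yes">
  <!-- Element carries the terminator, matched with exact case -->
  <xs:element name="weird" type="xs:string"
              dfdl:terminator="end" dfdl:ignoreCase="no"
              dfdl:lengthKind="delimited"/>
</xs:sequence>
```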

regards,

Tim Kimber, DFDL Team,
Hursley, UK
Internet:  kimbert at uk.ibm.com
Tel. 01962-816742 
Internal tel. 37246742




From:        Mike Beckerle <mbeckerle.dfdl at gmail.com> 
To:        dfdl-wg at ogf.org, 
Date:        19/06/2013 23:42 
Subject:        [DFDL-WG] how to do mix of case sensitive and insensitive 
delimiters 
Sent by:        dfdl-wg-bounces at ogf.org 






I have a weird case where the initiator needs to be matched 
case-insensitively, but the terminator needs to be matched case-sensitively.

The only way I can think of to deal with this is to use the initiator as 
usual, but determine the length via lengthKind='pattern', with a lookahead 
in the pattern so that the value stops just before the terminator. 
An empty sequence with a case-sensitive terminator then follows to consume 
that part of the data stream. 
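
Roughly, that idea might be sketched as follows. The element name, the 
delimiter values "start" and "end", and the exact regular expression are 
assumptions made for illustration; whether a given DFDL processor accepts 
an empty sequence like this has not been checked.

```xml
<xs:sequence dfdl:separator="">
  <!-- Initiator matched ignoring case; the lookahead stops the value
       just before the literal, exact-case text "end" -->
  <xs:element name="weird" type="xs:string"
              dfdl:initiator="start" dfdl:ignoreCase="yes"
              dfdl:lengthKind="pattern"
              dfdl:lengthPattern="(?s).*?(?=end)"/>
  <!-- Empty sequence whose only job is to consume the terminator -->
  <xs:sequence dfdl:terminator="end" dfdl:ignoreCase="no"/>
</xs:sequence>
```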

-- 
Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology | 
www.tresys.com
--
dfdl-wg mailing list
dfdl-wg at ogf.org
https://www.ogf.org/mailman/listinfo/dfdl-wg 

Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 
741598. 
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
