[DFDL-WG] how to do mix of case sensitive and insensitive delimiters
Steve Hanson
smh at uk.ibm.com
Thu Jun 20 08:50:58 EDT 2013
James
It's a balance. One of the design goals for DFDL was to keep the simple
cases simple. So forcing all users to understand and specify a host of
properties that only exist to handle esoteric use cases when they have a
straightforward use case is counter to that. For example, in many years
of modeling data I have only encountered one format where the encoding for
an element value was different to the encoding of its initiator, and until
your case below I had not encountered a use case where the case
sensitivity of an element's initiator and terminator differed. Such
cases clearly exist, but they are not commonplace. That is why the WG
adopted the design it did, with encoding-related properties applying per
object.
Another example is that DFDL has a single separator property, and not a
separator for sequences and a separator for arrays. This often forces you
to wrap an array element in a sequence in order to carry the separator,
but it reduces the overall number of properties, behaviours and
interactions that need to be understood.
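To illustrate the separator point, here is a minimal sketch, assuming a record whose fields are comma-separated and which contains an array whose occurrences are semicolon-separated. The element names (record, id, item) are hypothetical. The fragment assumes the xs and dfdl prefixes are bound and that all other required properties come from an enclosing dfdl:format.

```xml
<xs:element name="record">
  <xs:complexType>
    <!-- Outer sequence: separates the fields of the record with "," -->
    <xs:sequence dfdl:separator="," dfdl:separatorPosition="infix">
      <xs:element name="id" type="xs:string" dfdl:lengthKind="delimited"/>
      <!-- Inner sequence exists only to carry the ";" separator for the array -->
      <xs:sequence dfdl:separator=";" dfdl:separatorPosition="infix">
        <xs:element name="item" type="xs:string" maxOccurs="unbounded"
                    dfdl:lengthKind="delimited"/>
      </xs:sequence>
    </xs:sequence>
  </xs:complexType>
</xs:element>
```

Under these assumptions, data such as 42,a;b;c would be modeled as one id followed by three item occurrences.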
For common use cases, we have allowed extra properties. For example, there
are separate justification and padCharacter properties for each simple
type category, because it is very common to have a mix of such types in
data, and the justification and pad character almost always differ from
one type to another.
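As a sketch of what the per-type properties look like in practice, the format annotation below pads strings on the right with spaces and numbers on the left with zeros. The chosen values are illustrative assumptions, not taken from any format discussed in this thread, and the fragment omits the other properties a complete dfdl:format needs.

```xml
<!-- Strings are left-justified and space-padded; numbers are
     right-justified and zero-padded. Values are illustrative only. -->
<dfdl:format textPadKind="padChar" textTrimKind="padChar"
             textStringJustification="left"  textStringPadCharacter="%SP;"
             textNumberJustification="right" textNumberPadCharacter="0"/>
```

The calendar and boolean type categories have their own corresponding pairs of properties.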
Regards
Steve Hanson
Architect, IBM Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh at uk.ibm.com
tel:+44-1962-815848
From: "Garriss Jr., James P." <jgarriss at mitre.org>
To: "dfdl-wg at ogf.org" <dfdl-wg at ogf.org>,
Date: 20/06/2013 12:50
Subject: Re: [DFDL-WG] how to do mix of case sensitive and
insensitive delimiters
Sent by: dfdl-wg-bounces at ogf.org
So which is worse, Steve, a bunch of extra properties in
GeneralPurposeFormat, or making your schema designers resort to
workarounds like this:
<xs:sequence dfdl:separator="" dfdl:terminator="end" dfdl:ignoreCase="no">
  <xs:element name="weird" type="xs:string" dfdl:initiator="start"
              dfdl:ignoreCase="yes" dfdl:lengthKind="delimited"/>
</xs:sequence>
Or this:
- wrap the element in a group
- set dfdl:ignoreCase to 'no' on the group.
- set dfdl:ignoreCase to 'yes' on the element.
- put the terminator on the group and the initiator on the element.
Do you really want your schema designers to have to do things like this?
It seems to me that a good design principle is to make life simpler for
your users, not harder.
From: Steve Hanson [mailto:smh at uk.ibm.com]
Sent: Thursday, June 20, 2013 7:33 AM
To: Garriss Jr., James P.
Cc: dfdl-wg at ogf.org; dfdl-wg-bounces at ogf.org; Tim Kimber
Subject: Re: [DFDL-WG] how to do mix of case sensitive and insensitive
delimiters
If we adopt that approach we end up with an explosion of properties. It's
not just ignoreCase. There's encoding and its related properties too. We
decided that encoding related properties applied per object, not per
delimiter.
Regards
Steve Hanson
From: "Garriss Jr., James P." <jgarriss at mitre.org>
To: Tim Kimber/UK/IBM at IBMGB, "dfdl-wg at ogf.org" <dfdl-wg at ogf.org>,
Date: 20/06/2013 12:21
Subject: Re: [DFDL-WG] how to do mix of case sensitive and
insensitive delimiters
Sent by: dfdl-wg-bounces at ogf.org
All the various solutions proposed seem so convoluted. Why not do the
simple and obvious? Something like:
dfdl:ignoreInitiatorCase="yes" dfdl:ignoreTerminatorCase="no"
dfdl:ignoreSeparatorCase="no" dfdl:ignoreElementCase="yes"
From: dfdl-wg-bounces at ogf.org [mailto:dfdl-wg-bounces at ogf.org] On Behalf
Of Tim Kimber
Sent: Thursday, June 20, 2013 5:34 AM
To: dfdl-wg at ogf.org
Subject: Re: [DFDL-WG] how to do mix of case sensitive and insensitive
delimiters
The other way to do this is
- wrap the element in a group
- set dfdl:ignoreCase to 'no' on the group.
- set dfdl:ignoreCase to 'yes' on the element.
- put the terminator on the group and the initiator on the element.
Or the other way round if it works better that way.
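
A minimal sketch of the "other way round" variant, assuming the delimiters are the literal strings "start" and "end" and using a hypothetical element name (weird). Here the case-insensitive initiator sits on the wrapping sequence and the case-sensitive terminator sits on the element. The fragment assumes the xs and dfdl prefixes are bound and that remaining required properties come from an enclosing dfdl:format.

```xml
<!-- The wrapping sequence carries the initiator, matched ignoring case -->
<xs:sequence dfdl:separator="" dfdl:initiator="start" dfdl:ignoreCase="yes">
  <!-- The element carries the terminator, matched case-sensitively -->
  <xs:element name="weird" type="xs:string"
              dfdl:terminator="end" dfdl:ignoreCase="no"
              dfdl:lengthKind="delimited"/>
</xs:sequence>
```

Because dfdl:ignoreCase is set once per schema object, giving the two delimiters different objects is what lets them have different case handling.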
regards,
Tim Kimber, DFDL Team,
Hursley, UK
Internet: kimbert at uk.ibm.com
Tel. 01962-816742
Internal tel. 37246742
From: Mike Beckerle <mbeckerle.dfdl at gmail.com>
To: dfdl-wg at ogf.org,
Date: 19/06/2013 23:42
Subject: [DFDL-WG] how to do mix of case sensitive and insensitive
delimiters
Sent by: dfdl-wg-bounces at ogf.org
I have a weird case where the initiator needs case-insensitive
matching, but the terminator needs case-sensitive matching.
The only way I can think of dealing with this is to use the initiator, but
handle the length via lengthKind='pattern' to grab the value, doing
lookahead so it will stop before the terminator.
Then add an empty sequence with a case-sensitive terminator to consume that
part of the data stream.
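
The approach described above might be sketched as follows. This is an illustration under assumptions, not a tested schema: the delimiters are taken to be the literal strings "start" and "end", the element name (weird) is hypothetical, and the regular expression is one possible way to stop before the terminator. The fragment assumes the xs and dfdl prefixes are bound and that remaining required properties come from an enclosing dfdl:format.

```xml
<xs:sequence dfdl:separator="">
  <!-- Initiator matched ignoring case; the value's length comes from a
       pattern whose lookahead stops just before the literal "end".
       The pattern itself is written case-sensitively. -->
  <xs:element name="weird" type="xs:string"
              dfdl:initiator="start" dfdl:ignoreCase="yes"
              dfdl:lengthKind="pattern"
              dfdl:lengthPattern=".*?(?=end)"/>
  <!-- Empty sequence whose only job is to consume the terminator,
       matched case-sensitively -->
  <xs:sequence dfdl:terminator="end" dfdl:ignoreCase="no"/>
</xs:sequence>
```

Note that "." does not match line terminators by default, so a value spanning multiple lines would need a different pattern.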
--
Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology |
www.tresys.com
--
dfdl-wg mailing list
dfdl-wg at ogf.org
https://www.ogf.org/mailman/listinfo/dfdl-wg
Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number
741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU