[DFDL-WG] choiceBranchKey - suggest change to List of DFDL String Literals ?

Steve Hanson smh at uk.ibm.com
Tue Jan 10 12:44:45 EST 2017


https://redmine.ogf.org/issues/326 created

Regards
 
Steve Hanson
IBM Hybrid Integration, Hursley, UK
Architect, IBM DFDL
Co-Chair, OGF DFDL Working Group
smh at uk.ibm.com
tel:+44-1962-815848
mob:+44-7717-378890



From:   Steve Hanson/UK/IBM
To:     Mike Beckerle <mbeckerle.dfdl at gmail.com>
Cc:     "dfdl-wg at ogf.org" <dfdl-wg at ogf.org>
Date:   13/12/2016 18:15
Subject:        Re: [DFDL-WG] choiceBranchKey - suggest change to List of 
DFDL String Literals ?


Yes, it gets ugly and inefficient once you have a large 'if' 
statement. 

An example of an IBM schema with a direct-dispatch choice is SWIFT, which 
has dozens of messages, but they all fall into 7 patterns, so the 'if' 
expression has 7 branches. I have not seen anything more complex than that 
so far.

A List of DFDL String Literals is a possibility. 

Regards
 
Steve Hanson




From:   Mike Beckerle <mbeckerle.dfdl at gmail.com>
To:     Steve Hanson/UK/IBM at IBMGB
Cc:     "dfdl-wg at ogf.org" <dfdl-wg at ogf.org>
Date:   13/12/2016 17:34
Subject:        Re: [DFDL-WG] choiceBranchKey - suggest change to List of 
DFDL String Literals ?



Yes, the situation is that the expressions are big, because there are 
hundreds of branches, and so the constants are remote from where the branch 
is expressed. It also ends up being quite inefficient. What one wants is a 
table-based lookup, not an expression like

if (../key eq "11" or ../key eq "22") then "11"
else if (../key eq "AA" or ../key eq "BB") then "AA"
... so on for dozens of cases.

That's just a linear chain again, so it is equivalent to using 
discriminators on the branches.
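
To make the contrast with a linear chain concrete, here is a minimal Python sketch of a table-based lookup. It assumes each branch declares its own list of keys, which is the idea under discussion in this thread and not something the thread confirms as implemented. The branch names, the keys, and the helper functions are all hypothetical. The per-branch lists are inverted once into a single table, so selecting a branch is one exact-match lookup however many branches exist.

```python
# Hypothetical sketch: per-branch key lists compiled into one lookup table.
# Branch names and keys are illustrative, not taken from any real schema.
BRANCH_KEYS = {
    "branchA": ["11", "22"],
    "branchB": ["AA", "BB"],
}


def build_dispatch_table(branch_keys):
    """Invert {branch: [keys]} into {key: branch}, rejecting duplicate keys."""
    table = {}
    for branch, keys in branch_keys.items():
        for key in keys:
            if key in table:
                raise ValueError(
                    f"key {key!r} is claimed by both {table[key]} and {branch}"
                )
            table[key] = branch
    return table


def select_branch(table, dispatch_key):
    """Pick a branch with one dictionary lookup (exact string match)."""
    try:
        return table[dispatch_key]
    except KeyError:
        raise LookupError(f"no branch for dispatch key {dispatch_key!r}") from None


DISPATCH = build_dispatch_table(BRANCH_KEYS)
```

With this table, select_branch(DISPATCH, "22") returns "branchA" without testing "11" first, and a key that appears on no branch raises LookupError instead of falling through a chain of comparisons.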

An interesting point expressed to us is that users would like the DFDL 
schema to not only work to parse and unparse the file, but for it to serve 
as a declarative expression of the data format - as the machine-readable 
documentation of the format ultimately.

While a big expression will work, it's not as declarative as listing the 
possible dispatch keys for each branch right on that branch. A big 
expression really fails to capture the essence of the format in a manner 
that enables efficient implementation. 


Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology | 
www.tresys.com
Please note: Contributions to the DFDL Workgroup's email discussions are 
subject to the OGF Intellectual Property Policy


On Tue, Dec 13, 2016 at 11:01 AM, Steve Hanson <smh at uk.ibm.com> wrote:
Hi Mike 

If I recall, choiceBranchKey was a single DFDL String Literal because the 
value to be matched against it is computed by a DFDL expression (from 
choiceDispatchKey), which can do things like lower casing strings or other 
manipulations.  Are you seeing use cases where multiple keys per branch 
are making the expression too complex?

Regards
 
Steve Hanson 



From:        Mike Beckerle <mbeckerle.dfdl at gmail.com> 
To:        "dfdl-wg at ogf.org" <dfdl-wg at ogf.org> 
Date:        12/12/2016 19:48 
Subject:        [DFDL-WG] choiceBranchKey - suggest change to List of 
DFDL String Literals ?
Sent by:        "dfdl-wg" <dfdl-wg-bounces at ogf.org> 




This is our first experience with requiring and implementing 
choiceBranchKey in Daffodil, and we're already dealing with the fact that 
the same branch often has multiple different keys, all of which indicate 
that it should be selected.

Right now, dfdl:choiceBranchKey is a single DFDL String Literal, so values 
like dfdl:choiceBranchKey="  " (that's two spaces) are legal and will work. 
If it were to become a whitespace-separated list, then such values would 
have to use DFDL Entities to express the whitespace.
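
As a rough illustration of what the whitespace-separated form would imply, the Python sketch below splits an attribute value on whitespace and then decodes DFDL entities in each item. It is a simplification under stated assumptions: only a small subset of entities is handled (%SP;, %HT;, %LF;, %CR;, %NUL;, the numeric forms %#x..; and %#..;, and the %% escape), and the function names are hypothetical.

```python
import re

# Simplified subset of DFDL named character entities, for illustration only.
NAMED = {"SP": " ", "HT": "\t", "LF": "\n", "CR": "\r", "NUL": "\x00"}

# Matches %% (escaped percent), %#xHH; (hex), %#DD; (decimal), or %NAME;.
ENTITY = re.compile(r"%(?:(%)|#x([0-9A-Fa-f]+);|#([0-9]+);|([A-Z0-9]+);)")


def decode_literal(literal):
    """Replace the supported entities in one string literal."""
    def repl(match):
        if match.group(1) is not None:
            return "%"
        if match.group(2) is not None:
            return chr(int(match.group(2), 16))
        if match.group(3) is not None:
            return chr(int(match.group(3)))
        name = match.group(4)
        if name not in NAMED:
            raise ValueError(f"entity %{name}; is not handled by this sketch")
        return NAMED[name]
    return ENTITY.sub(repl, literal)


def parse_branch_keys(attr_value):
    """Split on whitespace first, then decode entities in each item."""
    return [decode_literal(item) for item in attr_value.split()]
```

Under these assumptions, parse_branch_keys("%SP;%SP; 11 22") yields three keys: a two-space string, "11", and "22". A raw two-space value, by contrast, would split into no keys at all, which is why the entity form becomes necessary.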

I know I have seen COBOL data where there were multiple tags that denote 
the same structure.

I am curious what others' experience with choiceBranchKey has been, and 
how this issue was handled.

I believe from prior emails that IBM has already implemented this in its 
DFDL implementation.

There is an erratum specifying that choiceBranchKey comparison is supposed 
to be case-sensitive (for performance reasons), but nothing heretofore 
about it using DFDL Entities to represent whitespace. 


Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology | 
www.tresys.com 
Please note: Contributions to the DFDL Workgroup's email discussions are 
subject to the OGF Intellectual Property Policy 
--
 dfdl-wg mailing list
 dfdl-wg at ogf.org
 https://www.ogf.org/mailman/listinfo/dfdl-wg 


Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 
741598. 
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU


