[DFDL-WG] Part 2 - Re: Action 307 - Demonstrate implementation interoperability

Steve Hanson smh at uk.ibm.com
Tue Oct 9 07:25:41 EDT 2018


Mike, responses in-line below.

Regards
 
Steve Hanson
IBM Hybrid Integration, Hursley, UK
Architect, IBM DFDL
Co-Chair, OGF DFDL Working Group
smh at uk.ibm.com
tel:+44-1962-815848
mob:+44-7717-378890
Note: I work Tuesday to Friday 



From:   Mike Beckerle <mbeckerle.dfdl at gmail.com>
To:     Steve Hanson <smh at uk.ibm.com>
Cc:     DFDL-WG <dfdl-wg at ogf.org>
Date:   04/10/2018 00:50
Subject:        Part 2 - Re: Action 307 - Demonstrate implementation 
interoperability




Based on Daffodil JIRA ticket backlog, and documentation at 
https://daffodil.apache.org/unsupported/, below are DFDL non-core features 
that are not supported by Daffodil, but that seem to be supported by IBM 
DFDL (based on my not finding anything that says they aren't implemented), 
and so are possibly in use in DFDL schemas we will need to use for 
interoperability testing.

Please advise if IBM DFDL does *not* implement any of these. 

* default, fixed - for defaulting values at parse time - Daffodil support 
for this is partial at parse time, unsupported at unparse time. The fixed 
attribute isn't supported at all. IBM DFDL does not support default/fixed 
when parsing (see other thread).
* unordered sequences.
* byte-value entities - in contexts other than fillByte
* ICU symbols 'u' and 'I' in calendarPattern
* binaryFloatRep 'ibm390Hex'
* documentFinalTerminatorCanBeMissing
* textStandardBase - with value not equal to 10
* lengthKind 'prefixed', and prefixIncludesPrefixLength, prefixLengthType 
- Note IBM restricts prefixLengthType to a type that itself cannot be 
prefixed. Correct.
* assert with failure type 'recoverableError'
* calendarObserveDST
* calendarCenturyStart
* textNumberPattern 'V' and 'P' symbols
* CCSID for specifying dfdl:encoding
* nilKind 'literalValue' for binary data
* choiceLengthKind 'explicit' and choiceLength
* separatorSuppressionPolicy - behaviors for these in Daffodil are known 
to be both non-standard currently and also different from IBM DFDL. This 
needs correcting. IBM DFDL does not support 'trailingEmptyStrict'.

The above list (after review/correction) needs to be crossed with the 
published DFDL schemas on github that were published by IBM. The features 
required to run those DFDL schemas are required for Daffodil to implement 
before the interoperability demonstration. 
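
A rough sketch of how that cross-check might be automated, assuming the 
schemas have been cloned locally and are read in as text. The pattern table 
is hypothetical and deliberately partial: it covers only entries from the 
list above that can be recognised from a property value, and it only sees 
properties written in attribute or short form, not element form 
(dfdl:property). The names value_pattern, FEATURE_PATTERNS and 
features_used are illustrative, not part of either implementation.

```python
import re

Q = r"""["']"""  # either quote style around an attribute value


def value_pattern(prop, value):
    """Regex for prop="value" in attribute or short form (dfdl:prop="value")."""
    return rf"\b{prop}\s*=\s*{Q}{value}{Q}"


# Hypothetical, partial table: feature label -> regex that suggests its use.
FEATURE_PATTERNS = {
    "lengthKind 'prefixed'": value_pattern("lengthKind", "prefixed"),
    "binaryFloatRep 'ibm390Hex'": value_pattern("binaryFloatRep", "ibm390Hex"),
    "documentFinalTerminatorCanBeMissing 'yes'": value_pattern(
        "documentFinalTerminatorCanBeMissing", "yes"
    ),
    "choiceLengthKind 'explicit'": value_pattern("choiceLengthKind", "explicit"),
    "assert failureType 'recoverableError'": value_pattern(
        "failureType", "recoverableError"
    ),
    # Any textStandardBase whose value is not exactly 10.
    "textStandardBase other than 10": rf"\btextStandardBase\s*=\s*{Q}(?!10{Q})",
}


def features_used(schema_text):
    """Return the sorted labels of listed features that appear in the schema text."""
    return sorted(
        label
        for label, pattern in FEATURE_PATTERNS.items()
        if re.search(pattern, schema_text)
    )


if __name__ == "__main__":
    sample = '<xs:element name="n" dfdl:lengthKind="prefixed" dfdl:textStandardBase="16"/>'
    for label in features_used(sample):
        print(label)
```

A text scan like this can only shortlist schemas for closer inspection; 
properties inherited through dfdl:ref or set via expressions would need a 
real DFDL processor to resolve.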


Below are features of DFDL I believe neither IBM DFDL nor Daffodil 
implement, and as they are non-core, they need not be implemented by 
either for interoperability testing:

* lengthKind 'endOfParent'
* nilKind 'logicalValue'  IBM DFDL implements this.
* occursCountKind 'stopValue' (and occursStopValue)
* textBiDi - and other related biDi properties
* useNilForDefault IBM DFDL implements this
* floating
* fn:exactlyOne function
* fn:namespace-URI() function
* dfdl:escapeCharacterPolicy 'delimiters' - daffodil doesn't implement 
this property at all.


Below are features of Daffodil that are not implemented by IBM DFDL and so 
cannot be used in schemas created using Daffodil that intend to be 
interoperable. These are either easy to work around, or impossible to work 
around, so are not a big deal, they just have to be kept in mind if 
considering a DFDL schema for use in interoperability testing. This 
includes schemas published on github, for image formats, CSV, etc.,  and a 
number of the FOUO schemas published on DI2E.net/forge.mil - some of those 
quite possibly can work with IBM DFDL, and if they can do so, they should 
be modified so that they can be included in the interoperability testing. 

The list below mostly comes from 
https://www.ibm.com/support/knowledgecenter/en/SSMKHH_10.0.0/com.ibm.etools.mft.doc/df00150_.htm

* calendarTimeZone specified as "" (empty string)  - This is the most 
problematic one, so I've put it first. The predefined DFDL named format 
that is supplied with Daffodil and used as a starting point by most 
schemas has calendarTimeZone="". This is because customers didn't like 
that their datetimes were all being appended with "+00:00" for UTC time 
zone (in the infoset) when the data simply didn't specify a time zone. 
Schemas intended for interoperability testing should specify 'UTC' for 
this property.
* calendarTimeZone specified as an Olson format time zone
* inputValueCalc, outputValueCalc
* hiddenGroupRef
* Asserts and discriminators on simple type definitions or global element 
definitions
* fn:concat with more than 4 arguments
* non-8-bit charset encodings
* bitOrder not mostSignificantBitFirst
* '@' in textNumberPattern (TBD: unsure if Daffodil has this)
* "_" in calendarLanguage
* calendarLanguage an expression
* assert & discriminator messages an expression 
* binaryBooleanTrueRep as "" (empty string) 
* checks for binary packed numbers with length units 'bits' and not a 
multiple of 4 length, and similarly for alignmentUnits bits and alignment 
not multiple of 4. (relevant to negative tests only)
* lengthKind 'implicit' complex elements inside lengthKind 'delimited' 
complex elements.
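
For the calendarTimeZone item at the top of this list, one possible way to 
prepare an interoperability copy of a schema is a plain text substitution. 
This is only a sketch: use_utc_time_zone is a hypothetical helper, it 
handles only the attribute and short forms of the property, and the 
rewritten schema will produce infosets with an explicit UTC time zone where 
the original produced none.

```python
import re

# calendarTimeZone= followed by an empty quoted string, in either quote style.
_EMPTY_TZ = re.compile(r"""(\bcalendarTimeZone\s*=\s*)(["'])\2""")


def use_utc_time_zone(schema_text):
    """Replace calendarTimeZone="" with calendarTimeZone="UTC"; leave other values alone."""
    return _EMPTY_TZ.sub(r"\1\2UTC\2", schema_text)


if __name__ == "__main__":
    print(use_utc_time_zone('<dfdl:format calendarTimeZone=""/>'))
```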

Additionally IBM DFDL contains these bugs in its expression processing:
1) Path locations are not correctly validated. Specifically, array 
elements without predicates and references into other choice branches are 
not flagged as errors.
2) In DFDL expression functions, the namespace prefixes for 
http://www.w3.org/2001/XMLSchema and 
http://www.w3.org/2005/xpath-functions must be 'xs' and 'fn' 
respectively, even if not declared.
3) In DFDL expressions, namespace prefixes in paths are ignored and 
matching is against element name only. 
 
For interoperability testing therefore:
- For 1) avoid both constructs (array elements without predicates, and 
references into other choice branches)
- For 2) always declare xmlns:xs and xmlns:fn and always use those 
prefixes in expressions
- For 3) avoid sibling elements that have same name but different 
namespace; use elementFormDefault="unqualified" to avoid namespaces for 
local elements altogether 
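
The workarounds for 2) and 3) lend themselves to a simple pre-flight check 
on a schema. The sketch below is an assumption about how such a check could 
look, not a tool that exists in either implementation; interop_warnings is 
a hypothetical name. It records the first declaration of each prefix 
anywhere in the document, so a prefix re-declared on a nested element would 
not be noticed, and it cannot detect same-named siblings in different 
namespaces.

```python
import io
import xml.etree.ElementTree as ET

XS_NS = "http://www.w3.org/2001/XMLSchema"
FN_NS = "http://www.w3.org/2005/xpath-functions"


def interop_warnings(schema_text):
    """Return warnings about prefix declarations and elementFormDefault."""
    declared = {}  # prefix -> URI, first declaration wins
    root = None
    source = io.BytesIO(schema_text.encode("utf-8"))
    for event, payload in ET.iterparse(source, events=("start-ns", "start")):
        if event == "start-ns":
            prefix, uri = payload
            declared.setdefault(prefix, uri)
        elif root is None:
            root = payload  # first "start" event is the document element

    warnings = []
    if declared.get("xs") != XS_NS:
        warnings.append("prefix 'xs' is not bound to " + XS_NS)
    if declared.get("fn") != FN_NS:
        warnings.append("prefix 'fn' is not bound to " + FN_NS)
    # An absent elementFormDefault means "unqualified" in XML Schema.
    if root is not None and root.get("elementFormDefault", "unqualified") != "unqualified":
        warnings.append("elementFormDefault is not 'unqualified'")
    return warnings


if __name__ == "__main__":
    schema = (
        '<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" '
        'elementFormDefault="qualified"/>'
    )
    for warning in interop_warnings(schema):
        print(warning)
```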
