[DFDL-WG] dfdl-wg Digest, Vol 47, Issue 1

Tim Kimber KIMBERT at uk.ibm.com
Thu Jul 1 10:02:06 CDT 2010


re: 6.

6 Remove timing from dfdl:assert. 
As timing property has been removed from discriminators it is inconsistent 
to still have it on assert. Agreed 
There was also a discussion about adding a message property to 
discriminators which was inconclusive so decided not to change. 

I remembered the reason why I thought this was a good idea.
Consider the situation where someone is generating their DFDL schema from 
meta-data. The model is large, and consists of many references to global 
structures. Each global structure ( e.g. an HL7 segment ) is identified in 
a particular way. Sometimes the segment is required, sometimes it is not. 
Sometimes it occurs as a child of a choice group, and sometimes not. 
Regardless, it is highly likely that the segment will be identified in the 
same way wherever it occurs. A natural decision for the modeler would be 
to create a dfdl:discriminator on all references to the segment, even if 
the ref is not under a point of uncertainty. It's harmless, and it carries 
no performance penalty. If we disallow the "message" attribute on 
discriminators, it will force the modeler to put in extra logic to work out 
whether the ref is under a point of uncertainty, and generate an 
assert/discriminator as appropriate.
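
The extra generator logic described above might look like the following minimal sketch. It assumes a hypothetical schema generator that already knows, for each segment reference, whether the reference sits under a point of uncertainty. The function name, its parameters and the sample test expression are illustrative assumptions, not part of any DFDL tooling.

```python
from xml.sax.saxutils import quoteattr


def annotation_for_ref(test_expr, message, under_point_of_uncertainty):
    """Return the annotation a generator would emit for one segment reference.

    Hypothetical helper: it models the case where dfdl:discriminator has no
    'message' attribute, so the annotation must be chosen per reference.
    """
    if under_point_of_uncertainty:
        # A discriminator resolves the point of uncertainty; no message allowed.
        return "<dfdl:discriminator test=%s/>" % quoteattr(test_expr)
    # Elsewhere an assert does the job and can carry the message.
    return "<dfdl:assert test=%s message=%s/>" % (
        quoteattr(test_expr),
        quoteattr(message),
    )


print(annotation_for_ref("{ ./id eq 'PID' }", "Expected PID segment", False))
```

If discriminators accepted the same message attribute, the branch would disappear and the generator could emit one annotation form on every reference.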

I'd be interested to know what Steph thinks about this - I think I've 
heard her say that she sometimes uses discriminators where an assert would 
have done the job, just to maintain consistency throughout the model.

regards,

Tim Kimber, Common Transformation Team,
Hursley, UK
Internet:  kimbert at uk.ibm.com
Tel. 01962-816742 
Internal tel. 246742




From: dfdl-wg-request at ogf.org
To: dfdl-wg at ogf.org
Date: 01/07/2010 14:47
Subject: dfdl-wg Digest, Vol 47, Issue 1



Today's Topics:

   1. Minute for OGF DFDL Working Group Call, June 30-2010 (Alan Powell)


----- Message from Alan Powell <alan_powell at uk.ibm.com> on Thu, 1 Jul 2010 14:47:07 +0100 -----
To: dfdl-wg at ogf.org
Subject: [DFDL-WG] Minute for OGF DFDL Working Group Call, June 30-2010



Open Grid Forum: Data Format Description Language Working Group

OGF DFDL Working Group Call, June 30-2010

Attendees 
Steve Hanson (IBM) 
Alan Powell (IBM) 
Stephanie Fetzer (IBM) 
Tim Kimber (IBM) 

Apologies 
Suman Kalia (IBM) 
Mike Beckerle (Oco) 

1. Current Actions 
Updated below 

2 Nils and Defaults. 
Reviewed Alan's latest draft. 
Agreed with definition of 'required parent context' 
Preferred the combining of initiated and not initiated rules. 
Would still like a separate definition of nillable 
Minor editorial changes.

3  DFDL property types and other issues. 
Not discussed

Tim has proposed more specific types for some properties. In particular 
separating the different kinds of entities. 
- Modify the meaning and usage of the type 'DFDL String Literal' and 
modify the remainder of the specification accordingly 
- Improve the description of DFDL entities to avoid confusion over the 
intended usage of raw byte values. 
- Clarify the standard sentence about forward references in DFDL 
expressions - the current text implied that the restrictions only applied 
to the unparser. 
1.1        DFDL Properties 
Properties on DFDL annotations may be one or more of the following types 

·        DFDL string literal
The property value is a string that describes a sequence of literal bytes 
and/or characters which appear in the data stream. 
·        List of DFDL string literals 
The property value is a space-separated list of DFDL string literals. When 
parsing, if more than one string literal in the list matches the portion 
of the data stream being evaluated then the longest matching string 
literal in the list must be used.
When unparsing, the first string literal in the list must be used. 
·        DFDL expression 
The property value is an XPath 2.0 expression that returns a value derived 
from other property values and/or from the DFDL 
infoset. DFDL expressions can be used to calculate property values, and to 
calculate logical values for simple elements.
·        DFDL regular expression 
The property value is a regular expression that can be used as a pattern 
to calculate the length of an element by applying that pattern 
to the sequence of literal bytes or characters which appear in the data 
stream. 
·        Enum
The property value is a string literal that must be one of the allowed 
values listed in the property description.  
·        QName
The property value is a string literal that also conforms to an XML 
Qualified Name as specified in “Namespaces in XML”. 
·        Simple Type
The property value is a string that describes a logical value. The type of 
the logical value is one of the XML Schema simple types in the DFDL 
allowed subset. 
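
The rule above for a list of DFDL string literals (longest match when parsing, first entry when unparsing) can be sketched as follows. This is a minimal illustration that assumes the literals are plain strings containing no DFDL entities or character classes; the function names are hypothetical.

```python
def match_literal(literals, data, pos=0):
    """Parsing: return the longest literal in the list that matches data at pos.

    'literals' is the space-separated property value. Returns None if no
    literal matches at that position.
    """
    candidates = [lit for lit in literals.split() if data.startswith(lit, pos)]
    return max(candidates, key=len) if candidates else None


def literal_for_unparse(literals):
    """Unparsing: the first literal in the list is the one written."""
    return literals.split()[0]


print(match_literal("; ;; ,", "a;;b", 1))
print(literal_for_unparse("; ;; ,"))
```

Here both ";" and ";;" match at position 1, so the parser side picks ";;", while the unparser side always writes ";".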

   1.1.1.1         DFDL Entities in String Literals 
     1.1.1.1        Character classes in DFDL String literals 
     1.1.1.1        Raw byte values in DFDL String Literals 

4. using textStringPadCharacter with charRef '%#r' on multi-byte encoding 
Not discussed
textStringPadCharacter 
DFDL String literal 

The padding character or byte value that is used when justifying or 
trimming text elements. 

A pad character can be specified using DFDL entities. 
A pad byte value must be specified using the %#r entity. 

DFDL validation rules 
- if a pad byte value is specified when lengthUnits='characters' then the 
encoding must be a fixed-width encoding. 
- if a pad character is specified when lengthUnits='bytes' then the pad 
character must be a single-byte character. 

If a pad byte value is specified when lengthUnits='characters' then 
padding and trimming must be applied 
using an array of N pad byte values, where N is the width of a character 
in the fixed-width encoding. 

Annotation: dfdl:element, dfdl:simpleType
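
A minimal sketch of the last rule above, padding with an array of N pad byte values when lengthUnits='characters'. It assumes the pad byte has already been extracted from the %#r entity and that the caller knows the character width of the fixed-width encoding; the function names and parameters are hypothetical.

```python
def pad_unit(pad_byte, char_width):
    """Build one padding unit: N copies of the pad byte, where N is the
    width in bytes of a character in the fixed-width encoding."""
    if not 0 <= pad_byte <= 0xFF:
        raise ValueError("pad byte value must fit in one byte")
    return bytes([pad_byte]) * char_width


def pad_right(data, length_in_chars, pad_byte, char_width):
    """Append padding units until the field holds length_in_chars characters."""
    if len(data) % char_width:
        raise ValueError("data is not a whole number of characters")
    missing = length_in_chars - len(data) // char_width
    return data + pad_unit(pad_byte, char_width) * max(missing, 0)


# Two UTF-16BE characters padded to a four-character field with byte 0x00.
field = pad_right("AB".encode("utf-16-be"), 4, 0x00, 2)
print(len(field))
```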



Would adding fillByte to the dfdl:padChar enumeration make it clearer? 
padChar is a pad character (%#r not allowed) 
fillByte is a fill byte 

5. nilIndicatorPath and nilIndicatorIndex properties   
These properties seem a bit of an anomaly. Tim has suggested they can be 
simplified. 
Not discussed 


6 Remove timing from dfdl:assert. 
As timing property has been removed from discriminators it is inconsistent 
to still have it on assert. Agreed 
There was also a discussion about adding a message property to 
discriminators which was inconclusive so decided not to change. 

7. Splitting the specification in simpler sections. 
Concern has been raised that a full DFDL implementation is complicated and 
expensive. 
A number of reviewers have said that they might implement only part of the 
specification. 
It has been suggested that the WG should look to see if it is possible to 
split the specification into more manageable optional sections. 
Meeting closed, 17:00 

Next call  Wednesday  07 July  2010  15:00 UK  (10:00 ET) 
Next action: 100 
Actions raised at this meeting 
No   Action 
096  AP: using textStringPadCharacter with charRef '%#r' on multi-byte encoding 
097  nilIndicatorPath and nilIndicatorIndex properties 
099  Splitting the specification in simpler sections. 


Current Actions: 
No
Action 
066
Investigate format for defining test cases 
25/11: IBM to see if it is possible to publish its test case format. 
04/12: no update 
... 
17/02: IBM is willing in principle to publish the test case format and 
some of the test cases. May need some time to build a 'compliance suite' 
24/03: No progress 
03/03: Discussions have been taking place on the subset of tests that will 
be provided. 
10/03: work is progressing 
17/03: work is progressing 
31/03: work is progressing 
14/04: An XML test case format has been defined and is being tested. 
21/04: Schema for TDML defined. Need to define how this and the test cases 
will be made public. 
05/05: Work still progressing 
12/05: Work still progressing 
02/06: Work still progressing on technical and legal considerations 
16/06: work continues 
23/06: work continues 
30/06: work continues 
085
ALL: publicize Public comments phase to ensure a good review. 
14/04: see minutes 
21/04: Press release, OMG and other standards bodies. 
05/05: Alan and Steve H have contacted other standards bodies. Will ask 
them to add comments on spec 
15/05: still no public comments 
02/06: No public comments 
16/06: Public comments period has ended with no external comments. Alan 
had posted changes made in draft 041. Steve suggested sending a note to the 
WG highlighting these changes.  Steve also suggested requesting an 
extension as other IBM groups may review. We discussed whether this was 
necessary as changes will need to be made during the implementation phase 
anyway. Alan to ask OGF what the process is for changes post public 
comment. 
23/06: Still no comments. Alan will contact OGF to understand the rest of 
the process. 
30/06: Alan has emailed Joel asking what the process is now that the public 
comment period is over and whether we can update the published version with 
WG updates. No response yet. 
086
AP: Nils and Defaults during unparsing - update table 
31/03: TK to document use cases for parsing 
14/04: Investigate new property to control empty string behaviour. 
21/04: After investigation a new property is not required. New rules 
developed and tables updated. 
Need examples of complexTypes to confirm tables apply. 
Review Nils, defaulting spec section. 
05/05: Discussed defaulting complex elements. Tables updated but need to 
add terminator. 
SH: to confirm WMB behaviour when infoset item has no value on unparsing 
Need to describe defaulting choices. 
15/05: More discussion. Alan updating sections 
26/05: Discussed draft updates. Stephanie to confirm asserts do not make 
an element required. 
Alan will update draft. All: review rest of draft. 
02/06: Alan updated description. Please review.
Discussed Stephanie's example using discriminators. Decided no changes 
needed. 
16/05: went through Steve's comments. Steve to update draft. 
23/06: Steve's updates to the rules discussed. See minutes. Rest of 
document needs updating. 
30/06: Discussed Alan's updates. Some corrections. Alan will send out 
updated copy for review before next call. 
088
define semantics of choiceKind 'fixedLength' 
31/03: TK to provide definition of calculable length. 
Investigate PL/I varchars and COBOL occurs depending on (ODO). 
14/04: Tim had distributed a document starting the definition of calculable 
length for the longest choice member. 
Alan had done some investigation of COBOL occurs depending on: when used in 
the working section of a program the maximum storage was reserved, but when 
used in the linkage section the dependent number was used. We need to 
understand how the WMB COBOL importer deals with ODO. 
21/04: Need to define 'calculable length' and WMB importer ODO behaviour. 
05/05: TK: Still need definition of calculable length. 
SKK: WMB COBOL importer behaviour with ODO 
15/05: Suman sent an example of an imported COBOL ODO which suggested that 
the maximum space was reserved. He will extend the example. 
02/06: no progress 
16/06: no progress 
23/06: no progress 
30/06: Alan looked at Tim's description of calculable length and suggested 
that the real use case may be much simpler. If the real use case is the COBOL 
and C importers then it would be cleaner to require the fixed length to be 
specified on the enclosing complex element and remove choiceKind 
'fixedLength'. Ask Suman if the COBOL and C importers can be enhanced to 
provide the length on the complex element. 
092
AP: Confirm behaviour of defaulting with various occursCountKinds and 
separator policies. 
16/06: no progress 
23/06: discussed 
- whether it is an error when the number of instances doesn't match the 
specified number of occurrences, or whether missing instances should be 
defaulted. Decided it is an error. 
- defaulting occurs up to minOccurs unless separatorPolicy is 'required', 
in which case defaulting is up to maxOccurs and unbounded is an error. 
30/06: Decided that defaulting for variable occurrence arrays should 
always be to minOccurs. If separatorPolicy is 'required' then just the 
separators will be output up to maxOccurs and unbounded is an error. 
096
AP: using textStringPadCharacter with charRef '%#r' on multi-byte encoding 

097
nilIndicatorPath and nilIndicatorIndex properties 
099
Splitting the specification in simpler sections.
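
One possible reading of the 30/06 decision recorded under action 092 above, sketched for illustration only. The function name, its parameters, the use of None for an unbounded maxOccurs and the 'suppressed' policy value in the example are assumptions; the sketch treats each item as already-unparsed text.

```python
def unparse_array(values, min_occurs, max_occurs, default, separator, separator_policy):
    """Unparse a variable-occurrence array under one reading of the decision.

    max_occurs=None stands for an unbounded maxOccurs.
    """
    items = list(values)
    # Defaulting for variable-occurrence arrays is always up to minOccurs.
    while len(items) < min_occurs:
        items.append(default)
    if separator_policy == "required":
        if max_occurs is None:
            raise ValueError("separatorPolicy 'required' with unbounded maxOccurs")
        # Beyond minOccurs only the separators are output, up to maxOccurs.
        while len(items) < max_occurs:
            items.append("")
    return separator.join(items)


print(unparse_array(["a"], 2, 4, "x", ",", "required"))
print(unparse_array(["a"], 2, 4, "x", ",", "suppressed"))
```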
Closed actions 
No   Action 
098  Remove timing from dfdl:assert 


Work items: 
No
Item 
target version 
status 
005
Improvements on property descriptions 

not started 
012
Reordering the properties discussion: move representation earlier, improve 
flow of topics 

not started 
036
Update dfdl schema with changed properties 
ongoing 

042
Mapping of the DFDL infoset to XDM 
none 
not required for V1 specification 
070
Write DFDL primer 


071
Write test cases. 


083
Implement RFC2116 


105
AP: Describe trailingSkipBytes for delimited formats. 
Alan suggested 'dfdl:terminator must be specified and not empty if 
dfdl:lengthKind is delimited or endOfParent.' 


106
AP: Skip Bytes should allow bits
Agreed that it should be possible to specify bits.
- LSB and TSB renamed to dfdl:leadingSkip, dfdl:trailingSkip
- units are specified by dfdl:alignmentUnits. 


107
Remove timing from dfdl:assert 


108






  
Regards 
  
Alan Powell 
  
Development - MQSeries, Message Broker, ESB 
IBM Software Group, Application and Integration Middleware Software 
------------------------------------------------------------------------------------------------------------------------------------------- 

IBM 
MP211, Hursley Park 
Hursley, SO21 2JN 
United Kingdom 
Phone: +44-1962-815073 
e-mail: alan_powell at uk.ibm.com






Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 
741598. 
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU 










