[DFDL-WG] Minutes for OGF DFDL Working Group Call, July 14-2010

Tim Kimber KIMBERT at uk.ibm.com
Thu Jul 15 15:02:13 CDT 2010


A small correction to the wording of the textStringPadCharacter property 
description, to accurately reflect what was discussed in the meeting.
If I recall correctly, the following wording was suggested:

DFDL String literal

The value that is used when justifying or trimming text elements. The 
value can be a single character or a single byte.
If a character, then it can be specified using a literal character or 
using DFDL entities. 
If a byte, then it must be specified using a single %#r entity.

If a pad character is specified when lengthUnits='bytes' then the pad 
character must be a single-byte character. 
If a pad byte is specified when lengthUnits='characters' then 
- the encoding must be a fixed-width encoding
- padding and trimming must be applied using a sequence of N pad bytes, 
where N is the width of a character in the fixed-width encoding. 
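
The validation rules above could be checked roughly as follows. This is only a sketch in Python, not part of the proposed wording: the function name, the table of fixed-width encodings, and the choice to model a pad character as a one-character str and a pad byte as a one-byte bytes value are all assumptions made for illustration.

```python
# Hypothetical table of fixed-width encodings and their character width in bytes.
FIXED_WIDTH_BYTES = {"ascii": 1, "iso-8859-1": 1, "utf-32-be": 4}


def validate_pad(pad, length_units, encoding):
    """Raise ValueError if the pad value breaks the rules described above.

    pad is a one-character str (pad character) or a one-byte bytes (pad byte).
    """
    if len(pad) != 1:
        raise ValueError("pad must be a single character or a single byte")
    if isinstance(pad, bytes):
        # Pad byte: with lengthUnits='characters' the encoding must be fixed-width.
        if length_units == "characters" and encoding not in FIXED_WIDTH_BYTES:
            raise ValueError("pad byte with lengthUnits='characters' "
                             "needs a fixed-width encoding")
    elif length_units == "bytes" and len(pad.encode(encoding)) != 1:
        # Pad character: with lengthUnits='bytes' it must encode to one byte.
        raise ValueError("pad character with lengthUnits='bytes' "
                         "must be a single-byte character")
```

For example, validate_pad(" ", "bytes", "ascii") passes, while a pad byte combined with lengthUnits='characters' and a variable-width encoding such as UTF-8 is rejected.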

regards,

Tim Kimber, Common Transformation Team,
Hursley, UK
Internet:  kimbert at uk.ibm.com
Tel. 01962-816742 
Internal tel. 246742




From:
Alan Powell/UK/IBM at IBMGB
To:
dfdl-wg at ogf.org
Date:
15/07/2010 17:45
Subject:
[DFDL-WG] Minutes for OGF DFDL Working Group Call, July 14-2010





Open Grid Forum: Data Format Description Language Working Group

OGF DFDL Working Group Call, July 14-2010

Attendees 
Steve Hanson (IBM) 
Alan Powell (IBM) 
Stephanie Fetzer (IBM) 
Tim Kimber (IBM) 
Suman Kalia (IBM) 

Apologies 
Mike Beckerle (Oco) 

1. Current Actions 
Updated below 

2. Nils and Defaults 
Discussed Alan's updates (v9). Updates still needed.

3. Using textStringPadCharacter with charRef '%#r' on multi-byte encoding 
textStringPadCharacter 
DFDL String literal 

The padding character or byte value that is used when justifying or 
trimming text elements. 

A pad character can be specified using a literal or DFDL entities. 
A pad byte value must be specified using the %#r entity. 

DFDL validation rules 
- if a pad character is specified when lengthUnits='bytes' then the pad 
character must be a single-byte character. 
- if a pad byte value is specified when lengthUnits='characters' then the 
encoding must be a fixed-width encoding. Padding and trimming are 
applied using an array of N pad byte values, where N is the width of a 
character in the fixed-width encoding. 
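
As an illustration of the second rule, padding and trimming with N pad bytes per character might look like the Python sketch below. The function names and parameters are hypothetical, right-hand padding is assumed, and the data is assumed to be already encoded in a fixed-width encoding whose character width is char_width bytes.

```python
def pad_right(data: bytes, target_chars: int, pad_byte: int, char_width: int) -> bytes:
    """Pad encoded text on the right up to target_chars characters.

    Each pad 'character' is char_width copies of pad_byte.
    """
    unit = bytes([pad_byte]) * char_width
    missing = target_chars - len(data) // char_width
    return data + unit * max(missing, 0)


def trim_right(data: bytes, pad_byte: int, char_width: int) -> bytes:
    """Remove trailing pad units, one whole character width at a time."""
    unit = bytes([pad_byte]) * char_width
    while data.endswith(unit):
        data = data[:-char_width]
    return data


# Usage: two characters in UTF-32BE (4 bytes each), padded to 4 characters.
text = "AB".encode("utf-32-be")
padded = pad_right(text, 4, 0x00, 4)
restored = trim_right(padded, 0x00, 4)
```

Trimming whole units rather than single bytes keeps the result aligned on character boundaries, so the leading zero bytes of the final real character are not removed.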

Annotation: dfdl:element, dfdl:simpleType

Would adding fillByte to the dfdl:textPadKind enumeration make it clearer? 
padChar is a pad character (%#r not allowed) 
fillByte is a fill byte 

Although there was some merit in having separate textPadKinds it was felt 
it was too late to make the change. The property descriptions (for 
numbers, calendars and booleans as well) will be improved as above. 


4. nilIndicatorPath and nilIndicatorIndex properties   
These properties seem a bit of an anomaly. Tim has suggested they can be 
simplified. 
Tim had emailed Mike with a proposal and Steve had responded with 3 
alternatives: 
1) Leave things as they are, perhaps renaming 'nilIndicator' to 
'indicator'

2) Collapse nilIndicatorPath and nilIndicatorIndex to a single property. 
It means the path is repeated, but it makes nils consistent with length 
and occurs. For all, you can always use a string variable to hold the 
constant part of the path and concatenate with the index. And if we 
improve usability in a future DFDL release, we would improve it for nils, 
lengths and occurs at the same time.

3) Drop the properties for 1.0 altogether on the grounds that it is a rare 
case. I've never seen such a format, but we know Mike has. 
As part of nils default discussion the WG decided to drop 
dfdl:nilIndicator and dfdl:nilIndicatorIndex as it was felt the 
complications it introduced on unparsing were not worth the additional 
function. This may be revisited for DFDL v2. 

5. Alignment enumerations when dfdl:lengthUnits = 'bits' 
When alignmentUnits='bits', the rule that alignment must be 1,2,4,8 etc or 
multiple of 2 looks rather hard to justify. 
We should either 
a) remove the rule entirely and allow any positive integer > 1 or 
b) make the rule apply only when alignmentUnits='bytes' 
It was agreed to remove the restriction for both alignmentUnits='bytes' 
and alignmentUnits='bits'. 
Also suggested that a note should be added to the default alignment and 
implicit length tables: 'although the values are specified in bits, 
that does not imply that dfdl:lengthUnits can be specified as 'bits' 
for all the simple types'. 
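
With the power-of-two restriction removed, the fill needed to reach the next aligned position can still be computed with plain modular arithmetic. The following Python sketch is illustrative only; the function name and the convention of measuring positions in bits from the start of the data are assumptions, not taken from the specification.

```python
def alignment_fill_bits(position_bits: int, alignment: int, alignment_units: str) -> int:
    """Return how many bits must be skipped to reach the next aligned position.

    alignment may be any positive integer; alignment_units is 'bits' or 'bytes'.
    """
    if alignment < 1:
        raise ValueError("alignment must be a positive integer")
    unit_bits = {"bits": 1, "bytes": 8}[alignment_units]
    boundary = alignment * unit_bits
    # Python's % always returns a non-negative result for a positive modulus.
    return (-position_bits) % boundary
```

For instance, at bit position 5 with an alignment of 3 bits, one bit is skipped to reach position 6.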


6. Semantics of 'fixed' 
Not discussed. Action Raised. 

7. Clarify the specification of error reporting from a DFDL processor 
- section 2.3 needs to be updated 
Not discussed. Action Raised

8. Asserts and discriminators 
- specify the scope of forward references. Must be downward-only. The 
expression must be resolvable by the time the component on which it is 
positioned goes out of scope - otherwise it is a processing error. 
Not discussed. Action Raised

9. Expressions 
Discuss error behaviour when evaluating an expression in various contexts 
- All properties: 
wrong type returned : schema definition error 
exception when evaluating expression : schema definition error 
referenced variables/paths not available : schema definition error 
- Properties which allow a forward reference 
referenced variables/paths not available : no error. DFDL processor 
continues processing until the expression result is available, then acts 
on the result. 
Not discussed. Action Raised

10. Grammar for DFDL expression 
- currently allows multiple predicates on a single expression. Sounds as 
if this was unintended. 
Not discussed. Action Raised 
Meeting closed, 17:00 

Next call  Wednesday  21 July  2010  15:00 UK  (10:00 ET) 
Next action: 106 
Actions raised at this meeting 
No
Action 
101
Semantics of 'fixed' 
102
Clarify the specification of error reporting from a DFDL processor 
- section 2.3 needs to be updated 
103
Asserts and discriminators 
- specify the scope of forward references. Must be downward-only. The 
expression must be resolvable by the time the component on which it is 
positioned goes out of scope - otherwise it is a processing error. 
104
Expressions 
Discuss error behaviour when evaluating an expression in various contexts 
- All properties: 
wrong type returned : schema definition error 
exception when evaluating expression : schema definition error 
referenced variables/paths not available : schema definition error 
- Properties which allow a forward reference 
referenced variables/paths not available : no error. DFDL processor 
continues processing until the expression result is available, then acts 
on the result. 
105
Grammar for DFDL expression 
- currently allows multiple predicates on a single expression. Sounds as 
if this was unintended. 
Current Actions: 
No
Action 
066
Investigate format for defining test cases 
25/11: IBM to see if it is possible to publish its test case format. 
04/12: no update 
... 
17/02: IBM is willing in principle to publish the test case format and 
some of the test cases. May need some time to build a 'compliance suite' 
24/03: No progress 
03/03: Discussions have been taking place on the subset of tests that will 
be provided. 
10/03: work is progressing 
17/03: work is progressing 
31/03: work is progressing 
14/04: An XML test case format has been defined and is being tested. 
21/04: Schema for TDML defined. Need to define how this and the test cases 
will be made public 
05/05: Work still progressing 
12/05: Work still progressing 
02/06: Work still progressing on technical and legal considerations 
... 
14/07: work continues 
085
ALL: publicize Public comments phase to ensure a good review. 
14/04: see minutes 
21/04: Press release, OMG and other standards bodies. 
05/05: Alan and Steve H have contacted other standards bodies. Will ask 
them to add comments on spec 
15/05: still no public comments 
02/06: No public comments 
16/06: Public comments period has ended with no external comments. Alan 
had posted changes made in draft 041. Steve suggested send a note to the 
WG highlighting these changes.  Steve also suggested requesting an 
extension as other IBM groups may review. We discussed whether this was 
necessary as changes will need to be made during the implementation phase 
anyway. Alan to ask OGF what the process is for changes post public 
comment. 
23/06: Still no comments. Alan will contact OGF to understand the rest of 
the process. 
30/06: Alan has emailed Joel asking what the process is now that the public 
comment period is over and whether we can update the published version with 
WG updates. No response yet. 
07/07: No response. Alan will chase up. 
14/07: No response from Joel. Sent email to Greg Newby but no response. 
086
AP: Nils and Defaults during unparsing - update table 
31/03: TK to document use cases for parsing 
14/04: Investigate new property to control empty string behaviour. 
21/04: After investigation a new property is not required. New rules 
developed and tables updated. 
Need examples of complexTypes to confirm tables apply. 
Review Nils, defaulting spec section. 
05/05: Discussed defaulting complex elements. Tables updated but need to 
add terminator. 
SH: to confirm WMD behaviour when infoset item has no value on unparsing 
Need to describe defaulting choices. 
15/05: More discussion. Alan updating sections 
26/05: Discussed draft updates. Stephanie to confirm asserts do not make 
an element required. 
Alan will update draft. All: review rest of draft. 
02/06: Alan updated description. Please review.
Discussed Stephanie's example using discriminators. Decided no changes 
needed. 
16/05: Went through Steve's comments. Steve to update draft. 
23/06: Steve's updates to the rules discussed. See minutes. Rest of 
document needs updating. 
30/06: Discussed Alan's updates. Some corrections. Alan will send out 
updated copy for review before next call. 
07/07: Discussed Alan's updates and Tim and Steve's comments. Still some 
corrections and updates. 
14/07: Discussed Alan's updates (v9). Still some corrections and updates. 
099
Splitting the specification into simpler sections. 
07/07: Steve sent a proposal but not discussed. Alan will arrange a 
separate call. 
14/07: Discussed Steve's proposal and Suman's and Alan's comments. 
Need to add choice, validation, facets. 
Also, how does an implementation declare which subsets it supports? 
Suggested levels and/or profiles. Steve highlighted a problem: when a DFDL 
schema from an implementation of just the core functions is moved to a 
full DFDL implementation, what should happen about the missing properties? 
Does the full implementation need to be aware of subsets of functions? 
Should it raise a schema definition error for use of a function not in the 
subset? 
101
Semantics of 'fixed' 
102
Clarify the specification of error reporting from a DFDL processor 
- section 2.3 needs to be updated 
103
Asserts and discriminators 
- specify the scope of forward references. Must be downward-only. The 
expression must be resolvable by the time the component on which it is 
positioned goes out of scope - otherwise it is a processing error. 
104
Expressions 
Discuss error behaviour when evaluating an expression in various contexts 
- All properties: 
wrong type returned : schema definition error 
exception when evaluating expression : schema definition error 
referenced variables/paths not available : schema definition error 
- Properties which allow a forward reference 
referenced variables/paths not available : no error. DFDL processor 
continues processing until the expression result is available, then acts 
on the result. 
105
Grammar for DFDL expression 
- currently allows multiple predicates on a single expression. Sounds as 
if this was unintended. 
Closed actions 
No
Action 
088
define semantics of choiceKind 'fixedLength' 
31/03: TK to provide definition of calculable length. 
Investigate PL/I varchars and COBOL occurs depending on. 
14/04: Tim had distributed a document starting the definition of calculable 
length for the longest choice member. 
Alan had done some investigation of COBOL occurs depending on (ODO): when 
used in the working section of a program the maximum storage was 
reserved, but when used in the linkage section the dependent number was 
used. We need to understand how the WMB COBOL importer deals with ODO. 
21/04: Need to define 'calculable length' and WMB importer ODO behaviour. 
05/05: TK: Still need definition of calculable length. 
SKK: WMB COBOL importer behaviour with ODO 
15/05: Suman sent an example of an imported COBOL ODO which suggested that 
the maximum space was reserved. He will extend the example. 
02/06: no progress 
16/06: no progress 
23/06: no progress 
30/06: Alan looked at Tim's description of calculable length and suggested 
that the real use case may be much simpler. If the real use case is COBOL and 
C importers then it would be cleaner to require the 'fixed length' to be 
specified on the enclosing complex element and remove choiceKind 
'fixedLength'. Ask Suman if the COBOL and C importers can be enhanced to 
provide length on the complex element. 
07/07: Steve had proposed 5 possible approaches to supporting main use 
case. 
i) Change the importer to add a parent element for each group. Con: This 
changes the logical model by inserting an extra level. 
ii) Allow dfdl:length to be carried on embedded xs:choice (along with 
dfdl:lengthUnits & dfdl:lengthKind). 
iii)  New dfdl:choiceLength property to be carried on embedded xs:choice 
(always bytes). 
iv) dfdl:choiceKind as today. Con: length from TD model is lost and must 
be recalculated. 
v) dfdl:choiceKind but with limitations to make the calculation easier. 
Con: length from TD model is lost and must be recalculated. 
Agreed a modified version of iii) 
dfdl:choiceKind becomes dfdl:choiceLengthKind with enums 'implicit' 
(length of selected branch) and 'explicit' (length specified by 
dfdl:choiceLength) 
New property dfdl:choiceLength 
14/07: Closed 
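
The agreed behaviour of dfdl:choiceLengthKind can be pictured with a small Python sketch. It is not taken from the specification: the function name is hypothetical, lengths are assumed to be in bytes, and treating a branch longer than the explicit dfdl:choiceLength as an error is an assumption made here for illustration.

```python
def choice_length(choice_length_kind, branch_length, explicit_length=None):
    """Return the length occupied by a choice, in bytes.

    'implicit': the length of the selected branch.
    'explicit': the value given by dfdl:choiceLength (explicit_length here).
    """
    if choice_length_kind == "implicit":
        return branch_length
    if choice_length_kind == "explicit":
        if explicit_length is None:
            raise ValueError("'explicit' requires a choiceLength value")
        if branch_length > explicit_length:
            # Assumption: a branch that does not fit is treated as an error.
            raise ValueError("selected branch is longer than choiceLength")
        return explicit_length
    raise ValueError("unknown choiceLengthKind: %r" % (choice_length_kind,))
```

With 'explicit', a 12-byte branch inside a choice with choiceLength 20 occupies 20 bytes, leaving 8 bytes of unused space after the branch.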
096
AP: using textStringPadCharacter with charRef '%#r' on multi-byte encoding 

14/07: Although there was some merit in having separate textPadKinds it 
was felt it was too late to make the change. The property descriptions 
(for numbers, calendars and booleans as well) will be improved. Closed 
097
nilIndicatorPath and nilIndicatorIndex properties 
07/07: Tim had emailed Mike with a proposal and Steve had responded with 3 
alternatives: 
1) Leave things as they are, perhaps renaming 'nilIndicator' to 
'indicator'

2) Collapse nilIndicatorPath and nilIndicatorIndex to a single property. 
It means the path is repeated, but it makes nils consistent with length 
and occurs. For all, you can always use a string variable to hold the 
constant part of the path and concatenate with the index. And if we 
improve usability in a future DFDL release, we would improve it for nils, 
lengths and occurs at the same time.

3) Drop the properties for 1.0 altogether on the grounds that it is a rare 
case. I've never seen such a format, but we know Mike has. 
Not discussed. 
14/07: As part of nils default discussion the WG decided to drop 
dfdl:nilIndicator and dfdl:nilIndicatorIndex as it was felt the 
complications it introduced on unparsing were not worth the additional 
function. This may be revisited for DFDL v2. Closed 
100
Alignment enumerations when dfdl:lengthUnits = 'bits' 
When alignmentUnits='bits', the rule that alignment must be 1,2,4,8 etc or 
multiple of 2 looks rather hard to justify. 
It was agreed to remove the restriction for both alignmentUnits='bytes' 
and alignmentUnits='bits'. Closed
Work items: 
No
Item 
target version 
status 
005
Improvements on property descriptions 

not started 
012
Reordering the properties discussion: move representation earlier, improve 
flow of topics 

not started 
036
Update dfdl schema with changed properties 
ongoing 

042
Mapping of the DFDL infoset to XDM 
none 
not required for V1 specification 
070
Write DFDL primer 


071
Write test cases. 


083
Implement RFC2116 


105
AP: Describe trailingSkipBytes for delimited formats. 
Alan suggested 'dfdl:terminator must be specified and not empty if 
dfdl:lengthKind is delimited or endOfParent.' 


106
AP: Skip Bytes should allow bits
Agreed that it should be possible to specify bits.
- LSB and TSB renamed to dfdl:leadingSkip, dfdl:trailingSkip
- units are specified by dfdl:alignmentUnits. 


107
Remove timing from dfdl:assert 


108
AP: Confirm behaviour of defaulting with various occursCountKinds and 
separator policies. 
30/06: Decided that defaulting for variable occurrence arrays should 
always be to minOccurs. If separatorPolicy is 'required' then just the 
separators will be output up to maxOccurs and unbounded is an error. 


109
define semantics of choiceKind 'fixedLength' 
dfdl:choiceKind becomes dfdl:choiceLengthKind with enums 'implicit' 
(length of selected branch) and 'explicit' (length specified by 
dfdl:choiceLength) 
New property dfdl:choiceLength 


110 
nilIndicatorPath and nilIndicatorIndex properties 
14/07: As part of nils default discussion the WG decided to drop 
dfdl:nilIndicator and dfdl:nilIndicatorIndex 


111
AP: using textStringPadCharacter with charRef '%#r' on multi-byte encoding 

14/07: Although there was some merit in having separate textPadKinds it 
was felt it was too late to make the change. The property descriptions 
(for numbers, calendars and booleans as well) will be improved. Closed 


112
Alignment enumerations when dfdl:lengthUnits = 'bits' 
When alignmentUnits='bits', the rule that alignment must be 1,2,4,8 etc or 
multiple of 2 looks rather hard to justify. 
It was agreed to remove the restriction for both alignmentUnits='bytes' 
and alignmentUnits='bits'. Closed 













  
Regards 
  
Alan Powell 
  
Development - MQSeries, Message Broker, ESB 
IBM Software Group, Application and Integration Middleware Software 
------------------------------------------------------------------------------------------------------------------------------------------- 

IBM 
MP211, Hursley Park 
Hursley, SO21 2JN 
United Kingdom 
Phone: +44-1962-815073 
e-mail: alan_powell at uk.ibm.com






Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 
741598. 
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU 





--
  dfdl-wg mailing list
  dfdl-wg at ogf.org
  http://www.ogf.org/mailman/listinfo/dfdl-wg











