[DFDL-WG] Fw: Nulls and Defaults

Alan Powell alan_powell at uk.ibm.com
Thu Apr 3 06:59:34 CDT 2008


I have made the changes discussed yesterday plus the following

Added "useNilForDefault is specified" to the definition of "has default 
value specified" in 13.9.
Changed null to nil, nullable to nillable etc., except for 
lengthKind=NullTerminated. Agree?


I am concerned about one of the changes

nilIndicatorPath
Path Expression
Path to a logical Boolean field which indicates if this element is null.
For nullKind='nullIndicator', a path expression referencing another 
element that must be of type Boolean which indicates if this element is 
null. On input, the element value is null if the provided value is true.
When null, on input the element is parsed as normal. If the element length 
is known then the value is skipped otherwise the value must be scannable. 
When null, on output the value is set based on fillByte or padCharacter 
properties and the referenced value set to true.
If non-null then the element is parsed or output normally and the 
referenced value set to false.
Annotation: dfdl:element (all simple types)
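
For illustration only, the input side of the description above could be sketched in Python as follows. This is a minimal sketch assuming a fixed-length element whose Boolean indicator has already been parsed; the function name and parameters are hypothetical and not part of the draft.

```python
from typing import Optional, Tuple


def parse_with_nil_indicator(data: bytes, offset: int, length: int,
                             indicator: bool) -> Tuple[Optional[bytes], int]:
    """Return (value, new_offset) for a fixed-length element.

    `indicator` is the already-parsed Boolean that nilIndicatorPath refers to.
    A result value of None stands for nil.
    """
    end = offset + length
    if end > len(data):
        raise ValueError("not enough data for element")
    if indicator:
        # Nil: the length is known, so the bytes are skipped and ignored.
        return None, end
    # Non-nil: the element is parsed normally.
    return data[offset:end], end
```

The variable-length case is not shown; the text above only requires that the value be scannable there.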


By setting the referenced nil indicator we have made it 
impossible/difficult  to implement a streaming unparser. I'm not sure that 
is a good idea.

Also unless we relax the expression rules the indicator bit must be before 
the element.
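
To make the streaming concern concrete, here is a minimal Python sketch. It assumes a hypothetical layout of a one-byte indicator followed by a 4-byte fixed-length element, and an infoset delivered as a sequence of (name, value) events; the names, the fill byte, and the length are all assumptions for illustration. The indicator comes first in the data, but its value is only known once the following element has been seen, so the unparser has to hold it back.

```python
from typing import Iterable, Optional, Tuple

FILL_BYTE = b" "   # assumed fillByte value
FIELD_LEN = 4      # assumed fixed element length

Event = Tuple[str, Optional[bytes]]  # (element name, value); None means nil


def unparse(events: Iterable[Event]) -> bytes:
    out = bytearray()
    indicator_pending = False
    for name, value in events:
        if name == "indicator":
            # The infoset value is ignored: the unparser sets it, but cannot
            # know the right value until the next element arrives.
            indicator_pending = True
        elif name == "value":
            if not indicator_pending:
                raise ValueError("indicator must precede the element")
            is_nil = value is None
            if not is_nil and len(value) != FIELD_LEN:
                raise ValueError("value has the wrong length")
            # Only now can the deferred indicator byte be written.
            out += b"\x01" if is_nil else b"\x00"
            out += FILL_BYTE * FIELD_LEN if is_nil else value
            indicator_pending = False
        else:
            raise ValueError(f"unexpected element {name!r}")
    return bytes(out)
```

With one element of look-ahead this is cheap, but if the indicator and the element are far apart, everything in between has to be buffered.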


Please review sections 13.8-13.10




Alan Powell

 MP 211, IBM UK Labs, Hursley,  Winchester, SO21 2JN, England
 Notes Id: Alan Powell/UK/IBM     email: alan_powell at uk.ibm.com 
 Tel: +44 (0)1962 815073                  Fax: +44 (0)1962 816898

----- Forwarded by Alan Powell/UK/IBM on 03/04/2008 12:20 -----

From:
Alan Powell/UK/IBM
To:
Steve Hanson/UK/IBM at IBMGB, "Mike Beckerle" <mbeckerle at OCO-INC.COM>
Date:
01/04/2008 18:35
Subject:
Re: Fw: Nulls and Defaults (was  [DFDL-WG] OGF DFDL WG minutes 2007-12-05 
call)


Steve, Mike

I have finally got around to finishing this off and it turned out to be a 
lot more work than I expected, as the defaults and nulls information was 
all in the wrong places.

Changes

13.8 Properties for Nullable Elements

Updated as requested.
nullKind=xpath changed to nullIndicator, as xpath is also used in 
nullValue, which was confusing.

13.9    Properties for Default Value Control

Moved here from most of 17.1.1.1 and 17.2, so this is now the main 
description of defaults. 

13.10   Nulls, Defaults, and Initiators

Moved from 14.2.1
Updated as requested

17.1.1.1        Repeating and Variable-Occurrence Items and Default Values
Remainder of discussion of variable occurrences.


Outstanding issues

5) Is the list style syntax for dfdl:nullValues acceptable? 
Yes because you can use <dfdl:property name="nullValues">"" "null" 
"NULL"</dfdl:property> 
So what is the syntax? It has to include expressions.

7) Consistent use of nil versus null.   
      => I'm wondering that we should standardise on nil to match xsd ? 
(standardize on nil, not null). 
Does everyone agree to this? It is a significant change to the 
document.

9)

nullIndicatorPath
Expression
Used when nullKind='nullIndicator'. A path expression referencing another 
element that provides the logical value to compare with nullValues. On 
input, the element value is null if the provided value matches a value in 
nullValues.
When null, if the element is fixed length then it will be skipped on 
input, filled with (TBD: fillbyte?) on output.
Is this correct??? Should it set element to Null? 
When null, if the element is variable length with minimum length > 0, then 
a minimum length item will be skipped over, or on output filled (TBD with 
fillbyte?). 
When null, if the element is variable length with minimum length 0, then a 
length zero object is expected on input, and a length 0 object will be 
generated on output.
If non-null then the element is parsed or output normally.
Annotation: dfdl:element (all simple types)
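
The three length cases above can be summarised in a small Python sketch. This is only an illustration: the function names are hypothetical, and the use of a fill byte on output is an assumption, since the text still marks it as TBD.

```python
from typing import Optional


def nil_region_length(fixed_length: Optional[int], min_length: int = 0) -> int:
    """Bytes skipped on input, or filled on output, for a null element."""
    if fixed_length is not None:
        # Fixed length: the whole element is skipped or filled.
        return fixed_length
    # Variable length: a minimum-length item (possibly zero bytes).
    return min_length


def unparse_nil(fixed_length: Optional[int], min_length: int = 0,
                fill_byte: bytes = b"\x00") -> bytes:
    # Assumption: fill_byte stands in for the TBD fillbyte property.
    return fill_byte * nil_region_length(fixed_length, min_length)
```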

10)


useNullValueForDefault
Boolean 
Ignored on input.
Is this correct? Shouldn't it set null if the element is required?
On output, if an element is not in the logical model, but it is required, 
the element is nillable, and has dfdl:useNullValueForDefault="true", then 
the logical value is defaulted to null. 
Annotation: dfdl:element (all simple types)


Can you make sure you are happy with the changes.

[attachment "ogf-dfdl-v1.0-Core-032.doc" deleted by Alan Powell/UK/IBM] 


Alan Powell





From:
Alan Powell/UK/IBM
To:
Steve Hanson/UK/IBM at IBMGB
Cc:
Alan Powell/UK/IBM
Date:
07/02/2008 17:13
Subject:
Re: Fw: Nulls and Defaults (was  [DFDL-WG] OGF DFDL WG minutes 2007-12-05 
call)


Steve

I have done most of this update. See below
Will complete in next rev.

Alan Powell






Steve Hanson/UK/IBM 
06/02/2008 09:26

To
Alan Powell/UK/IBM
cc

Subject
Fw: Nulls and Defaults (was  [DFDL-WG] OGF DFDL WG minutes 2007-12-05 
call)





Hi Alan - nulls and defaults changes below.

Regards, Steve

Steve Hanson
WebSphere Message Brokers
Hursley, UK
Internet: smh at uk.ibm.com
Phone (+44)/(0) 1962-815848
----- Forwarded by Steve Hanson/UK/IBM on 06/02/2008 09:25 -----

"Mike Beckerle" <mbeckerle at OCO-INC.COM> 
05/02/2008 21:17

To
Steve Hanson/UK/IBM at IBMGB
cc

Subject
RE: Nulls and Defaults (was  [DFDL-WG] OGF DFDL WG minutes 2007-12-05 
call)






Looks good.
 
From: Steve Hanson [mailto:smh at uk.ibm.com] 
Sent: Tuesday, February 05, 2008 12:22 PM
To: Mike Beckerle
Subject: RE: Nulls and Defaults (was [DFDL-WG] OGF DFDL WG minutes 
2007-12-05 call)
 

Hi Mike 

Looks good, small corrections in blue. With those made we can send to Alan 
I think. 

Regards, Steve



"Mike Beckerle" <mbeckerle at OCO-INC.COM> 
05/02/2008 14:57 


To
Steve Hanson/UK/IBM at IBMGB 
cc

Subject
RE: Nulls and Defaults (was  [DFDL-WG] OGF DFDL WG minutes 2007-12-05 
call)
 
Issues this raises 
1) How can you represent empty string as 
a) a null value? 
b) a default value (not sure you can)? 

1        Proposal: Input Defaulting for Empty Strings 
This is a corner case for strings. If an element is of type string, and 
has a default value specified, it is not clear whether the empty string 
should be an allowed value or if the empty string, when found in the 
representation, should trigger use of the default value instead. 
The following makes this corner case unambiguous: 
Type string with minLength of zero and default value are incompatible. It 
is a schema definition error if a variable length string where zero length 
is valid also has a default value specified.
Not yet added. Wasn't sure where this should go as the Nulls,Default info 
is scattered around
This eliminates complexities around the issue of "empty" content. Empty 
content always triggers use of the default value. If the type is string 
and empty string is a legal value then there cannot be a default value. 
We also need the same for null values 
Type string with minLength of zero and nillable with empty string as one 
of the dfdl:nullValues are incompatible. It is a schema definition error 
if a variable length string where zero length is valid also is nillable 
and has a null value of empty string specified.
Not yet added. Wasn't sure where this should go as the Nulls,Default info 
is scattered around
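
As a rough illustration of the two proposed schema definition errors, a schema checker could do something like the following Python sketch. The StringElement class and its field names are hypothetical stand-ins, and min_length == 0 is used here to model "a variable length string where zero length is valid".

```python
from dataclasses import dataclass, field
from typing import List, Optional


class SchemaDefinitionError(Exception):
    pass


@dataclass
class StringElement:
    name: str
    min_length: int
    default: Optional[str] = None
    nillable: bool = False
    null_values: List[str] = field(default_factory=list)


def check_empty_string_rules(e: StringElement) -> None:
    if e.min_length != 0:
        return  # empty string is not a legal value, so no ambiguity
    if e.default is not None:
        raise SchemaDefinitionError(
            f"{e.name}: minLength 0 is incompatible with a default value")
    if e.nillable and "" in e.null_values:
        raise SchemaDefinitionError(
            f"{e.name}: minLength 0 is incompatible with empty string as a null value")
```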
2) Why are nullIndicatorPath and nullIndicatorIndex separate properties? 
Convenience. So you can scope the nullIndicatorPath, and have local 
indices. 


3) What does 'missing' mean when initiators are involved? 
      => Covered by extra properties dfdl:nullValueInitiatorPolicy & 
dfdl:defaultValueInitiatorPolicy, as given by tables in 14.2.1.1 and 
14.2.1.2 
      => I think the bottom row of the table in 14.2.1.2 is incorrect - in 
the infoset, empty string and missing element are two distinct cases - how 
do/did we resolve this? 
Changes to this definition: 

defaultValueInitiatorPolicy 
Enum 
Valid values are 'required' or 'prohibited' 
Ignored unless dfdl:initiator is specified and is not "" (empty string). 
Ignored unless the element declaration has a default attribute specified. 
'required' indicates that the dfdl:initiator followed by empty content is 
the required syntax to indicate that a default value will be used. 
'prohibited' indicates that empty content triggers the use of a default 
value, and the presence of an initiator implies that a non-default value 
representation must follow. 
'prohibited' implies an ordered sequence. Use of 
defaultValueInitiatorPolicy='prohibited' in an initiated element of an 
unordered group is a schema definition error. 
This property applies only on input. (On output, for a required element an 
initiator is always output regardless of the default value.)
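
One way to read the input behaviour of the two policy values is the following Python sketch. It is an illustration under assumptions: the function and parameter names are made up, and raising an error when a 'required' initiator is absent is an assumption, since that case is not covered by this property.

```python
def input_value(policy: str, initiator_found: bool, content: str,
                default: str) -> str:
    """Resolve the value of an initiated element that has a default value."""
    if policy == "required":
        if not initiator_found:
            # Assumption: an absent initiator is handled elsewhere (missing element).
            raise ValueError("initiator not found")
        # Initiator followed by empty content selects the default value.
        return default if content == "" else content
    if policy == "prohibited":
        if not initiator_found:
            # Empty content (no initiator at all) triggers the default value.
            return default
        if content == "":
            raise ValueError("initiator must be followed by a non-default value")
        return content
    raise ValueError(f"unknown policy {policy!r}")
```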

Added

1.1.1.1      Initiators and Output 
This table describes the output direction logic for an initiated element 
that is a required element. We assume here that dfdl:initiator is 
specified and not equal to the empty string. 

| Logical value | nullValueInitiatorPolicy | useNullValueForDefault | Initiator region contains | Content region contains |
|---|---|---|---|---|
| nil | prohibited | don't care | nothing | representation of nil based on nullKind, nullValues, etc. |
| nil | required | don't care | initiator string | representation of nil based on nullKind, nullValues, etc. |
| "" (empty string) (1) | don't care | | initiator string | empty string |
| a non-nil non-empty-string value | don't care | | initiator string | the representation of the logical value |
| not supplied (element is not nillable) | don't care | don't care | initiator string | the representation of the default value (2) |
| not supplied (nillable) | prohibited | true | nothing | representation of nil based on nullKind, nullValues, etc. |
| not supplied (nillable) | required | true | initiator string | representation of nil based on nullKind, nullValues, etc. |
| not supplied (nillable) | don't care | false | initiator string | the representation of the default value (2) |

(1) Note that this implies that the element type is xs:string and no default 
value can be specified. 
(2) No default value implies processing error.
  Added but had trouble with table format as couldn't copy/paste.
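
The output table above can also be read as a decision procedure. The Python sketch below is one possible reading, not spec text: the sentinels NIL and MISSING, the function name, and the simplification that a logical value is its own representation are all assumptions for illustration.

```python
from typing import Optional, Tuple

NIL = object()      # logical value nil
MISSING = object()  # element not supplied in the infoset


class ProcessingError(Exception):
    pass


def output_regions(value, *, nillable: bool, nil_policy: str,
                   use_null_value_for_default: bool, initiator: str,
                   nil_rep: str, default: Optional[str] = None) -> Tuple[str, str]:
    """Return (initiator region, content region) for a required initiated element."""
    if value is MISSING:
        if nillable and use_null_value_for_default:
            value = NIL                      # default to nil
        elif default is not None:
            value = default                  # use the default value
        else:
            raise ProcessingError("required element missing and no default value")
    if value is NIL:
        if not nillable:
            raise ProcessingError("element is nil and is not nillable")
        # nullValueInitiatorPolicy decides whether the initiator is written.
        return ("" if nil_policy == "prohibited" else initiator, nil_rep)
    # Any supplied non-nil value, including the empty string.
    return (initiator, value)
```

For example, a missing nillable element with useNullValueForDefault true and policy 'prohibited' yields an empty initiator region and the nil representation as content.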
  

4) What controls null versus default for a missing element on output? 
      => Extra property dfdl:useNullValueForDefault 
See above. 
  

5) Is the list style syntax for dfdl:nullValues acceptable? 
Yes because you can use <dfdl:property name="nullValues">"" "null" 
"NULL"</dfdl:property> 
Which avoids quoting hell. 
(there's still some issue of list-valued expressions.) 


6) Error cases - need to enumerate these 
      => Input. Required element missing and no default value. 
(processing error) 

      => Output. Required element missing and no default value or null 
value. 
(processing error) 

      => Output. Element is null  and is not nillable. 
(processing error at least. It may be possible for some implementations to 
detect this error sooner.) 
 
      => ? 

7) Consistent use of nil versus null.   
      => I'm wondering that we should standardise on nil to match xsd ? 
(standardize on nil, not null). 


Regards, Steve

----- Forwarded by Steve Hanson/UK/IBM on 04/02/2008 16:06 ----- 
Hi Mike 

In preparation for our discussion on nulls and defaults tomorrow..... 

First of all I'd like to restate what I see as the requirements: 

Uncontentious core properties 
xs:default 
xs:fixed 
dfdl:nullKind 
dfdl:nullValues 
dfdl:nullIndicatorPath 
dfdl:nullIndicatorIndex 

Assumptions 
- 'Required' below is as defined in section 17.1.1.1. 
- The term 'default value' below actually means 'xs:default or xs:fixed' 
- Both default values and null values only apply to simple elements 

Input 
- If a required element is missing from the data stream and it has a 
default value, that will be used as the infoset value of the element 
- If an element is nillable and has a value in the data stream which 
matches one of a list of null values, the infoset value of the element 
will be the special value null 

Output 
- If a required element is missing from the infoset and it has a default 
value, optionally that will be used as the infoset value of the element 
- If a required element is missing from the infoset, optionally the 
special value null  will be used as the infoset value of the element 
- If an element is nillable and has an infoset value null , the value in 
the data stream will be the first of the list of null values
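
A minimal Python sketch of the core input and output rules listed above, for illustration only. The NIL sentinel, the function names, and the use of None for "missing from the data stream" are assumptions; the optional output defaulting rules are not shown.

```python
from typing import List, Optional

NIL = object()  # the special infoset value null


def infoset_value_on_input(raw: Optional[str], *, required: bool,
                           default: Optional[str], nillable: bool,
                           null_values: List[str]):
    if raw is None:  # element missing from the data stream
        if required and default is not None:
            return default
        if required:
            raise ValueError("required element missing and no default value")
        return None  # optional element stays absent
    if nillable and raw in null_values:
        return NIL
    return raw


def data_value_on_output(value, *, nillable: bool, null_values: List[str]):
    if value is NIL:
        if not nillable or not null_values:
            raise ValueError("element is null and is not nillable")
        return null_values[0]  # the first of the list of null values
    return value
```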

Issues this raises 
1) How can you represent empty string as 
a) a null value? 
b) a default value (not sure you can)? 

2) Why are nullIndicatorPath and nullIndicatorIndex separate properties? 

3) What does 'missing' mean when initiators are involved? 
        => Covered by extra properties dfdl:nullValueInitiatorPolicy & 
dfdl:defaultValueInitiatorPolicy, as given by tables in 14.2.1.1 and 
14.2.1.2 
        => I think the bottom row of the table in 14.2.1.2 is incorrect - 
in the infoset, empty string and missing element are two distinct cases - 
how do/did we resolve this? 

4) What controls null versus default for a missing element on output? 
        => Extra property dfdl:useNullValueForDefault 

5) Is the list style syntax for dfdl:nullValues acceptable? 

6) Error cases - need to enumerate these 
        => Input. Required element missing and no default value. 
        => Output. Required element missing and no default value or null 
value. 
        => Output. Element is null  and is not nillable. 
        => ? 

7) Consistent use of nil versus null.   
        => I'm wondering that we should standardise on nil to match xsd ? 


Regards, Steve


Mike Beckerle <beckerle at us.ibm.com> 
06/12/2007 13:50 
 
 


To
Steve Hanson/UK/IBM at IBMGB 
cc
dfdl-wg at ogf.org, dfdl-wg-bounces at ogf.org 
Subject
Re: [DFDL-WG] OGF DFDL WG minutes 2007-12-05 call

I tend to trust your instincts about things Steve, 

I would summarize it as this: regardless of how people think nulls 
*should* work, in XSD nillables are orthogonal to value and whether or not 
this matches people's past experience we should support it if we're going 
to overload nillable at all. 

To me this reasoning is pretty compelling, so I withdraw my suggestion 
(the "either nillable or default value but not both" idea). 

...mikeb 

Steve Hanson <smh at uk.ibm.com> 
12/06/2007 04:59 AM 
 
 
 


To
Mike Beckerle/Worcester/IBM at IBMUS 
cc
dfdl-wg at ogf.org, dfdl-wg-bounces at ogf.org 
Subject
Re: [DFDL-WG] OGF DFDL WG minutes 2007-12-05 call

Unfortunately I have been roped into something else which will likely 
occupy me full time until middle of next week, so I can't look at the 
defaults/nulls issue in detail right now. But my first reaction to the 
proposal below is that elements should be allowed to have both null and 
default values. They are separate concepts in XML Schema, so why are we 
making the DFDL logical model different?  IMHO subtle differences like 
this cause more issues with customers than the odd extra DFDL property. 
The DFDL subset of XML Schema should be just that - a subset. For those 
features of XML Schema that we do support, the rules should be the same. 

Regards, Steve


Mike Beckerle <beckerle at us.ibm.com> 
Sent by: dfdl-wg-bounces at ogf.org 
05/12/2007 23:21 
 
 
 


To
dfdl-wg at ogf.org 
cc

Subject
[DFDL-WG] OGF DFDL WG minutes 2007-12-05 call
 
OGF DFDL WG minutes 2007-12-05 call 

Suman Kalia, Simon Parker, Alan Powell, Mike Beckerle 

(who else? - was someone else on also) 

We discussed 

Output issues in the DFDL expression language: 

E.g., an outputValueCalc for a field in the header of a data stream may 
contain information that requires you to know the rep, or length of the 
rep, of the whole data item. 

We concluded that this kind of thing can't be ruled out. Some formats just 
require buffering and are not streamable; however, implementations can 
vary on just how large a data item they're able to cope with here. 

Expression language section will include a subsection highlighting this 
issue and that implementations can vary here. 

Alan will update his expression language proposal and include this. 

Also suggested was a path length-from-to function that takes 2 path 
expressions and gives you the size of the representation between them. 
(start of first, to last bit before start of 2nd). 
(I don't think we discussed a clear use case motivating this, but there 
may be one. We did discuss applications trying to fit data into limited 
size boxes, but the use case is not clear. 
Also note that all representation lengths are subject to change due to 
different starting alignments.) 


Nillable and Default: 

We also discussed the interaction of nillable and having a default. 

The sense of the group on the call is that we can restrict these so that 
if something is nillable it cannot also have a default value, and that the 
behavior of DFDL on output for a required element that is nillable but not 
in the logical data, is to create a null value. Everyone agreed that there 
is no need for  a property useNullValueForDefault because this should 
always be the behavior. 

Mike will forward a proposal. 


...mikeb 

Mike Beckerle
STSM, Architect, Scalable Computing
IBM Software Group
Information Platform and Solutions
Westborough, MA 01581
direct: voice and FAX 508-599-7148
assistant: Pam Riordan 
            priordan at us.ibm.com 
            508-599-7046
--
dfdl-wg mailing list
dfdl-wg at ogf.org
http://www.ogf.org/mailman/listinfo/dfdl-wg 

 

Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 
741598. 
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU 

 

 



 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ogf-dfdl-v1.0-Core-032.2.doc
Type: application/octet-stream
Size: 2365952 bytes
Desc: not available
Url : http://www.ogf.org/pipermail/dfdl-wg/attachments/20080403/4d377489/attachment-0001.obj 

