[DFDL-WG] Fw: Action 162: Re: * DFDL Errata * Proposed spec errata for calendarPattern I and T symbols

Steve Hanson smh at uk.ibm.com
Thu Jan 19 04:53:05 EST 2012


Mike

I don't think it's strictly necessary to specify that we will try the 
longest granularity first, because we can tell from the length of the 
input data, together with the presence/absence of a '.' (fractional second 
separator), which of the granularities we should attempt to parse the 
input data against. Remember that by the time the parser tries to process 
the date/time, it will already have extracted the text using lengthKind 
and done any trimming.

Granularity                   Length
YYYY                          4
YYYY-MM                       7
YYYY-MM-DD                    10
YYYY-MM-DDThh:mmTZD           17 or 22
YYYY-MM-DDThh:mm:ssTZD        20 or 25
YYYY-MM-DDThh:mm:ss.sTZD      22 or more, but input data will contain '.'

(TZD can be Z or +hh:mm or -hh:mm)

Similarly for xs:time we would have,

Granularity                     Length
hh:mmTZD                        6 or 11
hh:mm:ssTZD                     9 or 14
hh:mm:ss.sTZD                   11 or more but input data will contain '.'

Any other length of input data, or invalid data for the granularity 
matched, would be a processing error.
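
As a rough illustration of the selection rule described above, here is a minimal Python sketch. The function and table names are hypothetical and not taken from the spec or any DFDL implementation. It only picks the candidate granularity from the text length and the presence of '.'; validating the data against that granularity would be a separate step.

```python
# Lengths (without fractional seconds) taken from the tables above.
# TZD is "Z" (1 character) or "+hh:mm" / "-hh:mm" (6 characters).
DATETIME_LENGTHS = {
    4: "YYYY",
    7: "YYYY-MM",
    10: "YYYY-MM-DD",
    17: "YYYY-MM-DDThh:mmTZD",
    22: "YYYY-MM-DDThh:mmTZD",
    20: "YYYY-MM-DDThh:mm:ssTZD",
    25: "YYYY-MM-DDThh:mm:ssTZD",
}
TIME_LENGTHS = {
    6: "hh:mmTZD",
    11: "hh:mmTZD",
    9: "hh:mm:ssTZD",
    14: "hh:mm:ssTZD",
}


def select_granularity(text, lengths, fractional, min_fractional_length):
    """Choose a granularity from the text length and the presence of '.'."""
    n = len(text)
    if "." in text:
        # Only the fractional-second granularity can contain '.'.
        if n >= min_fractional_length:
            return fractional
    elif n in lengths:
        return lengths[n]
    # Any other length is a processing error.
    raise ValueError(f"processing error: no granularity for {text!r} (length {n})")


def datetime_granularity(text):
    return select_granularity(text, DATETIME_LENGTHS, "YYYY-MM-DDThh:mm:ss.sTZD", 22)


def time_granularity(text):
    return select_granularity(text, TIME_LENGTHS, "hh:mm:ss.sTZD", 11)
```

Note that length 22 appears both for hh:mm with a numeric offset and for the shortest fractional-second form with 'Z'; the '.' check is what separates them, so no backtracking is needed.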

Regards

Steve Hanson
Architect, Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh at uk.ibm.com
tel:+44-1962-815848
----- Forwarded by Steve Hanson/UK/IBM on 18/01/2012 16:29 -----

From:   Mike Beckerle <mbeckerle.dfdl at gmail.com>
To:     Steve Hanson/UK/IBM at IBMGB
Cc:     dfdl-wg at ogf.org
Date:   17/01/2012 19:07
Subject:        Issue 162: Re: [DFDL-WG] Proposed spec errata for 
calendarPattern I and T symbols



I reviewed this. The proposal looks fine as-is.  I don't think the lack of 
gYear and gYearMonth types should affect the functionality offered. We 
left those out not because people don't want formats like that, but 
because we can handle those formats even without the additional types.

Do we need to clarify that the longest granularity is tried first, and 
this is greedy parsing? If the data matches one of the granularities there 
will be no backtracking to try others.



Proposal 

Here is what is proposed to correct this. Note the T symbol is dropped. It 
was introduced to allow xs:dateTime to expect just a time, but that is not 
necessary (it was copied from IBM MRM). I've stated any alternatives that 
we can consider at the end. 


Symbol   Meaning                               Presentation   Example
I        ISO8601 Date/Time                     (Text)         2006-10-07T12:06:56.568+01:00
IU       ISO8601 Date/Time (with output "Z"    (Text)         2006-10-07T12:06:56.568Z
         if the time zone is +00:00)
The 'I' symbol must not be used with any symbol other than the 'escape 
for text' symbol. It represents calendar formats that match those defined 
in the restricted profile of the ISO8601 standard proposed by the W3C at 
http://www.w3.org/TR/NOTE-datetime. The formats are referred to as 
'granularities'. 
xs:dateTime. When parsing, the data must match one of the granularities. 
When unparsing, the fullest granularity is used. 
xs:date. When parsing, the data must match one of the date-only 
granularities. When unparsing, the fullest date-only granularity is used. 
'IU' is permitted for xs:date but the 'U' is effectively ignored as there 
is no time zone in the date-only granularities. 
xs:time. When parsing, the data must match only the time components of one 
of the granularities that contains time components. When unparsing, the 
time components of the fullest granularity are used. The literal 'T' 
character is not expected in the data when parsing and is not output when 
unparsing. 
The number of fractional second digits supported is implementation 
dependent but must be at least one.   
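
To make the difference between 'I' and 'IU' on unparsing concrete, here is a small Python sketch for xs:dateTime. The function name is hypothetical, three fractional-second digits is an arbitrary choice within the "at least one" rule, and emitting "+00:00" for plain 'I' is my reading of the table rather than something stated explicitly.

```python
from datetime import datetime, timedelta, timezone


def unparse_datetime(value: datetime, pattern: str) -> str:
    """Write the fullest granularity; 'IU' writes 'Z' for a +00:00 offset."""
    if pattern not in ("I", "IU"):
        raise ValueError(f"unsupported pattern: {pattern!r}")
    if value.tzinfo is None:
        # Assumption for this sketch: only time-zone-aware values are handled.
        raise ValueError("value must carry a time zone")
    # Assumption: three fractional-second digits.
    text = value.isoformat(timespec="milliseconds")
    if pattern == "IU" and value.utcoffset() == timedelta(0):
        text = text[: -len("+00:00")] + "Z"
    return text
```

With a value of 12:06:56.568 on 2006-10-07 at offset +01:00, both patterns give 2006-10-07T12:06:56.568+01:00; at offset +00:00, 'IU' ends in "Z" while 'I' ends in "+00:00".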
For a granularity that omits components, when parsing, the values for the 
omitted components are supplied from the Unix epoch 
1970-01-01T00:00:00.000.
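
A minimal Python sketch of the epoch-default rule, limited to the three date-only granularities. The names are hypothetical, and the value is kept without a time zone because the epoch as written above has none.

```python
from datetime import datetime

# Unix epoch as given in the proposal: 1970-01-01T00:00:00.000
EPOCH = datetime(1970, 1, 1, 0, 0, 0, 0)

FIELD_NAMES = ("year", "month", "day")
FIELD_WIDTHS = [4, 2, 2]


def parse_date_granularity(text: str) -> datetime:
    """Parse 'YYYY', 'YYYY-MM' or 'YYYY-MM-DD', filling the rest from the epoch."""
    fields = text.split("-")
    well_formed = (
        1 <= len(fields) <= 3
        and [len(f) for f in fields] == FIELD_WIDTHS[: len(fields)]
        and all(f.isascii() and f.isdigit() for f in fields)
    )
    if not well_formed:
        raise ValueError(f"processing error: {text!r} matches no date granularity")
    supplied = {name: int(f) for name, f in zip(FIELD_NAMES, fields)}
    # Components absent from the data keep their epoch values.
    # replace() raises ValueError for out-of-range values such as month 13.
    return EPOCH.replace(**supplied)
```

So "2006" is read as 2006-01-01T00:00:00.000 and "2006-10" as 2006-10-01T00:00:00.000.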

Alternatives: 

As above but don't support the first two granularities ('Year', 'Year and 
month') on the grounds that they are really matching xs:gYear and 
xs:gYearMonth. 


Regards

Steve Hanson
Architect, Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh at uk.ibm.com
tel:+44-1962-815848 





Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 
741598. 
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU 







-- 
Mike Beckerle | OGF DFDL WG Co-Chair 
Tel:  781-330-0412






