[DFDL-WG] Action 284 - agenda item on ICU 'S' symbol

Steve Hanson smh at uk.ibm.com
Tue Aug 25 06:54:00 EDT 2015


Section 13.11.1 to be updated as follows:

S   fractional second (see note 1)   Number   S     2
                                              SS    23
                                              SSS   235

The count of pattern letters determines the format as indicated in the 
table.   <---- moved earlier 

When numeric fields abut one another directly, with no intervening 
delimiter characters, they constitute a run of abutting numeric fields. 
Such runs are parsed specially as described at [ICUDateTime].

Unlike other fields, fractional seconds "S" are padded on the right with 
zero.  <--- moved earlier

Any number of "S" symbols may be specified in the pattern. Implementations 
must accept any number of "S" symbols and must support at least 
millisecond accuracy. When the number of "S" symbols exceeds the supported 
accuracy, excess fractional seconds are truncated from the right (not 
rounded) when parsing, and zeros are added to the right when unparsing. 
For example, for xs:time with dfdl:calendarPattern "ss.SSSS" and 
millisecond accuracy, parsing data "12.3456" creates infoset value 
"00:00:12.345", which when unparsed creates data "12.3450".

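A minimal Python sketch of the truncate-on-parse and pad-on-unparse rule in the proposed text, included only as an illustration. The function names and the idea of holding the stored fraction as a digit string are assumptions made for this example; they are not taken from the DFDL spec or from ICU.

```python
def parse_fraction(digits: str, accuracy: int = 3) -> str:
    """Reduce parsed fractional-second digits to the supported accuracy.

    `accuracy` is the number of fractional digits the (hypothetical)
    implementation can hold; 3 means millisecond accuracy.
    """
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError("fractional seconds must be one or more ASCII digits")
    # Excess digits are dropped from the right (no rounding); a shorter
    # value is padded on the right with zeros.
    return digits[:accuracy].ljust(accuracy, "0")


def unparse_fraction(stored: str, symbol_count: int) -> str:
    """Write the stored fraction using `symbol_count` "S" pattern letters."""
    # Fewer letters than stored digits: cut to the count of letters.
    # More letters than stored digits: append zeros on the right.
    return stored[:symbol_count].ljust(symbol_count, "0")


# The example from the proposed text: pattern "ss.SSSS", millisecond accuracy.
stored = parse_fraction("3456")          # "345"
written = unparse_fraction(stored, 4)    # "3450"
```

With a stored fraction of "235", one, two and three letters give "2", "23" and "235", matching the table row above.
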
Regards
 
Steve Hanson
Architect, IBM DFDL
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh at uk.ibm.com
tel:+44-1962-815848



From:   Steve Hanson/UK/IBM
To:     Andrew Edwards/UK/IBM at IBMGB
Cc:     DFDL-WG <dfdl-wg at ogf.org>
Date:   12/08/2015 09:06
Subject:        Re: [DFDL-WG] OGF DFDL WG Call Agenda 2015-08-11 - agenda 
item on ICU 'S' symbol


Andy - yes that is the behaviour I am seeing. 

Regards
 
Steve Hanson
Architect, IBM DFDL
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh at uk.ibm.com
tel:+44-1962-815848




From:   Andrew Edwards/UK/IBM
To:     Steve Hanson/UK/IBM at IBMGB
Cc:     DFDL-WG <dfdl-wg at ogf.org>
Date:   11/08/2015 17:28
Subject:        Re: [DFDL-WG] OGF DFDL WG Call Agenda 2015-08-11 - agenda 
item on ICU 'S' symbol


Following on from today's call, the relevant piece of documentation is in 
http://icu-project.org/apiref/icu4c/classicu_1_1SimpleDateFormat.html

When numeric fields abut one another directly, with no intervening 
delimiter characters, they constitute a run of abutting numeric fields. 
Such runs are parsed specially. For example, the format "HHmmss" parses 
the input text "123456" to 12:34:56, parses the input text "12345" to 
1:23:45, and fails to parse "1234". In other words, the leftmost field of 
the run is flexible, while the others keep a fixed width. If the parse 
fails anywhere in the run, then the leftmost field is shortened by one 
character, and the entire run is parsed again. This is repeated until 
either the parse succeeds or the leftmost field is one character in 
length. If the parse still fails at that point, the parse of the run 
fails. 

So it seems that when the 'S' is next to other numeric units in the 
pattern, it will be subject to the above behaviour.  Therefore:
 - A pattern of HHmmssSSS with the input 112233123 will become 
11:22:33.123 but the input 1122331234 will trigger an error.
 - If the pattern includes a '.' to become HHmmss.SSS, I think the input 
1122331234 will become 11:22:33.123 but I'll try and confirm.
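
The rule quoted above from the ICU documentation can be modelled with a short Python sketch. This is a simplified model of the described algorithm, not ICU code: the function name is hypothetical, it does no range checking of the parsed values, and it only reports how many characters were consumed, leaving the treatment of any trailing text to the caller.

```python
from typing import List, Optional, Tuple


def parse_abutting_run(text: str, widths: List[int]) -> Optional[Tuple[List[int], int]]:
    """Parse a run of abutting numeric fields.

    `widths` holds the pattern letter count of each field in the run,
    e.g. [2, 2, 2] for "HHmmss". Returns (field values, characters
    consumed), or None if the run cannot be parsed.
    """
    if not widths:
        raise ValueError("a run needs at least one field")
    # The leftmost field is flexible: start at its full width and shorten
    # it by one character after each failed attempt, down to one character.
    for first_width in range(widths[0], 0, -1):
        values = []
        pos = 0
        for width in [first_width] + list(widths[1:]):
            chunk = text[pos:pos + width]
            if len(chunk) < width or not (chunk.isascii() and chunk.isdigit()):
                break  # this attempt failed; retry with a shorter first field
            values.append(int(chunk))
            pos += width
        else:
            return values, pos
    return None


print(parse_abutting_run("123456", [2, 2, 2]))         # ([12, 34, 56], 6)
print(parse_abutting_run("12345", [2, 2, 2]))          # ([1, 23, 45], 5)
print(parse_abutting_run("1234", [2, 2, 2]))           # None
print(parse_abutting_run("112233123", [2, 2, 2, 3]))   # ([11, 22, 33, 123], 9)
```

For the input 1122331234 with widths [2, 2, 2, 3] this model consumes nine characters and leaves one unread; how that leftover character is reported is outside the sketch.
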

Steve - Does that description match what you were seeing?

HTH,
Andy 
Andy Edwards - IBM Integration Bus - DFDL

Email:
andy.edwards at uk.ibm.com
Snail Mail: 
MP211, Hursley park, Hursley, WINCHESTER, Hants, SO21 2JN
Tel int:
247222
Tel ext:
+44 (0)1962 817222
Desk:
DE3 V17

The Feynman problem solving Algorithm
  1) Write down the problem
  2) Think real hard
  3) Write down the answer
 -- Murray Gell-Mann in the NY Times





From:   Steve Hanson/UK/IBM
To:     Andrew Edwards/UK/IBM at IBMGB
Cc:     DFDL-WG <dfdl-wg at ogf.org>
Date:   11/08/2015 12:53
Subject:        Re: [DFDL-WG] OGF DFDL WG Call Agenda 2015-08-11 - agenda 
item on ICU 'S' symbol


Hi Andy

Your internal ticket #630 gave rise to external ticket 
http://bugs.icu-project.org/trac/ticket/10962, which claims to have fixed 
the API docs to clarify the behaviour.

S   fractional second - truncates (like other time fields)   S      2
    to the count of letters when formatting. Appends         SS     23
    zeros if more than 3 letters specified. Truncates at     SSS    235
    three significant digits when parsing.                   SSSS   2350

I can't see anywhere that addresses your point about abutting versus 
non-abutting numeric symbols though?

As far as DFDL spec is concerned, this is what we say today:

S   fractional second (see note 1)   Number   S     2
                                              SS    24
                                              SSS   235

There is no 'note 1', I think the note was made into a normal paragraph, 
which reads:

Any number of fractional seconds "S" may by specified in the pattern and 
accepted by implementations, but an implementation is free to represent a 
limited number of fractional seconds internally. Excess fractional seconds 
are truncated, not rounded up. At least millisecond accuracy must be 
implemented. Unlike other fields, fractional seconds are padded on the 
right with zero.

Regards
 
Steve Hanson
Architect, IBM DFDL
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh at uk.ibm.com
tel:+44-1962-815848




From:   Andrew Edwards/UK/IBM
To:     Steve Hanson/UK/IBM at IBMGB
Date:   11/08/2015 11:56
Subject:        Re: [DFDL-WG] OGF DFDL WG Call Agenda 2015-08-11


Hi Steve

Re agenda item 2 and calendar patterns with 'S', this ICU ticket from last 
year might be relevant - https://icu.sanjose.ibm.com/gcoctrac/ticket/630. 
It seems that the error reporting may also depend on whether the pattern 
has 'S' on its own or next to other numeric pattern entities, i.e. 
'HHmmssS' is subject to length checking, but 'HHmmss S' is not, due to the 
space before the 'S'.

HTH,
Andy 
Andy Edwards - IBM Integration Bus - DFDL






From:   Steve Hanson/UK/IBM at IBMGB
To:     dfdl-wg at ogf.org
Cc:     Mike Beckerle <mbeckerle at tresys.com>, jorge.marizan at gmail.com
Date:   10/08/2015 18:29
Subject:        [DFDL-WG] OGF DFDL WG Call Agenda 2015-08-11
Sent by:        dfdl-wg-bounces at ogf.org



Please find agenda for call on Redmine at 
https://redmine.ogf.org/dmsf_files/13489?download= 

Regards

Steve Hanson
Architect, IBM Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh at uk.ibm.com
tel:+44-1962-815848 
--
  dfdl-wg mailing list
  dfdl-wg at ogf.org
  https://www.ogf.org/mailman/listinfo/dfdl-wg

Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 
741598. 
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
