[DFDL-WG] Action 284 - agenda item on ICU 'S' symbol

Mike Beckerle mbeckerle.dfdl at gmail.com
Tue Aug 25 09:10:08 EDT 2015


Do we really want to allow "any number of S"?

According to quantum mechanics, based on Planck's constant and the speed of
light, the smallest unit of time is about 3.3x10^-44 seconds, so there's never
going to be a need for more than 45 S's in this universe, at least until
time-travel is discovered. (http://www.physlink.com/Education/AskExperts/ae598.cfm)

This is a place where an "implementation specific maximum" makes sense to
me, though I'd be happy to put a floor under it like not less than 6 "S".
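
A rough sketch of what such a check might look like, in Python. The names
MIN_SUPPORTED_S, IMPL_MAX_S and check_fraction_width are made up for
illustration, and the maximum of 9 is only an assumed value, not something
any implementation or the spec defines:

```python
MIN_SUPPORTED_S = 6  # floor suggested above: every implementation accepts at least 6 "S"
IMPL_MAX_S = 9       # hypothetical implementation-specific maximum (assumed value)


def check_fraction_width(pattern: str, impl_max: int = IMPL_MAX_S) -> int:
    """Return the number of 'S' letters in pattern, or raise ValueError.

    Simplification: counts every 'S' in the string and ignores quoted
    literal text that a real calendar pattern may contain.
    """
    if impl_max < MIN_SUPPORTED_S:
        raise ValueError("implementation maximum is below the required floor")
    count = pattern.count("S")
    if count > impl_max:
        raise ValueError(f"pattern has {count} 'S' letters, maximum is {impl_max}")
    return count
```

With these assumed limits, "HHmmss.SSS" is accepted and a pattern with ten
"S" letters is rejected as a schema error rather than silently truncated.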


Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology |
www.tresys.com
Please note: Contributions to the DFDL Workgroup's email discussions are
subject to the OGF Intellectual Property Policy
<http://www.ogf.org/About/abt_policies.php>


On Tue, Aug 25, 2015 at 6:54 AM, Steve Hanson <smh at uk.ibm.com> wrote:

> Section 13.11.1 to be updated as follows:
>
> S    fractional second (see note 1)    Number    S      2
>                                                  SS     23
>                                                  SSS    235
>
>
> The count of pattern letters determines the format as indicated in the
> table.   <---- moved earlier
>
> When numeric fields abut one another directly, with no intervening
> delimiter characters, they constitute a run of abutting numeric fields.
> Such runs are parsed specially as described at [*ICUDateTime*].
>
> Unlike other fields, fractional seconds "S" are padded on the right with
> zero.  <--- moved earlier
>
> Any number of "S" symbols may be specified in the pattern. Implementations
> must accept any number of "S" symbols and must support at least millisecond
> accuracy. When the number of "S" symbols exceeds the supported accuracy, excess
> fractional seconds are truncated from the right (not rounded) when
> parsing, and zeros are added to the right when unparsing. For example, for
> xs:time with dfdl:calendarPattern "ss.SSSS" and millisecond accuracy,
> parsing data "12.3456" creates infoset value "00:00:12.345", which when
> unparsing creates data "12.3450".
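
The truncate-on-parse, pad-on-unparse rule in the proposed text can be
sketched roughly as follows in Python. This is only an illustration under
simplifying assumptions: it handles just the seconds and fraction portion
("ss.SSSS"), and parse_seconds and unparse_seconds are hypothetical helper
names, not part of any DFDL implementation:

```python
def parse_seconds(text: str, supported_digits: int = 3) -> str:
    """Parse data such as "12.3456", keeping only the supported fraction digits.

    Excess digits are truncated from the right, never rounded.
    """
    whole, _, frac = text.partition(".")
    if not (whole.isdigit() and frac.isdigit()):
        raise ValueError(f"not a seconds value: {text!r}")
    kept = frac[:supported_digits].ljust(supported_digits, "0")
    return f"{int(whole):02d}.{kept}"


def unparse_seconds(value: str, s_count: int) -> str:
    """Write a parsed value using s_count 'S' letters.

    Fraction digits are cut to s_count, or padded on the right with zeros.
    """
    whole, _, frac = value.partition(".")
    return f"{whole}.{frac[:s_count].ljust(s_count, '0')}"
```

With millisecond accuracy, parse_seconds("12.3456") keeps "12.345", and
unparse_seconds("12.345", 4) writes "12.3450", matching the example in the
proposed text.
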
>
> Regards
>
> Steve Hanson
> Architect, *IBM DFDL*
> <http://www.ibm.com/developerworks/library/se-dfdl/index.html>
> Co-Chair, *OGF DFDL Working Group* <http://www.ogf.org/dfdl/>
> IBM SWG, Hursley, UK
> *smh at uk.ibm.com* <smh at uk.ibm.com>
> tel:+44-1962-815848
>
>
>
> From:        Steve Hanson/UK/IBM
> To:        Andrew Edwards/UK/IBM at IBMGB
> Cc:        DFDL-WG <dfdl-wg at ogf.org>
> Date:        12/08/2015 09:06
> Subject:        Re: [DFDL-WG] OGF DFDL WG Call Agenda 2015-08-11 - agenda
> item on ICU 'S' symbol
> ------------------------------
>
>
> Andy - yes that is the behaviour I am seeing.
>
> Regards
>
> Steve Hanson
>
>
>
>
> From:        Andrew Edwards/UK/IBM
> To:        Steve Hanson/UK/IBM at IBMGB
> Cc:        DFDL-WG <dfdl-wg at ogf.org>
> Date:        11/08/2015 17:28
> Subject:        Re: [DFDL-WG] OGF DFDL WG Call Agenda 2015-08-11 - agenda
> item on ICU 'S' symbol
> ------------------------------
>
>
> Following on from today's call, the relevant piece of documentation is in
> http://icu-project.org/apiref/icu4c/classicu_1_1SimpleDateFormat.html
>
> *When numeric fields abut one another directly, with no intervening
> delimiter characters, they constitute a run of abutting numeric fields.
> Such runs are parsed specially. For example, the format "HHmmss" parses the
> input text "123456" to 12:34:56, parses the input text "12345" to 1:23:45,
> and fails to parse "1234". In other words, the leftmost field of the run is
> flexible, while the others keep a fixed width. If the parse fails anywhere
> in the run, then the leftmost field is shortened by one character, and the
> entire run is parsed again. This is repeated until either the parse
> succeeds or the leftmost field is one character in length. If the parse
> still fails at that point, the parse of the run fails. *
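
The retry behaviour described in the ICU text above can be modelled with a
small Python sketch. This is not ICU's code: parse_abutting_run is a
hypothetical helper, the field widths are passed in as plain integers, and
the sketch deliberately ignores value-range checks and any input left over
after the run:

```python
from typing import List, Optional, Sequence


def parse_abutting_run(widths: Sequence[int], text: str) -> Optional[List[int]]:
    """Parse a run of abutting numeric fields.

    widths holds the pattern letter count of each field, e.g. [2, 2, 2]
    for "HHmmss". The leftmost field is flexible; the others are fixed.
    Returns the field values, or None if the run cannot be parsed.
    """
    first = widths[0]
    while first >= 1:
        fields: List[int] = []
        pos = 0
        ok = True
        for width in [first, *widths[1:]]:
            chunk = text[pos:pos + width]
            if len(chunk) != width or not chunk.isdigit():
                ok = False
                break
            fields.append(int(chunk))
            pos += width
        if ok:
            return fields
        first -= 1  # shorten the leftmost field by one character and retry
    return None
```

For "HHmmss" this gives [12, 34, 56] for "123456", [1, 23, 45] for "12345",
and None for "1234", the same three outcomes the ICU text lists.
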
>
> So it seems that when the 'S' is next to other numeric units in the
> pattern, it will be subject to the above behaviour.  Therefore:
>  - A pattern of HHmmssSSS with the input 112233123 will become
> 11:22:33.123 but the input 1122331234 will trigger an error.
>  - If the pattern includes a '.' to become HHmmss.SSS, I think the input
> 1122331234 will become 11:22:33.123 but I'll try and confirm.
>
> Steve - Does that description match what you were seeing?
>
> HTH,
> Andy
>
> *Andy Edwards* - *IBM Integration Bus*
> <http://www-03.ibm.com/software/products/us/en/integration-bus> - *DFDL*
> <https://w3-connections.ibm.com/wikis/home?lang=en-gb#!/wiki/IBM%20Data%20Format%20Description%20Language>
> *Email:* *andy.edwards at uk.ibm.com* <andy.edwards at uk.ibm.com> *Snail Mail:*
>   MP211, Hursley park, Hursley, WINCHESTER, Hants, SO21 2JN *Tel int:*
> 247222 *Tel ext:* +44 (0)1962 817222 *Desk:* DE3 V17
> *The Feynman problem solving Algorithm*
>  1) Write down the problem
>  2) Think real hard
>  3) Write down the answer
> -- Murray Gell-Mann in the NY Times
>
>
>
>
>
> From:        Steve Hanson/UK/IBM
> To:        Andrew Edwards/UK/IBM at IBMGB
> Cc:        DFDL-WG <dfdl-wg at ogf.org>
> Date:        11/08/2015 12:53
> Subject:        Re: [DFDL-WG] OGF DFDL WG Call Agenda 2015-08-11 - agenda
> item on ICU 'S' symbol
> ------------------------------
>
>
> Hi Andy
>
> Your internal ticket #630 gave rise to external ticket
> http://bugs.icu-project.org/trac/ticket/10962, which claims to have fixed
> the API docs to clarify the behaviour.
> S    fractional second - truncates (like other time fields) to the count
>      of letters when formatting. Appends zeros if more than 3 letters
>      specified. Truncates at three significant digits when parsing.
>
>      S       2
>      SS      23
>      SSS     235
>      SSSS    2350
>
> I can't see anywhere that addresses your point about abutting versus
> non-abutting numeric symbols though?
>
> As far as DFDL spec is concerned, this is what we say today:
>
> S    fractional second (see note 1)    Number    S      2
>                                                  SS     24
>                                                  SSS    235
>
>
>
> There is no 'note 1'; I think the note was made into a normal paragraph,
> which reads:
>
> *Any number of fractional seconds "S" may be specified in the pattern and
> accepted by implementations, but an implementation is free to represent a
> limited number of fractional seconds internally. Excess fractional seconds
> are truncated, not rounded up. At least millisecond accuracy must be
> implemented. Unlike other fields, fractional seconds are padded on the
> right with zero.*
>
> Regards
>
> Steve Hanson
>
>
>
>
> From:        Andrew Edwards/UK/IBM
> To:        Steve Hanson/UK/IBM at IBMGB
> Date:        11/08/2015 11:56
> Subject:        Re: [DFDL-WG] OGF DFDL WG Call Agenda 2015-08-11
> ------------------------------
>
>
> Hi Steve
>
> Re agenda item 2 and calendar patterns with 'S', this ICU ticket from last
> year might be relevant - https://icu.sanjose.ibm.com/gcoctrac/ticket/630.
> It seems that the error reporting may also depend on whether the pattern
> has 'S' on its own or next to other numeric pattern entities, i.e.
> 'HHmmssS' is subject to length checking, but 'HHmmss S' is not, due to the
> space before the 'S'.
>
> HTH,
> Andy
>
>
>
>
>
> From:        Steve Hanson/UK/IBM at IBMGB
> To:        dfdl-wg at ogf.org
> Cc:        Mike Beckerle <mbeckerle at tresys.com>, jorge.marizan at gmail.com
> Date:        10/08/2015 18:29
> Subject:        [DFDL-WG] OGF DFDL WG Call Agenda 2015-08-11
> Sent by:        dfdl-wg-bounces at ogf.org
> ------------------------------
>
>
>
> Please find agenda for call on Redmine at
> *https://redmine.ogf.org/dmsf_files/13489?download=*
> <https://redmine.ogf.org/dmsf_files/13489?download=>
>
> Regards
>
> Steve Hanson
> --
>  dfdl-wg mailing list
>  dfdl-wg at ogf.org
>  https://www.ogf.org/mailman/listinfo/dfdl-wg
>
> Unless stated otherwise above:
> IBM United Kingdom Limited - Registered in England and Wales with number
> 741598.
> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
>
>