[jsdl-wg] (Param Sweep) - Stephen Pickles' public comment

Mon Jan 19 08:33:08 CST 2009

An afterthought:

I told a small lie in my last posting when I said that all the problems
of (start, end, increment) loops go away in floating-point decimal.

The problems that remain are exactly the ones that led to the
deprecation of loops with real limits and stride in Fortran 90.
They are a consequence of being limited to finite precision.

Consider, for example, a loop where the start value is very large
(e.g. near the maximum supported by the precision) and the
increment value is very small. You can have a situation where
adding the small value to the large value won't change the large
value in finite precision.
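
A minimal sketch of this absorption effect, using Python floats (IEEE
binary doubles on common platforms). The values 1.0e16 and 1.0 are
illustrative assumptions, not taken from the draft, and math.ulp needs
Python 3.9 or later:

```python
import math

start = 1.0e16      # illustrative start value, assumed for this sketch
increment = 1.0     # illustrative increment, small relative to start

# Spacing between adjacent doubles at the magnitude of start.
spacing = math.ulp(start)

# The increment is only half that spacing, and the tie rounds back to start,
# so adding it leaves the loop variable unchanged.
absorbed = (start + increment == start)

print(spacing)   # 2.0
print(absorbed)  # True
```

The same effect occurs in decimal arithmetic once the number of
significant digits is bounded; only the threshold moves.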

Round-off can also break the ideal behaviour of this style of loop
in other ways, e.g. computing the trip count (i.e. cardinality) from the
start, end, and increment can fail to give the same trip count as actually
executing the loop.
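
As a hedged illustration, the sketch below compares a closed-form trip
count with the count obtained by running the loop. The formula
floor((end - start) / increment) + 1 and the repeated-addition loop are
assumptions about how an implementation might proceed, and the limits
0.0, 0.7 and 0.1 are chosen only for the example:

```python
import math
from decimal import Decimal

def predicted_count(start, end, inc):
    """Trip count from the closed form floor((end - start) / inc) + 1."""
    return math.floor((end - start) / inc) + 1

def executed_count(start, end, inc):
    """Trip count from actually running the loop by repeated addition."""
    n, x = 0, start
    while x <= end:
        n += 1
        x += inc
    return n

# Binary doubles: the two counts disagree for 0.0 to 0.7 in steps of 0.1.
binary_counts = (predicted_count(0.0, 0.7, 0.1), executed_count(0.0, 0.7, 0.1))

# Decimal arithmetic on the same decimal strings: the two counts agree.
d = [Decimal(s) for s in ("0.0", "0.7", "0.1")]
decimal_counts = (predicted_count(*d), executed_count(*d))

print(binary_counts)   # (7, 8)
print(decimal_counts)  # (8, 8)
```

In binary, 0.7 / 0.1 evaluates to just under 7, so the formula predicts
one iteration fewer than the loop actually performs.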

Of course, you can also construct a pathological loop by specifying
start, increment and count. If increment is too small with respect to
start, you generate count identical values.

So a good quality implementation will need to defend itself against
pathological loop limits. This will be true to some extent regardless
of whether you choose option (B) or (C) or something else.

I therefore suggest providing for an "insufficient precision" fault-type in
the specification of the JSDL parameter sweep extension.

It can be difficult to tell in advance if a loop will be pathological.

Some checks suggest themselves. If the loop generates two
consecutive identical values the implementation should terminate
the loop and throw a fault. If the loop generates a different cardinality
than the pre-computed value, the implementation should throw a fault.
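
The two checks could look roughly like the sketch below. It is only an
illustration: the fault class name, the function name and the use of a
reduced-precision decimal context to stand in for an implementation's
digit limit are all assumptions, and only positive increments are
handled:

```python
from decimal import Decimal, Context

class InsufficientPrecisionFault(Exception):
    """Hypothetical fault type; the name is an assumption of this sketch."""

def checked_loop(start, end, inc, ctx):
    """Generate start, start+inc, ... up to end (inc > 0), applying both checks.

    ctx models the limited working precision of the implementation.
    """
    # Pre-computed cardinality, evaluated exactly for these small inputs.
    expected = int((end - start) // inc) + 1
    values, x = [], ctx.plus(start)
    while x <= end:
        values.append(x)
        nxt = ctx.add(x, inc)            # addition rounded to ctx.prec digits
        if nxt == x:                     # check 1: consecutive identical values
            raise InsufficientPrecisionFault("consecutive identical values")
        x = nxt
    if len(values) != expected:          # check 2: cardinality mismatch
        raise InsufficientPrecisionFault("cardinality differs from pre-computed value")
    return values

# A well-behaved loop passes both checks and yields 8 values.
ok = checked_loop(Decimal("0.0"), Decimal("0.7"), Decimal("0.1"), Context(prec=5))
print(len(ok))  # 8
```

With a five-digit context, a loop from 100000 in steps of 1 trips the
first check immediately; with a three-digit context, a loop from 9.9 to
10.15 in steps of 0.06 trips the second.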

A numerical analyst could work out a better set of checks.

Best wishes,

Stephen

2009/1/19 Stephen Pickles <stephen.m.pickles at googlemail.com>:
> Dear All,
>
> To cut a long story short, I think that Geoff is talking about a specification
> of LoopDouble that is subtly different to the one that went to public comment.
>
> However, I also think that the specification that Geoff is defending
> (and has implemented) is actually BETTER than the published draft.
>
> I shall first re-iterate why I believe the current specification to be flawed.
> Then I shall try to explain the kind of changes that would be needed to fix it.
>
> Bear with me.
>
> Why the current specification is flawed:
> ----------------------------------------------------
>
> In section 4.3, which is entitled, significantly, "LoopDouble", we read:
>
> <blockquote>
> The LoopDouble function provides an ordered list of double datatype values
> (corresponding to the IEEE double-precision 64-bit floating point type)...
> </blockquote>
>
> and later:
>
> <blockquote>
> The permissible value range for xsd:double is defined as consisting of
> "... the values m × 2^e, where m is an integer whose absolute value is less
> than 2^53, and e is an integer between -1075 and 970, inclusive."
> </blockquote>
>
> What all this implies (at least to this reader) is that the LoopDouble
> function deals with _binary_ 64-bit floating point numbers.
> It would seem therefore, taking the specification at face value, that
> an implementor of LoopDouble is allowed, even encouraged, to perform
> the necessary calculations in IEEE binary floating point arithmetic.
> It follows that conversion between the decimal string representations
> (xsd:double) and the nearest exact binary floating point must (or at least,
> is allowed to) happen in both directions. Now, as it is rather exceptional
> for any floating point decimal to have an exact representation in floating
> point binary (and vice versa), you end up with the problems I mentioned
> in my public comments and at a previous JSDL conference call, i.e.
> (1) how to determine cardinality
> (2) how to decide whether an exception "hits" the list generated by LoopDouble
> not to mention problems with reproducibility (across different implementations)
> that the slightest departure from IEEE compliance can cause.
>
> These problems have nothing to do with the accuracy of binary floating point
> arithmetic - only in infinite precision can you do exact conversions between
> base 2 and base 10 floating point numbers.
>
> Geoff's comments don't change my position on this. As long as the specification
> is dealing explicitly or implicitly with binary floating point numbers, you need
> to change something to recover certainty about cardinality and reproducibility
> of exception-matching. My suggestion, which Geoff calls option (B), addresses
> the cardinality problem only.
>
> Why Geoff's way is better
> ----------------------------------
>
> All the aforesaid problems arise from conversion between decimal and
> binary floating point. They go away if the xsd:double values in LoopDouble
> (i.e. both the input parameters and the ordered list of values it returns) are
> interpreted as exact floating point decimal values, and the arithmetic is
> performed (with sufficient precision) also in floating point decimal.
>
> This offers an alternative resolution to option (B), i.e. to prescribe
> the use of _decimal_ floating point arithmetic in LoopDouble.
>
> Let me call this option (C).
>
> Is the JSDL working group willing to prescribe the use of decimal arithmetic?
>
> Clearly, Java implementors should have no problem. Implementors in other
> languages should be OK if they have an equivalent of Java's BigDecimal class.
> Even without such library support, it should not be too difficult to implement
> using only integer arithmetic - you don't need the full armoury of arithmetic
> operations; just extended precision addition and/or subtraction should be
> sufficient. I've done such things in the past.
>
> How to fix the spec for option (C)
> --------------------------------------------
>
> These are my suggestions for changes to the specification to
> bring it in line with what Geoff has presumably implemented.
>
> 1. Use language that makes it explicit that the values involved
> are exact floating-point decimals, and that arithmetic MUST
> be performed in floating-point decimal. Get rid of all mentions
> of "double" (which conveys 64-bit base-2 floating point arithmetic
> to most readers).
>
> Of course, a user of the parameter sweep extension still can and will
> convert (approximately) the decimal string into some base-2
> internal representation once it reaches their application code.
> What they do with it is up to them.
>
> 2. Change the name of the LoopDouble function to LoopDecimal.
> (Sorry, Geoff, you would need to change SOME code, even if
> it's just a tag name.)
>
> 3. Include an informational reference to IEEE 754-2008.
>
> 4. You could choose to limit the number of significant decimal digits that
> an implementation must support.
>
> 5. You can retain xsd:double to constrain the allowable parameters
> and return values. Although xsd:double was designed as an approximate
> decimal string representation of some other binary floating point number
> (see e.g. http://books.xmlschemata.org/relaxng/ch19-77065.html),
> it has a format that's (almost) exactly what you want.
>
> (Although, to be pedantic, it would be wise to constrain it further to
> disallow NaN, INF, and -INF as loop start, end and increment values.)
>
> 6. Tighten up the language on termination conditions and on negative
> increments. Disallow zero increments. See my public comments.
>
> 7. Once it's clear that you're working in the realm of floating point decimals,
> you can determine the trip count of an option (A) style loop by inspection.
> I strongly recommend that you include in the specification a formula
> for calculating the loop cardinality. I leave this as an exercise for the
> reader (hint: see the Fortran specification).
>
> I hope this is clear. I hope it helps.
>
> Stephen
>
> PS I noticed some other errors while re-reading the spec:
> (i) The mention of xsd:decimal in section 4.3 is spurious, misleading
> and actually incorrect.
> (ii) There's a lot of jumping over "lay dogs" in the examples, where
> "lazy dogs" is surely intended.
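
The quoted remark that option (C) needs nothing beyond integer addition
and subtraction can be sketched as follows. This is an illustration
under stated assumptions, not the specified algorithm: the helper names
are hypothetical, inputs are plain decimal strings without exponents,
only positive increments are handled, and the trip count formula
floor((end - start) / increment) + 1 is one reading of the Fortran
DO-loop rule.

```python
def fraction_digits(text):
    """Number of digits after the decimal point in a plain decimal string."""
    return len(text.partition(".")[2])

def to_scaled(text, scale):
    """Convert a plain decimal string to an integer count of 10**-scale units."""
    sign = -1 if text.startswith("-") else 1
    whole, _, frac = text.lstrip("+-").partition(".")
    return sign * int((whole or "0") + frac.ljust(scale, "0"))

def format_scaled(n, scale):
    """Convert a scaled integer back to a decimal string with scale digits."""
    sign = "-" if n < 0 else ""
    digits = str(abs(n)).rjust(scale + 1, "0")
    if scale == 0:
        return sign + digits
    return sign + digits[:-scale] + "." + digits[-scale:]

def loop_decimal(start, end, inc):
    """Exact decimal loop carried out entirely in integer arithmetic."""
    scale = max(fraction_digits(s) for s in (start, end, inc))
    a, b, step = (to_scaled(s, scale) for s in (start, end, inc))
    if step <= 0:
        raise ValueError("this sketch handles positive increments only")
    count = max((b - a) // step + 1, 0)   # trip count known before iterating
    return [format_scaled(a + i * step, scale) for i in range(count)]

print(loop_decimal("0.0", "0.7", "0.1"))
# ['0.0', '0.1', '0.2', '0.3', '0.4', '0.5', '0.6', '0.7']
```

Because Python integers are unbounded, no value is ever rounded here; a
language with fixed-width integers would need an extended-precision
integer type or an explicit limit on the number of digits.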
>