[jsdl-wg] (Param Sweep) - Stephen Pickles' public comment

Mon Jan 19 06:45:03 CST 2009

Dear All,

To cut a long story short, I think that Geoff is talking about a specification
of LoopDouble that is subtly different to the one that went to public comment.

However, I also think that the specification that Geoff is defending
(and has implemented) is actually BETTER than the published draft.

I shall first re-iterate why I believe the current specification to be flawed.
Then I shall try to explain the kind of changes that would be needed to fix it.

Bear with me.

Why the current specification is flawed:
----------------------------------------------------

In section 4.3, which is entitled, significantly, "LoopDouble", we read:

<blockquote>
The LoopDouble function provides an ordered list of double datatype values
(corresponding to the IEEE double-precision 64-bit floating point type)...
</blockquote>

and later:

<blockquote>
The permissible value range for xsd:double is defined as consisting of
"... the values m × 2^e, where m is an integer whose absolute value is less
than 2^53, and e is an integer between -1075 and 970, inclusive"
</blockquote>

What all this implies (at least to this reader) is that the LoopDouble
function deals with _binary_ 64-bit floating point numbers.
It would seem therefore, taking the specification at face value, that
an implementor of LoopDouble is allowed, even encouraged, to perform
the necessary calculations in IEEE binary floating point arithmetic.
It follows that conversion between the decimal string representations
(xsd:double) and the nearest exact binary floating point must (or at least,
is allowed to) happen in both directions. Now, as it is rather exceptional
for any floating point decimal to have an exact representation in floating
point binary (and vice versa), you end up with the problems I mentioned
in my public comments and at a previous JSDL conference call, i.e.
(1) how to determine cardinality
(2) how to decide whether an exception "hits" the list generated by LoopDouble
not to mention problems with reproducibility (across different implementations)
that the slightest departure from IEEE compliance can cause.
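
To make both problems concrete, here is a minimal Python sketch. The function
names and the inclusive "value <= end" termination test are assumptions made
for illustration only; they are not taken from the draft. It generates the same
loop once by repeated addition in binary doubles and once in exact decimal.

```python
from decimal import Decimal

def loop_binary(start, end, step):
    """Repeated addition in IEEE binary doubles (hypothetical helper)."""
    values = []
    x = start
    while x <= end:          # assumed inclusive end condition
        values.append(x)
        x += step
    return values

def loop_decimal(start, end, step):
    """Same loop, but the decimal strings are taken as exact values."""
    values = []
    x, end_d, step_d = Decimal(start), Decimal(end), Decimal(step)
    while x <= end_d:        # assumed inclusive end condition
        values.append(x)
        x += step_d
    return values

binary_values = loop_binary(0.0, 0.3, 0.1)
decimal_values = loop_decimal("0.0", "0.3", "0.1")

# (1) Cardinality: the binary loop overshoots 0.3 on its fourth step.
print(len(binary_values), len(decimal_values))

# (2) Exception matching: does an exception value of 0.3 "hit" the list?
print(0.3 in binary_values, Decimal("0.3") in decimal_values)
```

In binary arithmetic 0.1 + 0.1 + 0.1 lands just above the double nearest to
0.3, so that sweep loses its last value and an exception of 0.3 matches nothing.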

These problems have nothing to do with the accuracy of binary floating point
arithmetic - only in infinite precision can you do exact conversions between
base 2 and base 10 floating point numbers.

Geoff's comments don't change my position on this. As long as the specification
is dealing explicitly or implicitly with binary floating point numbers, you need
to change something to recover certainty about cardinality and reproducibility
of exception-matching. My suggestion, which Geoff calls option (B), addresses
the cardinality problem only.

Why Geoff's way is better
----------------------------------

All the aforesaid problems arise from conversion between decimal and
binary floating point. They go away if the xsd:double values in LoopDouble
(i.e. both the input parameters and the ordered list of values it returns) are
interpreted as exact floating point decimal values, and the arithmetic is
performed (with sufficient precision) also in floating point decimal.

This offers an alternative resolution to option (B), i.e. to prescribe
the use of _decimal_ floating point arithmetic in LoopDouble.

Let me call this option (C).

Is the JSDL working group willing to prescribe the use of decimal arithmetic?

Clearly, Java implementors should have no problem. Implementors in other
languages should be OK if they have an equivalent of Java's BigDecimal class.
Even without such library support, it should not be too difficult to implement
using only integer arithmetic - you don't need the full armoury of arithmetic
operations; just extended precision addition and/or subtraction should be
sufficient. I've done such things in the past.
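
As a rough illustration of the integer-only approach, the following Python
sketch scales all three parameters to a common power of ten and then loops
using nothing but integer addition and comparison. All function names here are
hypothetical, the inclusive end condition is an assumption, and NaN/INF inputs
are simply not handled. Python integers are arbitrary precision, which stands
in for the extended-precision addition mentioned above.

```python
def parse_decimal(text):
    """Return (coefficient, exponent) with value == coefficient * 10**exponent."""
    mantissa, _, exp = text.lower().partition("e")
    sign = -1 if mantissa.startswith("-") else 1
    whole, _, frac = mantissa.lstrip("+-").partition(".")
    digits = int((whole or "0") + frac)
    return sign * digits, (int(exp) if exp else 0) - len(frac)

def format_scaled(n, exp):
    """Render the integer n * 10**exp as a plain decimal string."""
    if exp >= 0:
        return str(n * 10 ** exp)
    sign = "-" if n < 0 else ""
    digits = str(abs(n)).rjust(-exp + 1, "0")
    return f"{sign}{digits[:exp]}.{digits[exp:]}"

def loop_values(start, end, step):
    """Generate the loop values using integer arithmetic only."""
    parsed = [parse_decimal(s) for s in (start, end, step)]
    common = min(e for _, e in parsed)            # finest scale in use
    a, b, d = (c * 10 ** (e - common) for c, e in parsed)
    if d == 0:
        raise ValueError("zero increment")
    out = []
    x = a
    while (x <= b) if d > 0 else (x >= b):        # assumed inclusive end
        out.append(format_scaled(x, common))
        x += d
    return out

print(loop_values("0.0", "0.3", "0.1"))
print(loop_values("1", "0.5", "-0.25"))
```

The results are rendered at the common scale, so trailing zeros may appear
(e.g. "0.50"); whether to trim them is a presentation choice.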

How to fix the spec for option (C)
--------------------------------------------

These are my suggestions for changes to the specification to
bring it in line with what Geoff has presumably implemented.

1. Use language that makes it explicit that the values involved
are exact floating-point decimals, and that arithmetic MUST
be performed in floating-point decimal. Get rid of all mentions
of "double" (which conveys 64-bit base-2 floating point arithmetic
to most readers).

Of course, a user of the parameter sweep extension still can and will
convert (approximately) the decimal string into some base-2
internal representation once it reaches their application code.
What they do with it is up to them.

2. Change the name of the LoopDouble function to LoopDecimal.
(Sorry, Geoff, you would need to change SOME code, even if
it's just a tag name.)

3. Include an informational reference to IEEE 754-2008.

4. You could choose to limit the number of significant decimal digits that
an implementation must support.

5. You can retain xsd:double to constrain the allowable parameters
and return values. Although xsd:double was designed as an approximate
decimal string representation of some other binary floating point number,
see e.g.
  http://books.xmlschemata.org/relaxng/ch19-77065.html
it has a format that's (almost) exactly what you want.

(Although, to be pedantic, it would be wise to constrain it further to
disallow NaN, INF, and -INF as loop start, end and increment values.)

6. Tighten up the language on termination conditions and negative
increments. Disallow zero increments. See my public comments.

7. Once it's clear that you're working in the realm of floating point decimals,
you can determine the trip count of an option (A) style loop by inspection.
I strongly recommend that you include in the specification a formula
for calculating the loop cardinality. I leave this as an exercise for the
reader (hint: see the Fortran specification).
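
For readers who want the hint spelled out: the Fortran DO construct defines
the iteration count as MAX(INT((m2 - m1 + m3) / m3), 0), where m1, m2 and m3
are the start, end and increment. The Python sketch below applies that formula
to exact decimal values. It is only a sketch under assumptions: the function
name is hypothetical, it assumes the loop includes the end value when it is
reached exactly (as in Fortran), and it rejects NaN, INF, -INF and zero
increments as suggested in points 5 and 6.

```python
from decimal import Decimal
from fractions import Fraction
from math import floor

def trip_count(start, end, step):
    """Loop cardinality from decimal strings, computed exactly."""
    values = [Decimal(s) for s in (start, end, step)]
    if not all(v.is_finite() for v in values):
        raise ValueError("NaN, INF and -INF are not allowed")
    # Fractions keep the division exact, so no rounding can change the count.
    m1, m2, m3 = (Fraction(v) for v in values)
    if m3 == 0:
        raise ValueError("zero increment is not allowed")
    return max(floor((m2 - m1 + m3) / m3), 0)

print(trip_count("0.0", "0.3", "0.1"))
print(trip_count("1", "0.5", "-0.25"))
print(trip_count("1", "0", "0.1"))
```

Because the count is known before the loop runs, an implementation could also
generate each value as start + i * increment rather than by repeated addition.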

I hope this is clear. I hope it helps.

Stephen

PS I noticed some other errors while re-reading the spec:
(i) The mention of xsd:decimal in section 4.3 is spurious, misleading
and actually incorrect.
(ii) There's a lot of jumping over "lay dogs" in the examples, where
"lazy dogs" is surely intended.