[DFDL-WG] issue: scannable and 'results are not predictable'

Mike Beckerle mbeckerle.dfdl at gmail.com
Wed Jul 10 13:22:47 EDT 2013


I was adding the definition of 'scannable' to the glossary, and when I
looked at how 'scannable' is used under testPattern I found this:

In the box for testPattern it says if the data is not scannable "the
results are not predictable".

Is that sufficient?

We can sometimes statically determine that the schema says the data should
all be scannable (e.g., no change of encoding, no binary elements), and
that would rule out one source of unpredictability.  So, if the data is
non-scannable in the sense that the schema contains, say, a binary element,
we can issue an SDE if lengthKind is pattern or a testPattern assert is
being used.

We could also SDE if runtime-valued encoding properties are used and the
encoding changes inside a scannable context.
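
A minimal sketch of such a static check, in Python. The Component model,
its field names, and the SDE message text are all hypothetical; this is
not the API of any DFDL implementation. It treats a component as scannable
only if it and all of its descendants are text in one statically known
encoding (encoding names are compared as plain strings).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Component:
    """Hypothetical stand-in for a schema component (assumption)."""
    name: str
    representation: str = "text"        # "text" or "binary"
    encoding: Optional[str] = "utf-8"   # None models a runtime-valued encoding
    length_kind: str = "delimited"
    children: List["Component"] = field(default_factory=list)


def is_scannable(c: Component, encoding: Optional[str] = None) -> bool:
    """True if c and all descendants are text in one static encoding."""
    if c.representation != "text" or c.encoding is None:
        return False
    if encoding is not None and c.encoding != encoding:
        return False  # encoding changes inside the scannable context
    return all(is_scannable(k, c.encoding) for k in c.children)


def schema_definition_errors(c: Component) -> List[str]:
    """Collect an SDE for every lengthKind 'pattern' component that is not scannable."""
    errors = []
    if c.length_kind == "pattern" and not is_scannable(c):
        errors.append(
            f"SDE: {c.name} has lengthKind 'pattern' but is not scannable"
        )
    for k in c.children:
        errors.extend(schema_definition_errors(k))
    return errors


# A pattern-length record that contains a binary child is flagged.
record = Component(
    "record",
    length_kind="pattern",
    children=[Component("id"), Component("blob", representation="binary")],
)
print(schema_definition_errors(record))
```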

Well, I guess testKind pattern asserts/discriminators are an issue because
they may look only at the first part of the data of a complex component, so
they don't require everything to be scannable, only the part the regex
actually examines. So in this case it's user-beware, and if the component
is non-scannable I suppose we could issue a warning.
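
To illustrate the point about the regex examining only part of the data,
here is a small Python sketch. The sample bytes and both patterns are
invented for illustration, and the two decoder error policies are Python's,
not anything the DFDL spec defines. A pattern confined to the text header
gives the same answer either way; a pattern that reaches into the binary
region depends on how the undecodable bytes happen to be handled.

```python
import re

# Text header, then two bytes that are not valid UTF-8, then more text.
data = b"HDR:0042" + b"\xff\xfe" + b"END"

# Two possible ways a decoder might treat the undecodable bytes.
replaced = data.decode("utf-8", errors="replace")  # substitutes U+FFFD
ignored = data.decode("utf-8", errors="ignore")    # drops the bytes

header_only = re.compile(r"HDR:\d{4}")
past_header = re.compile(r"HDR:\d{4}END")

# The header-only pattern never touches the binary region.
header_results = [
    header_only.match(replaced).group(),
    header_only.match(ignored).group(),
]

# The second pattern's outcome depends on the decoder policy.
past_results = [
    past_header.match(replaced) is not None,
    past_header.match(ignored) is not None,
]

print(header_results, past_results)
```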

But the spec does not currently say this is an SDE or a warning. It just
says results are not predictable.

There is also the fact that the data might be broken, i.e., the schema
might say the data is scannable, but at parse time character decode errors
occur.  I believe our policy on this is that these cause processing errors.
This really is orthogonal to scannability, which is a property of a schema
component.
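
A short Python sketch of that distinction, assuming a hypothetical
ProcessingError type and decode_text helper (neither is from the spec or
from any implementation). The schema-level property is unaffected; the
failure comes purely from the bytes seen at parse time.

```python
class ProcessingError(Exception):
    """Hypothetical parse-time error type (assumption)."""


def decode_text(data: bytes, encoding: str = "utf-8") -> str:
    """Strictly decode data, turning a decode failure into a processing error."""
    try:
        return data.decode(encoding)  # strict mode: raises on malformed input
    except UnicodeDecodeError as e:
        raise ProcessingError(
            f"character decode error at byte offset {e.start}"
        ) from e


# Well-formed data decodes normally.
good = decode_text(b"abcdef")

# Broken data fails at parse time, whatever the schema says about scannability.
try:
    decode_text(b"abc\xffdef")
    message = None
except ProcessingError as err:
    message = str(err)

print(good, message)
```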

Comments?

-- 
Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology |
www.tresys.com