[DFDL-WG] Bob & Steph's WTX 'Discriminators' write-up
Stephanie Fetzer
sfetzer at us.ibm.com
Wed Jan 13 06:00:08 CST 2010
All:
Here are the WTX identification/discrimination behaviours 'on-paper'
written with the help of Bob Connolly our WTX core engine expert. I had
attempted to create a single description of the concept but ended up
needing to split the core behaviour from the modelling options in order to
get some of the wrinkles discussed.
The attempt here is to share information on the use of this concept within
WTX with the (outside of IBM) DFDL WG in such a way that it does not go
into the specifics of the WTX implementation.
___________________________________________________________________________________________________
WTX use of IDENTIFIERS for distinguishing data
This is a brief description of the use of WTX identifiers to describe
limitations that could be imposed on the implementation of DFDL
discriminators. We will attempt to word this in terms that are not
specific to WTX and are a bit more XML-centric than we would normally
describe WTX processing. This may dilute some of the specificity of the
descriptions – but the spirit of the concepts is the important thing here.
WHAT IS A DISCRIMINATOR?:
Discriminators allow us to evaluate a data component and determine if it
'is known to exist'. The distinction that we are making is that there are
situations during parsing when we need to determine whether some data is
an 'invalid instance' of a data component as opposed to 'not that data
component'.
This concept has multiple advantages: it lets us be more specific when
modeling our data than we could be without the concept, and it allows us
to terminate one branch of speculative parsing quicker than we might
otherwise be able to without the concept.
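To make the second advantage concrete, here is a minimal sketch in Python
of speculating over alternative groups. It is an illustration only, not
how the WTX engine is written: `resolve`, the `HDR`/`TRL` record layouts
and the two branch parsers are all hypothetical. Each branch parser
reports two things, whether the group is 'known to exist' and whether it
is valid, and the loop stops at the first branch that is known to exist
even if that branch is invalid.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical branch parser: returns (known_to_exist, valid).
BranchParser = Callable[[str], Tuple[bool, bool]]


def resolve(data: str, branches: Dict[str, BranchParser]) -> Tuple[Optional[str], bool]:
    """Try each branch in order; stop at the first one known to exist."""
    for name, parse in branches.items():
        exists, valid = parse(data)
        if exists:
            # The data is this component (valid or not), so the
            # remaining branches are never tried.
            return name, valid
    # No branch claimed the data.
    return None, False


def parse_header(data: str) -> Tuple[bool, bool]:
    # Assumed layout: "HDR" followed by exactly five more characters.
    exists = data.startswith("HDR")
    return exists, exists and len(data) == 8


def parse_trailer(data: str) -> Tuple[bool, bool]:
    # Assumed layout: "TRL" followed by digits only.
    exists = data.startswith("TRL")
    return exists, exists and data[3:].isdigit()


BRANCHES = {"header": parse_header, "trailer": parse_trailer}
```

With these assumed layouts, `resolve("HDR1", BRANCHES)` reports an invalid
header rather than falling through to the trailer branch, which is the
'invalid instance' versus 'not that data component' distinction.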
HOW ARE DISCRIMINATORS USED IN WTX CORE:
The core of WTX uses discriminators in the parsing of data in the
following ways:
1-We are parsing a group which is identifiable in some other way – for
example, it is a group partitioned by initiator. In this case we
functionally have a choice between two or more groups. So the decision
here is which path or partition/choice group (if any) will be found to
exist.
If we don’t find the initiator in the data then this partition/choice
group is not ‘known to exist’.
If we find the initiator in the data and...
-If we find the initiator in the data and we do not have a discriminator
on a component of the partition/choice group then this type is ‘known to
exist’
-If we find the initiator in the data and
if we have a discriminator set on a component of the partition group and
if the rule on the discriminator and all rules on earlier components
evaluate to true then this group is 'known to exist'
Note: There is no requirement in WTX that the component on which the
identifier attribute is set has to have a specified rule - all mandatory
instances of components have an implied rule of PRESENT($) in WTX. In
this document when I say that the rule evaluates to true there is really
more to this concept - we could say that everything up to that point is
'valid' meaning that all rules and cardinality constraints and other
facets enforced (restrictions, size, presentation checks) pass. In WTX
the lines between validation and parsing are blurred so these distinctions
may be a bit different in a DFDL implementation. But we will at minimum
need to word this in such a way that we can specify not only that the
rules evaluate to true but also that the cardinality checks pass. In DFDL we
may want to make having a rule a requirement if we choose to use the
currently documented discriminator construct.
-If we find the initiator in the data and
if we have a discriminator set on a component of the partition group and
if the rule on the discriminator evaluates to false or if any rule on a
component prior to the component carrying the discriminator evaluates to
false then this group is ‘known to not exist’
2-We are evaluating a group that has a discriminator and have determined
that something is wrong with the group before we have found the component
carrying the discriminator (or evaluated its rule) - so we say that it is
‘known to not exist’.
3-A rule on a component in the group has evaluated to false and that rule
is on or above the component with the discriminator. This group is ‘known
to not exist’.
Note: this is similar to number 2, but number 2 covers other reasons for
failure, such as a missing mandatory component.
4-A rule on a component in the group has evaluated to true and that rule
is on the component with the discriminator. This group is ‘known to
exist’.
Note: as we are processing the component with the discriminator, the
assumption is that we would not be processing this rule unless all
preceding occurrence checks and rule checks had evaluated to true.
5-A rule on a component in the group has evaluated to false and that rule
is after the component with the discriminator. The rules on the component
with the discriminator and above rules evaluated to true. This group is
‘known to exist’ but is invalid.
6-All rules on all components in the group have evaluated to true
including the rule on the component with the discriminator. This group is
‘known to exist’ and is valid.
7-All rules on all components in the group have evaluated to true – there
is no component with a discriminator. This group is ‘known to exist’ and
is valid. Without the identifier we process the group to the end (last
component) before determining that it is ‘known to exist’.
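The cases above can be summarised in a short sketch. This is Python
written purely for illustration and is not WTX engine code: `Outcome` and
`classify_group` are made-up names, and `checks[i]` stands for 'every
rule, cardinality and other check on component i passed'. Two points are
assumptions rather than statements from the cases above: 'not known to
exist' and 'known to not exist' are folded into one outcome, and a
failing group with neither initiator nor discriminator is treated as not
existing.

```python
from enum import Enum
from typing import Optional, Sequence


class Outcome(Enum):
    NOT_EXIST = "known to not exist"
    EXISTS_INVALID = "known to exist, invalid"
    EXISTS_VALID = "known to exist, valid"


def classify_group(
    checks: Sequence[bool],
    discriminator_index: Optional[int] = None,
    initiator_found: Optional[bool] = None,  # None: group has no initiator
) -> Outcome:
    # Case 1: the initiator was expected but not found in the data.
    if initiator_found is False:
        return Outcome.NOT_EXIST

    # Position of the first component whose checks failed, if any.
    first_failure = next((i for i, ok in enumerate(checks) if not ok), None)

    # Cases 6 and 7: everything passed, with or without a discriminator.
    if first_failure is None:
        return Outcome.EXISTS_VALID

    if discriminator_index is not None:
        # Cases 2 and 3: failure on or above the discriminator.
        if first_failure <= discriminator_index:
            return Outcome.NOT_EXIST
        # Case 5: failure after the discriminator.
        return Outcome.EXISTS_INVALID

    # No discriminator. An initiator that was found identifies the group
    # by itself (case 1), so a later failure makes it invalid.
    if initiator_found:
        return Outcome.EXISTS_INVALID
    # Assumption: with nothing to identify the group, a failure means it
    # is never determined to exist.
    return Outcome.NOT_EXIST
```

For example, `classify_group([True, True, False], discriminator_index=1)`
gives 'known to exist, invalid' (case 5), while the same failure one
component earlier gives 'known to not exist' (case 3).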
HOW ARE DISCRIMINATORS IMPLEMENTED IN WTX MODELS?:
WTX imposes many limitations on how identifiers may be expressed in the
model.
-Discriminators may only be placed on the physical representation of a
group. That is why we see them on partition groups and sequence groups
but not on choice groups (or unordered groups – covered below).
In partitioned groups we have a subtype of each possible group – so each
possible group may have a discriminator.
When WTX expresses choice groups it expresses them as a group containing
all of the possible child groups – so at the top level ‘choice group’
there is no component of the actual group content- so no use for a
discriminator. But each choice which may itself be a group may have a
discriminator. Choice groups are special in that the choice model
construct simply lists the components and only one may occur...at this
level a discriminator on one of the choices may not be very useful. Inside
of each choice’s components a discriminator could be used to indicate the
existence of that choice.
-The WTX UI does not allow discriminators on the components of Unordered
Groups. This may be due to the fact that the position of the
discriminator has significance (all rules at or above the discriminator
must evaluate to true). If the group is unordered it would be difficult
to enforce. Will need to discuss for DFDL.
-A group may have either zero or one discriminators. No group may have
more than one discriminator.
-The discriminator may have two significant parts
o its location (mandatory). The discriminator is placed on a
component of a group and makes all of the cardinality checks and rules at
that point and above part of its concept.
o its rule (optional)
A group with a component which has a discriminator should have some ‘rule’
associated with it. In WTX if there is no explicit rule then the implicit
rule is ‘PRESENT($)’. We will need to decide if such implied rules will be
allowed in DFDL.
-A group may only have a discriminator on a mandatory component. Once
again, this impacts a choice group where by definition all components are
optional – which will not have a discriminator.
This has been an issue of debate in WTX. We could have implemented
checking on optional elements quite easily. Over the years this has been
questioned (as our UI allows them to be placed on optional elements), but
once we explained the way the engine worked no customers perceived this as
a deficiency. In DFDL we will need to determine if this is needed.
-In WTX we do allow a discriminator to be placed on a mandatory fixed size
array (a repeating mandatory component with n:n cardinality). Its
component rule can either refer to the entirety of the array (PRESENT($)
meaning the whole of the array is present) or can call out a specific rule
against one of the iterations. This is not done often in practice.
-In WTX it is common to have multiple levels of discriminators when we are
working with nested groups.
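
The placement limitations listed above could be checked mechanically. The
following Python sketch shows one way to do so; it is an illustration
under stated assumptions, not the WTX type checker. `Component`, `Group`
and `check_discriminator_placement` are hypothetical names, the group
kinds are plain strings, and 'mandatory' is approximated as a minimum
occurrence count of at least one.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Component:
    name: str
    min_occurs: int = 1          # 0 means the component is optional
    discriminator: bool = False  # True if the identifier is set here


@dataclass
class Group:
    kind: str  # assumed values: "sequence", "partition", "choice", "unordered"
    components: List[Component] = field(default_factory=list)


def check_discriminator_placement(group: Group) -> List[str]:
    """Return a list of problems; an empty list means the placement is allowed."""
    problems = []
    marked = [c for c in group.components if c.discriminator]

    # Not allowed directly on the components of choice or unordered groups.
    if marked and group.kind in ("choice", "unordered"):
        problems.append(f"{group.kind} group may not carry a discriminator")

    # A group has zero or one discriminators.
    if len(marked) > 1:
        problems.append("more than one discriminator in the group")

    # The discriminator must sit on a mandatory component.
    for c in marked:
        if c.min_occurs < 1:
            problems.append(f"discriminator on optional component {c.name}")

    return problems
```

Nested groups would each be checked separately, which matches the
practice of having a discriminator at more than one level.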
Stephanie Fetzer
WebSphere Common Transformation
Industry Packs - Software Engineer