[DFDL-WG] Bob & Steph's WTX 'Discriminators' write-up

Stephanie Fetzer sfetzer at us.ibm.com
Wed Jan 13 06:00:08 CST 2010


All:

Here are the WTX identification/discrimination behaviours 'on paper', 
written with the help of Bob Connolly, our WTX core engine expert.  I had 
attempted to create a single description of the concept but ended up 
needing to split the core behaviour from the modelling options in order to 
get some of the wrinkles discussed. 

The attempt here is to share information on the use of this concept within 
WTX with the (outside of IBM) DFDL WG in such a way that it does not go 
into the specifics of the WTX implementation.

___________________________________________________________________________________________________



WTX use of IDENTIFIERS for distinguishing data

This is a brief description of the use of WTX identifiers to describe 
limitations that could be imposed on the implementation of DFDL 
discriminators.  We will attempt to word this in terms that are not 
specific to WTX and are a bit more XML-centric than we would normally 
describe WTX processing.  This may dilute some of the specificity of the 
descriptions – but the spirit of the concepts is the important thing here.

WHAT IS A DISCRIMINATOR?:
Discriminators allow us to evaluate a data component and determine if it 
'is known to exist'.  The distinction that we are making is that there are 
situations during parsing when we need to determine whether some data is 
an 'invalid instance' of one data component as opposed to 'not that data 
component'. 

This concept has multiple advantages: it lets us be more specific when 
modeling our data than we could be without it, and it allows us to 
terminate a branch of speculative parsing sooner than we otherwise could.
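
As a rough illustration of the distinction above (not WTX code), the 
sketch below resolves a choice between two hypothetical record types.  The 
names Outcome, parse_order, parse_note and resolve_choice, and the "ORD" 
tag, are all assumptions made up for this example.  Once a branch is known 
to exist, the parser commits to it even if the instance is invalid, rather 
than speculatively trying the remaining branches.

```python
from enum import Enum


class Outcome(Enum):
    NOT_THIS = "not that data component"
    INVALID = "known to exist, but an invalid instance"
    VALID = "known to exist and valid"


def parse_order(data):
    # Hypothetical record: the tag "ORD" followed by a numeric quantity.
    if not data.startswith("ORD"):
        return Outcome.NOT_THIS
    return Outcome.VALID if data[3:].isdigit() else Outcome.INVALID


def parse_note(data):
    # Hypothetical fallback record: any non-empty text.
    return Outcome.VALID if data else Outcome.NOT_THIS


def resolve_choice(branches, data):
    """Try each branch in order and stop at the first one known to exist."""
    for name, parse in branches:
        outcome = parse(data)
        if outcome is not Outcome.NOT_THIS:
            # Known to exist: commit to this branch, valid or not,
            # instead of backtracking into the remaining branches.
            return name, outcome
    return None, Outcome.NOT_THIS


BRANCHES = [("order", parse_order), ("note", parse_note)]
```

With these assumed branches, "ORDxx" is reported as an invalid order 
rather than being silently accepted as a note.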

HOW ARE DISCRIMINATORS USED IN WTX CORE:
The core of WTX uses discriminators in the parsing of data in the 
following ways:

1-We are parsing a group which is identifiable in some other way – for 
example, it is a group partitioned by initiator. In this case we 
functionally have a choice between two or more groups.  So the decision 
here is which path or partition/choice group (if any) will be found to 
exist.

If we don’t find the initiator in the data then this partition/choice 
group is not ‘known to exist’.
If we find the initiator in the data, one of three sub-cases applies:
-If we find the initiator in the data and we do not have a discriminator 
on a component of the partition/choice group then this type is ‘known to 
exist’ 

-If we find the initiator in the data and 
  if we have a discriminator set on a component of the partition group and
  if the rule on the discriminator and all rules on earlier components 
evaluate to true then this group is 'known to exist'
Note: There is no requirement in WTX that the component on which the 
identifier attribute is set has to have a specified rule - all mandatory 
instances of components have an implied rule of PRESENT($) in WTX.  In 
this document when I say that the rule evaluates to true there is really 
more to this concept - we could say that everything up to that point is 
'valid' meaning that all rules and cardinality constraints and other 
facets enforced (restrictions, size, presentation checks) pass.  In WTX 
the lines between validation and parsing are blurred so these distinctions 
may be a bit different in a DFDL implementation.  But we will at minimum 
need to work this in such a way that we can specify that not only do the 
rules evaluate to true but the cardinality checks also pass.  In DFDL we 
may want to make having a rule a requirement if we choose to use the 
currently documented discriminator construct.

-If we find the initiator in the data and 
  if we have a discriminator set on a component of the partition group and 
  if the rule on the discriminator evaluates to false or if any rule on a 
component prior to the component carrying the discriminator evaluates to 
false then this group is ‘known to not exist’
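
The sub-cases of number 1 can be sketched as a small decision function.  
This is only an illustration under stated assumptions: the names Existence 
and partition_existence are hypothetical, and each entry in the list 
stands for "the rule and other checks on that component passed", for the 
components up to and including the one carrying the discriminator.

```python
from enum import Enum


class Existence(Enum):
    KNOWN_TO_EXIST = "known to exist"
    NOT_KNOWN_TO_EXIST = "not known to exist"
    KNOWN_TO_NOT_EXIST = "known to not exist"


def partition_existence(initiator_found, checks_through_discriminator=None):
    """Decide the status of one partition/choice group.

    checks_through_discriminator is None when no component carries a
    discriminator; otherwise it is a list of booleans, one per component
    up to and including the discriminator component.
    """
    if not initiator_found:
        return Existence.NOT_KNOWN_TO_EXIST
    if checks_through_discriminator is None:
        # Initiator found and no discriminator: the initiator is enough.
        return Existence.KNOWN_TO_EXIST
    if all(checks_through_discriminator):
        return Existence.KNOWN_TO_EXIST
    return Existence.KNOWN_TO_NOT_EXIST
```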



2-We are evaluating a group and have determined that something is wrong 
with it, and the group has a discriminator that we have not yet reached 
(or whose rule we have not yet evaluated) - so we say that it is ‘known 
to not exist’. 

3-A rule on a component in the group has evaluated to false and that rule 
is on or above the component with the discriminator.  This group is ‘known 
to not exist’.
Note: this is similar to number 2, but number 2 covers other reasons for 
failure, such as a missing mandatory component.

4-A rule on a component in the group has evaluated to true and that rule 
is on the component with the discriminator.  This group is ‘known to 
exist’.
Note: as we are processing the component with the discriminator, the 
assumption is that we would not be processing this rule unless all 
preceding occurrence checks and rule checks had evaluated to true.

5-A rule on a component in the group has evaluated to false and that rule 
is after the component with the discriminator. The rules on the component 
with the discriminator and on the components above it evaluated to true.  
This group is ‘known to exist’ but is invalid.

6-All rules on all components in the group have evaluated to true 
including the rule on the component with the discriminator.  This group is 
‘known to exist’ and is valid.

7-All rules on all components in the group have evaluated to true – there 
is no component with a discriminator.  This group is ‘known to exist’ and 
is valid. Without the identifier we process the group to the end (last 
component) before determining that it is ‘known to exist’.
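
Cases 2 through 7 can be summarised in one small function.  This is a 
sketch rather than the engine's logic: classify_group is a hypothetical 
name, and each boolean stands for "all rules, cardinality and other checks 
on that component passed" (so a missing mandatory component is simply 
False).  The result for a failing group with no discriminator is an 
assumption read from the last sentence of number 7.

```python
def classify_group(results, discriminator_index=None):
    """Return (known_to_exist, valid) for one group.

    results: ordered list of booleans, one per component of the group.
    discriminator_index: position of the component carrying the
    discriminator, or None if the group has no discriminator.
    """
    valid = all(results)
    if discriminator_index is None:
        # Case 7: the whole group must be processed before it is known
        # to exist. Treating a failure here as "not known to exist" is
        # an assumption of this sketch.
        return valid, valid
    # Cases 2, 3 and 4: every check at or above the discriminator
    # decides whether the group is known to exist.
    exists = all(results[: discriminator_index + 1])
    # Cases 5 and 6: a later failure makes the group invalid, but it
    # is still known to exist.
    return exists, exists and valid
```

For example, classify_group([True, True, False], discriminator_index=1) 
gives (True, False): the group is known to exist but is invalid, as in 
number 5.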




HOW ARE DISCRIMINATORS IMPLEMENTED IN WTX MODELS?:

WTX imposes many limitations on the expression of identifiers in the 
model.

-Discriminators may only be placed on the physical representation of a 
group.  That is why we see them on partition groups and sequence groups 
but not on choice groups (or unordered groups – covered below). 

In partitioned groups we have a subtype of each possible group – so each 
possible group may have a discriminator. 

When WTX expresses choice groups it expresses them as a group containing 
all of the possible child groups, so at the top level ‘choice group’ 
there is no component of the actual group content and therefore no use 
for a discriminator. But each choice, which may itself be a group, may 
have a discriminator.  Choice groups are special in that the choice model 
construct simply lists the components and only one may occur; at this 
level a discriminator on one of the choices may not be very useful. Inside 
of each choice’s components a discriminator could be used to indicate the 
existence of that choice.

-The WTX UI does not allow discriminators on the components of Unordered 
Groups.  This may be due to the fact that the position of the 
discriminator has significance (all rules at or above the discriminator 
must evaluate to true).  If the group is unordered it would be difficult 
to enforce.  Will need to discuss for DFDL.

-A group may have either zero or one discriminators.  No group may have 
more than one discriminator.

-The discriminator may have two significant parts
o       its location (mandatory).  The discriminator is placed on a 
component of a group and makes all of the cardinality and rules at that 
point and above become part of its concept.
o       its rule (optional)
A group with a component which has a discriminator should have some ‘rule’ 
associated with it. In WTX if there is no explicit rule then the implicit 
rule is ‘PRESENT($)’. We will need to decide if such implied rules will be 
allowed in DFDL.


-A group may only have a discriminator on a mandatory component. Once 
again, this impacts a choice group where by definition all components are 
optional – which will not have a discriminator. 

This has been an issue of debate in WTX. We could have implemented 
checking on optional elements quite easily.  Over the years this has been 
questioned (as our UI allows them to be placed on optional elements) but 
once we explained the way the engine worked no customers perceived this as 
a deficiency.  In DFDL we will need to determine if this is needed. 

-In WTX we do allow a discriminator to be placed on a mandatory fixed size 
array (a repeating mandatory component with n:n cardinality).  Its 
component rule can either refer to the entirety of the array (PRESENT($) 
meaning the whole of the array is present) or can call out a specific rule 
against one of the iterations.  This is not done often in practice.

-In WTX it is common to have multiple levels of discriminators when we are 
working with nested groups.
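
To make the modelling limitations above concrete, here is a minimal 
sketch of a model checker.  It is not how WTX represents type trees: the 
Component and Group classes, the kind strings and the function names are 
all assumptions for illustration.  Only the limitations themselves and the 
implicit PRESENT($) rule come from the description above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Component:
    name: str
    min_occurs: int = 1          # 0 means the component is optional
    discriminator: bool = False
    rule: Optional[str] = None   # explicit component rule, if any


@dataclass
class Group:
    name: str
    kind: str                    # "sequence", "partition", "choice", "unordered"
    components: List[Component] = field(default_factory=list)


def effective_rule(component):
    # With no explicit rule, the implied rule is PRESENT($).
    return component.rule if component.rule is not None else "PRESENT($)"


def check_group(group):
    """Return a list of violations of the limitations described above."""
    errors = []
    marked = [c for c in group.components if c.discriminator]
    if len(marked) > 1:
        errors.append("more than one discriminator in group " + group.name)
    if marked and group.kind in ("choice", "unordered"):
        errors.append("discriminator not allowed in " + group.kind + " group")
    for c in marked:
        if c.min_occurs < 1:
            errors.append("discriminator on optional component " + c.name)
    return errors
```

Nested groups would each be checked separately, since each level may 
carry its own discriminator.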

Stephanie Fetzer
WebSphere Common Transformation
Industry Packs - Software Engineer



