[DFDL-WG] Generic Schema-based transformation language: example based on DFDL capabilities

Beckerle, Mike mbeckerle at owlcyberdefense.com
Thu Jun 17 12:31:08 EDT 2021


DFDL isn't normally thought of as a data transformation language.

Yet it has computed elements, hidden groups, and expressions that can refer to array elements by their index position within the current array (via the dfdl:occursIndex() function).

Together, these give it substantial data transformation capabilities.

For a while I have said that I bet one can invert a matrix in DFDL.

So I finally created an example that does so.  It takes a representation of data as a pair of lists and creates a logical infoset that is a list of pairs.

By keeping the infoset of the pair of lists in a DFDL hidden group, this DFDL schema effectively parses the file of data, and transforms one infoset into another that is entirely different in shape.
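
To make the change of shape concrete outside of DFDL, here is a minimal Python sketch of the same idea. It is not taken from the linked schema: the function name pairs_from_lists and the keys "a", "b", "first" and "second" are hypothetical, and the index-based lookup only loosely mirrors what an expression using dfdl:occursIndex() does (Python indexes from 0, whereas DFDL array positions start at 1).

```python
def pairs_from_lists(hidden):
    """Build a list of pairs from a 'hidden' pair of equal-length lists.

    'hidden' stands in for the infoset kept inside the hidden group;
    the returned list stands in for the logical infoset.
    """
    firsts, seconds = hidden["a"], hidden["b"]
    if len(firsts) != len(seconds):
        raise ValueError("both lists must have the same length")
    # Each output pair is computed by index from the two hidden lists.
    return [
        {"first": firsts[i], "second": seconds[i]}
        for i in range(len(firsts))
    ]


# Hypothetical sample data, not the data from the example repository.
hidden_infoset = {"a": [1, 2, 3], "b": [10, 20, 30]}
logical_infoset = pairs_from_lists(hidden_infoset)
print(logical_infoset)
```

The point of the sketch is only that the output elements are computed from the hidden ones by position; in the DFDL version that computation is expressed declaratively in the schema rather than in a procedural loop.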

https://github.com/OpenDFDL/examples/tree/master/pairsTransform

The conclusion of this little experiment is that while this is possible for parsing, you can't invert the process perfectly in unparsing due to some DFDL v1.0 restrictions.

Specifically, when dfdl:occursCountKind is 'expression', the expression is NOT evaluated at unparse time (DFDL Spec 16.1.4). This prevents the example from allocating the representation infoset arrays inside a hidden group at unparse time.
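
As a rough illustration of what the inverse direction needs, here is a small Python sketch. The names (lists_from_pairs and the keys "a", "b", "first", "second") are hypothetical, and this is not how Daffodil or any DFDL processor works internally; it only shows that rebuilding the two lists requires first computing their length from the list of pairs, which is the step an unparse-time occurs count expression would have to supply.

```python
def lists_from_pairs(pairs):
    """Rebuild the pair of lists from a list of pairs (the inverse shape change)."""
    # The size of both hidden arrays has to be computed from the logical
    # data before they can be filled in. This is the analogue of evaluating
    # an occurs count expression at unparse time.
    count = len(pairs)
    firsts = [None] * count
    seconds = [None] * count
    for i, pair in enumerate(pairs):
        firsts[i] = pair["first"]
        seconds[i] = pair["second"]
    return {"a": firsts, "b": seconds}


# Hypothetical sample data.
pairs_infoset = [
    {"first": 1, "second": 10},
    {"first": 2, "second": 20},
]
rebuilt = lists_from_pairs(pairs_infoset)
print(rebuilt)
```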

Previously, we had identified that the lack of unparse-time evaluation of dfdl:choiceDispatchKey is a similar limitation on unparsing things in hidden groups; this issue with arrays is directly analogous.

I may do experiments in Daffodil to lift those restrictions just to play around with the idea.

The concept here is not that DFDL should become a better transformation language.

Rather, it is about "Schema-based Transformation" generally. That is to say, there is nothing in these transformation capabilities in DFDL that is about the format of data, per se, and no reason they cannot be lifted out of DFDL and used, for example, in ordinary XML schema to express XML transformations.

This is quite interesting given that typical XML transformation languages such as XSLT and XQuery are both template/instance-document based, not schema based.


Mike Beckerle | Principal Engineer

mbeckerle at owlcyberdefense.com

P +1-781-330-0412
