[DFDL-WG] could you clarify statement made on the call today

Steve Hanson smh at uk.ibm.com
Wed Jul 24 07:24:01 EDT 2013


Jonathan

Yes, that's the principle, but it goes further than that. The DFDL infoset 
is typed, whereas the XML infoset isn't. The result of parsing a 
DFDL-described document and then applying DFDL validation to the resultant 
DFDL infoset is the same as the result of parsing the equivalent XML 
document and applying XML Schema 1.0 validation, where 'result' means 'the 
validation errors that are detected'.
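
As a rough illustration of that principle only, here is a toy Python 
sketch. It is not a DFDL processor or an XML Schema validator: the 'age' 
element, the MAX_AGE facet and all function names are hypothetical, chosen 
just to show a typed item and its untyped XML equivalent yielding the same 
validation errors.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List

MAX_AGE = 150  # hypothetical maxInclusive facet on an xs:int element 'age'


@dataclass
class TypedElement:
    """Toy stand-in for a DFDL infoset element: the value is already typed."""
    name: str
    value: int


def check_facet(name: str, value: int) -> List[str]:
    """Return the validation errors for one value (shared facet check)."""
    return [f"{name}: exceeds maxInclusive {MAX_AGE}"] if value > MAX_AGE else []


def validate_typed(element: TypedElement) -> List[str]:
    """Validate a typed item directly; no text-to-value conversion is needed."""
    return check_facet(element.name, element.value)


def validate_xml(document: str) -> List[str]:
    """Validate the equivalent XML; untyped text is converted to int first."""
    root = ET.fromstring(document)
    return check_facet(root.tag, int(root.text))


typed_errors = validate_typed(TypedElement("age", 200))
xml_errors = validate_xml("<age>200</age>")
print(typed_errors == xml_errors)
```

Both paths report the same single facet violation; the only difference is 
that the XML path has to convert text to a value before it can check it.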

This principle has guided the WG in the design of several features. Hiding 
elements from the DFDL infoset requires the use of a 'hidden group'; 
simply doing the obvious thing and adding a dfdl:hidden property to an 
element would break the principle. Getting an assert to fail without 
throwing a processing error meant inventing recoverable errors; re-using 
validation errors would break the principle.

For your inspection and sanitization capability, I would recommend looking 
at XDM, the model used by XPath 2.0, XSLT 2.0 and XQuery. I think this is 
the natural higher-level model to adopt for a common DFDL and XML 
framework. I created this OGF document to describe how to map the DFDL 
infoset to/from XDM: http://redmine.ogf.org/dmsf_files/8111?download=

Regards

Steve Hanson
Architect, IBM Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh at uk.ibm.com
tel:+44-1962-815848



From:   "Cranford, Jonathan W." <jcranford at mitre.org>
To:     Steve Hanson/UK/IBM at IBMGB, 
Cc:     "dfdl-wg at ogf.org" <dfdl-wg at ogf.org>
Date:   23/07/2013 18:23
Subject:        could you clarify statement made on the call today



Steve,

Could you clarify a statement made during today's DFDL WG call?

I didn't quite catch the whole statement, but it sounded like you were 
saying a design goal of the WG was that the result of parsing a binary 
format using DFDL would result in a DFDL infoset roughly equivalent to the 
XML infoset obtained by parsing the same data in an XML format.  I don't 
think I quite captured that correctly, but it sounds like an important 
point, and I'd like to understand it further.

For context, I've been asked to look at building an inspection and 
sanitization capability on top of DFDL, so I'm weighing the differences 
between DFDL Infoset and XML Infoset at the moment, and your comparison 
caught my attention.

Thanks in advance,

--
Jonathan W. Cranford 
Senior Information Systems Engineer
The MITRE Corporation (http://www.mitre.org)




Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 
741598. 
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU