[DFDL-WG] Unparsing Asterix format - challenges for DFDL

Mike Beckerle mbeckerle at apache.org
Mon Apr 24 09:03:50 PDT 2023


Asterix is a Eurocontrol aviation-related messaging data format.

We are trying to create a DFDL schema for Asterix.

Asterix messages begin with a field-spec, or fspec. This is a
variable-length sequence of presence-flag bytes.

The fspec flags are a vector of bytes. Each byte contains 7
presence-indicator flag bits, plus an FX bit indicating whether another
flag byte follows.
After the flag bytes come the fields themselves, each of which, if present,
is fixed length. The fields are commonly numbers of various fixed sizes
(int, float, byte, etc.)
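
To make the layout concrete, here is a minimal Python sketch of reading an
fspec. It is an illustration only, not part of any schema. It assumes the
seven presence flags occupy the most significant bits of each byte, first
flag first, and that FX is the least significant bit. The function name
parse_fspec is made up for this example.

```python
def parse_fspec(data):
    """Return (presence_flags, number_of_fspec_bytes_consumed).

    Assumption: bits 7..1 of each byte (MSB first) are presence flags,
    and bit 0 (LSB) is the FX bit.
    """
    flags = []
    for i, byte in enumerate(data):
        # Collect the seven presence flags, most significant bit first.
        flags.extend(bool((byte >> shift) & 1) for shift in range(7, 0, -1))
        # FX == 0 means this was the last flag byte.
        if not (byte & 0x01):
            return flags, i + 1
    raise ValueError("fspec is empty or its last byte has FX set")
```

For example, parse_fspec(bytes([0x81, 0x40])) reports two flag bytes, with
the flags for the first and ninth fields set.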

The challenge is this. When unparsing, a given flag byte is needed only if
either:

   1. a field whose presence bit is in that flag byte is present in the
   message, or
   2. a subsequent flag byte is needed. In that case the presence flags in
   this byte may all be 0, but its FX bit will be 1 to indicate that there
   is a subsequent flag byte.

As an example, imagine a message has from 1 to 3 flag bytes at the start,
allowing up to 21 optional fields to follow.
Flag byte 2 is needed if any of fields 8-14 exists, or if flag byte 3
exists.
Flag byte 2's FX bit is 1 if flag byte 3 exists.
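
The rule above can be sketched outside DFDL as a small Python function.
This is only an illustration under assumptions: presence flags are taken
to be the high 7 bits of each byte with FX as the low bit, and an empty
field set is rejected because the last flag byte may not be all zeros.
The name build_fspec is hypothetical.

```python
def build_fspec(present, max_bytes=3):
    """Build fspec bytes from a set of 1-based present field numbers.

    Assumptions: flags are MSB first, FX is the LSB, and at least one
    field must be present.
    """
    if not present:
        raise ValueError("at least one field must be present")
    if any(n < 1 or n > 7 * max_bytes for n in present):
        raise ValueError("field number out of range")
    # Zero-based index of the last flag byte that is needed.
    last = (max(present) - 1) // 7
    out = bytearray()
    for i in range(last + 1):
        byte = 0
        for bit in range(7):
            if 7 * i + bit + 1 in present:
                byte |= 0x80 >> bit
        # FX is 1 exactly when another flag byte follows.
        if i < last:
            byte |= 0x01
        out.append(byte)
    return bytes(out)
```

With only field 9 present, this yields two bytes: flag byte 1 carries no
presence flags but has FX set, and flag byte 2 carries the flag for field 9.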

For the sake of discussion, let's call the presence flags p1 to p21, so the
fspec flags bytes are:

fspec1: p1,  p2,  p3,  p4,  p5,  p6,  p7,  fx1
fspec2: p8,  p9,  p10, p11, p12, p13, p14, fx2
fspec3: p15, p16, p17, p18, p19, p20, p21

(fspec3 has a final unused bit. It uses only the first 7 bits of the byte.)

We'd like to compute the entire flag byte vector including the existence of
each flag byte, based on the existence of the subsequent fields, which we
can call f1 to f21.

We do this with dfdl:newVariableInstance like so:
<dfdl:newVariableInstance ref="s:p1Exists">{ fn:exists(../f1)
}</dfdl:newVariableInstance>
...
<dfdl:newVariableInstance ref="s:p21Exists">{ fn:exists(../f21)
}</dfdl:newVariableInstance>

<dfdl:newVariableInstance ref="s:fspec3Exists">{ $s:p15Exists or
$s:p16Exists or ... $s:p21Exists }</dfdl:newVariableInstance>
<dfdl:newVariableInstance ref="s:fspec2Exists">{ $s:fspec3Exists or
$s:p8Exists or $s:p9Exists or ... $s:p14Exists }</dfdl:newVariableInstance>

That allows us to then easily compute the values of each flag bit, and of
each FX bit.

But at unparse time we have no way to create the fspec2 or fspec3 element
conditionally, based on $s:fspec2Exists or $s:fspec3Exists.

It appears this is not possible in DFDL v1.0, because
dfdl:occursCountKind 'expression' does not evaluate the expression at
unparse time.

Nor does choice evaluate a choice dispatch key at unparse time.

So it appears we have no way to evaluate an expression at unparse time, and
use the value of that expression to decide to create an element in the
augmented infoset that was not present in the initial infoset at the start
of unparsing.

Hence, we cannot cause the creation of these optional fspec flag bytes
based on the existence of subsequent fields when unparsing.

Nor can we simply always emit all three flag bytes, because it is illegal
in Asterix for the last flag byte to contain only zero bits.

So I believe DFDL cannot properly compute the fspec flag bytes based on the
existence of the subsequent fields. I think application logic outside of
the DFDL schema has to replicate the logic about whether a flag byte needs
to exist or not, and place the corresponding fspec elements into the
infoset.
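
As a rough idea of what that application logic could look like, here is a
Python sketch that adds the fspec elements to an XML infoset before it is
handed to the unparser. Everything about the infoset shape is an
assumption for illustration: a message element with child fields named f1
to f21, flag elements named fspec1 to fspec3 holding p1 to p21, and FX
elements named fx1 and fx2. A real schema would likely differ.

```python
import xml.etree.ElementTree as ET


def add_fspec_elements(message, max_bytes=3):
    """Insert fspec1..fspecN at the front of `message`, in place.

    Hypothetical infoset shape: fields are children named f1..f21.
    """
    present = {n for n in range(1, 7 * max_bytes + 1)
               if message.find(f"f{n}") is not None}
    if not present:
        raise ValueError("no fields present")
    # Number of flag bytes needed to cover the highest present field.
    count = (max(present) - 1) // 7 + 1
    for i in range(1, count + 1):
        fspec = ET.Element(f"fspec{i}")
        for n in range(7 * (i - 1) + 1, 7 * i + 1):
            ET.SubElement(fspec, f"p{n}").text = "1" if n in present else "0"
        # The final possible flag byte has no FX bit, only an unused bit.
        if i < max_bytes:
            ET.SubElement(fspec, f"fx{i}").text = "1" if i < count else "0"
        message.insert(i - 1, fspec)
    return message
```

This duplicates, outside the schema, exactly the existence logic that the
variables above already express inside it.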

DFDL v1.0 comes very close. Either of two capabilities would enable us to
fully capture the Asterix format: the ability to evaluate an expression at
unparse time and select a choice branch based on it, or the ability to
evaluate an occursCount expression at unparse time to determine whether an
element exists in the infoset.

Mike Beckerle
Apache Daffodil PMC | daffodil.apache.org
OGF DFDL Workgroup Co-Chair | www.ogf.org/ogf/doku.php/standards/dfdl/dfdl
Owl Cyber Defense | www.owlcyberdefense.com