[DFDL-WG] RFC2119 keywords and capitalization e.g., must vs. MUST

Steve Hanson smh at uk.ibm.com
Thu Sep 17 15:30:05 EDT 2020


As discussed on the WG call, let's upper-case only the category (1) cases 
for all affected words, and add a section explaining this. 

Regards
 
Steve Hanson
IBM Hybrid Integration, Hursley, UK
Architect, IBM DFDL
Co-Chair, OGF DFDL Working Group
smh at uk.ibm.com
tel:+44-1962-815848
mob:+44-7717-378890
Note: I work Tuesday to Friday 



From:   Mike Beckerle <mbeckerle.dfdl at gmail.com>
To:     DFDL-WG <dfdl-wg at ogf.org>
Date:   15/09/2020 17:49
Subject:        [EXTERNAL] [DFDL-WG] RFC2119 keywords and capitalization 
e.g., must vs. MUST
Sent by:        "dfdl-wg" <dfdl-wg-bounces at ogf.org>



I am looking for guidance here while editing the DFDL spec. 

An open topic has been the RFC2119 keywords. Many specs just refer to 
these and say "we use RFC2119 conventions."

We can't do that. We use keywords MUST, SHOULD, SHALL, etc. in 3 ways.

1) Discussing requirements/desires for all implementations - e.g., all 
implementations MUST detect schema definition errors. 

2) Discussing "requirements" that a DFDL schema author must satisfy in 
order to avoid an SDE. E.g., "The DFDL fillByte property must be a single 
byte or single character." Many times we follow this sort of requirement 
with "and it is an SDE otherwise." It is implied that implementations must 
detect this SDE and provide a diagnostic to the user. 


3) Discussing "requirements" that data must conform to, in order for the 
data to conform to a DFDL schema. E.g., "The representation must be 
followed by the terminating delimiter." The assumption in these cases is 
that it is a processing error if the terminator is not found. It is the 
way these runtime behaviors are expressed that is of note. They are not 
phrased in terms of how implementations must behave, but rather in terms 
of what users should expect. My use of the word "should" here is 
intentional - we do in fact use "should" in many cases to imply required 
behavior, not suggested behavior. 

These different categories of requirement are normal in the description 
of a language standard, and category (3) is to some degree unique to a 
description language, which describes other formal objects that may or 
may not in fact correspond to the description. E.g., 
there are unit test frameworks for code where one writes things like 
"add(1, 1) should equal 2", where "should" means the same thing as "shall" 
or "must". 
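
To illustrate the unit-test analogy, here is a minimal Python sketch. The 
names add and should_equal are hypothetical, invented for this example, 
and are not taken from any real test framework. The point is only that 
"should" in such a phrase is a hard assertion, not a recommendation: 

```python
def add(a, b):
    """Function under test (hypothetical example)."""
    return a + b


def should_equal(actual, expected):
    """Hypothetical BDD-style helper.

    Despite the word "should", a mismatch is a failure, which is the
    sense of "must"/"shall" rather than the RFC2119 sense of SHOULD.
    """
    if actual != expected:
        raise AssertionError(f"expected {expected!r}, got {actual!r}")
    return True


# Reads as: "add(1, 1) should equal 2"
should_equal(add(1, 1), 2)
```

If add(1, 1) returned anything other than 2, the helper would raise 
rather than merely warn. 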

Next: the term "may" is used to mean 

(1) optional per RFC2119

(2) "allows" or "admits" when describing DFDL schemas. E.g., "If the nil 
representation may not be zero length, but the empty representation is 
zero length, then the absent representation cannot occur because 
zero-length will be interpreted as the empty representation."

(3) discussing possible alternatives for data. E.g., "Encodings which are 5, 
6, 7, and 9 bits wide are rare, but do exist, so the overall length of the 
content region may not be a multiple of 8 bits wide."

My current working draft (r27, not yet on Redmine, but soon) has put all 
the category (1) usages in upper case (for MUST, SHOULD, SHALL, MAY, 
etc.). 

I have left all the category (2) and (3) as lower case, as there are many 
of them, and they really do need to be distinguished from category (1). 

We have to explain these 3 different kinds of usage if we're going to 
reference RFC2119 at all. But I am seriously considering whether we should 
just say we DON'T use RFC2119 conventions but rather explain these 3 
categories of usage. 

Next, the RFC2119 terms "required" and "optional" are used: 

(1) about requirements that MUST be implemented vs. those that are 
OPTIONAL but RECOMMENDED, and if implemented are REQUIRED to be 
implemented in a conforming manner.

(2) about language features that are optional for use. E.g., the testKind 
property of a dfdl:assert annotation is optional, and assumed to be 
"expression", not "pattern" if not provided.

(3) as formal terms to discuss when default values are used or not for 
array/optional elements. 

I propose that category (1) above is again all caps, but categories (2) 
and (3) are not. 

I have not made any such changes as yet. 

So, what I'm looking for is feedback on these ideas. My plan is to explain 
these 3 categories of "requirements" in the section on Terminology and 
Conventions, and to mention that the RFC2119 convention applies only to 
the first category. 
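
A convention based purely on capitalization can be checked mechanically 
against a draft. Below is a rough Python sketch, assuming plain-text 
input. The function name split_by_case and the keyword list are 
illustrative choices made for this example, not tooling the working group 
uses. It only separates occurrences by case; deciding whether a given 
usage really belongs to category (1) still needs a human reader. 

```python
import re

# Single-word keywords discussed in this thread (an illustrative list).
KEYWORDS = ("must", "shall", "should", "may",
            "required", "recommended", "optional")
_PATTERN = re.compile(r"\b(" + "|".join(KEYWORDS) + r")\b", re.IGNORECASE)


def split_by_case(text):
    """Return (upper, lower): keyword occurrences grouped by capitalization.

    Under the proposed convention, all-caps occurrences are category (1);
    everything else is category (2) or (3).
    """
    upper, lower = [], []
    for match in _PATTERN.finditer(text):
        word = match.group(0)
        (upper if word.isupper() else lower).append(word)
    return upper, lower


sample = (
    "All implementations MUST detect schema definition errors. "
    "The DFDL fillByte property must be a single byte or single character."
)
upper, lower = split_by_case(sample)
```

With the two example sentences from this message, upper holds the one 
category (1) usage and lower holds the schema-author usage. 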

Comments?

Mike Beckerle | OGF DFDL Workgroup Co-Chair | Owl Cyber Defense | 
www.owlcyberdefense.com
Please note: Contributions to the DFDL Workgroup's email discussions are 
subject to the OGF Intellectual Property Policy
--
  dfdl-wg mailing list
  dfdl-wg at ogf.org
  
https://www.ogf.org/mailman/listinfo/dfdl-wg 



Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 
741598. 
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU