[nm-wg] Specifying units

Dan Gunter dkgunter at lbl.gov
Wed Sep 28 11:00:57 CDT 2005


Andrea Westerinen (andreaw) wrote:

>In the DMTF, we address the units issue by defining it in the
>meta-data/schema and requiring that implementations return the dictated
>units (btw, percent is a type of "unit").  So, having the schema
>definition provides the unit information.  This has required
>implementations to convert kbps to bps, etc. but saves the client from
>doing instance-level conversions.
>
>  
>
Thanks for the input, Andrea, it is good to know how other groups are 
handling this issue.

My $0.02: to me, this is a somewhat XML-Schema-centric view of the 
world. Other schema languages, e.g. RELAX NG, do not modify the infoset 
(they do not define default values for attributes). The disadvantage of 
carrying the units in the instance is potentially redundant information 
in each message. The advantage is that a receiver without the schema 
still has the unit information (also consider what happens when the 
units change between schema versions).

Martin has also argued that whether the sender or the receiver is the 
more natural (convenient and/or correct) place for unit conversion 
depends on the situation. To throw a spanner into the works, I would 
even argue that two measurements with the same units and value are not 
necessarily the same result, even approximately, unless you also 
consider resolution, accuracy, and precision. Oh dear, I've said too 
much; I can hear the black helicopters circling overhead.. :-)
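
A minimal sketch of the resolution point, in Python. The `Measurement`
class, the `resolution` field, and the comparison rule (two values are
indistinguishable if they differ by no more than the coarser resolution)
are illustrative assumptions, not anything defined by the NM-WG schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    value: float
    units: str
    resolution: float  # smallest distinguishable step, same units as value


def indistinguishable(a: Measurement, b: Measurement) -> bool:
    """True if a and b cannot be told apart at the coarser resolution."""
    if a.units != b.units:
        raise ValueError("convert to common units before comparing")
    return abs(a.value - b.value) <= max(a.resolution, b.resolution)


# Same value and units, very different resolution (made-up numbers).
coarse = Measurement(100.0, "Mbps", resolution=10.0)
fine = Measurement(100.0, "Mbps", resolution=0.1)
other = Measurement(104.0, "Mbps", resolution=0.1)

print(indistinguishable(coarse, other))
print(indistinguishable(fine, other))
```

Here `coarse` and `fine` carry the same value and units, yet only `fine`
can be distinguished from `other`.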

-Dan

>Andrea
>
>  
>
>>-----Original Message-----
>>From: owner-nm-wg at ggf.org [mailto:owner-nm-wg at ggf.org] On 
>>Behalf Of Martin Swany
>>Sent: Monday, September 26, 2005 3:54 PM
>>To: nm-wg at ggf.org
>>Subject: Re: [nm-wg] Specifying units
>>
>>Hi Loukik,
>>
>>
>>    
>>
>>>It has come to our notice that the messages in v2 responses do not 
>>>specify units while giving back measurement data (ex: Bandwidth 
>>>Utilization and Capacity). Specifying such units is necessary and 
>>>messages should be enhanced to support this.
>>>
>>>      
>>>
>>I think that what you're actually saying is that the 
>>PerfSONAR prototype doesn't return units.  The NM-WG v2 schema dated
>>20050802 actually includes the units in almost every 
>>measurement and this has been in the schema for a long time.
>>
>>We've talked a lot about the way to do it.  The current 
>>examples feature units in the datum element, but as you note, 
>>that wastes space.  Some of the examples also depict it as 
>>part of the metadata (which is what I've been mostly in favor 
>>of as the least offensive current option.)  There are  
>>definitely issues with that -- mainly that the way that the 
>>data is stored is really different than the way in which it 
>>is collected.
>>
>>So, it can go in parameters, but it might be nice if it were 
>>able to be presented in the data section, but not in every datum.
>>We've discussed this one as well, and thus far there have 
>>been no good solutions (or none that we found generally workable.)
>>
>>
>>    
>>
>>>we could specify it just once..maybe like this:
>>>
>>><perfsonar:data dataUnits="bps">
>>>
>>>      
>>>
>>This really requires the data element to be in a specific 
>>namespace or to become an omnibus for all the things that 
>>might be common in the enclosed datum elements.  There could 
>>be other numeric values in the datum that require unit 
>>information and we'd have to add support for each of them to data.
>>
>>For example, from the current schema:
>><iperf:datum interval="2.0-3.0 sec" numBytes="231"  
>>numBytesUnits="MBytes" value="1.94" valueUnits="MBytes/sec"/>
>>
>>There are multiple numeric values that need unit qualification.
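
A rough sketch of how a client might read such a datum, in Python. It
assumes a naming convention in which the units for attribute `X` live in
`XUnits`, as the example above suggests; the namespace URI is a
placeholder, and `unit_qualified` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

IPERF_NS = "http://example.org/iperf"  # placeholder, not the real namespace

xml_text = (
    f'<iperf:datum xmlns:iperf="{IPERF_NS}" interval="2.0-3.0 sec" '
    'numBytes="231" numBytesUnits="MBytes" '
    'value="1.94" valueUnits="MBytes/sec"/>'
)


def unit_qualified(datum):
    """Pair each numeric attribute X with its XUnits companion, if any."""
    attrs = datum.attrib
    pairs = {}
    for name, raw in attrs.items():
        units = attrs.get(name + "Units")
        if units is not None:
            pairs[name] = (float(raw), units)
    return pairs


datum = ET.fromstring(xml_text)
pairs = unit_qualified(datum)
print(pairs)
```

Each numeric value comes back with its own units, which is exactly what
a single `dataUnits` attribute on the enclosing data element could not
express.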
>>
>>
>>    
>>
>>><perfsonar:data>
>>><perfsonar:units dataUnits="bps"/>
>>>
>>>      
>>>
>>This one is even more thorny.  We referred to this as the 
>>"older sibling" model as it makes siblings dependent on 
>>order, and we decided to avoid that so we could use things 
>>like hashes.
>>
>>
>>    
>>
>>>or something else...
>>>
>>>      
>>>
>>Something else is really where we are now.   What we
>>have discussed is a general way to "factor out" common 
>>attributes or elements from a set of datum elements.
>>CommonTime is an example of this, but a general mechanism 
>>would be nice.  I  proposed something before where an 
>>enclosed element in a datum could enclose a set of datum 
>>elements indicating that this value was common.  It was 
>>greeted with a mixture of animosity and indifference, often 
>>coexisting in the same person's reaction.
>>
>>Actually, I think that the newly-discussed "bag of parameters"
>>might be a partial solution to this problem but it still 
>>doesn't help when the things common to a set of datums are 
>>complex and not simple attributes (like a time range.)
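
One way to picture "factoring out" common attributes, sketched in
Python over plain dictionaries. `factor_common` is a hypothetical
helper and the second datum is made up for illustration; this is not
the mechanism proposed on the list. It handles only simple attributes
that are identical in every datum.

```python
def factor_common(datums):
    """Split a list of attribute dicts into (common, per-datum remainder)."""
    if not datums:
        return {}, []
    common = dict(datums[0])
    for d in datums[1:]:
        # Keep only attributes present with the same value in this datum too.
        common = {k: v for k, v in common.items() if k in d and d[k] == v}
    rest = [{k: v for k, v in d.items() if k not in common} for d in datums]
    return common, rest


datums = [
    {"interval": "2.0-3.0 sec", "value": "1.94", "valueUnits": "MBytes/sec"},
    {"interval": "3.0-4.0 sec", "value": "2.10", "valueUnits": "MBytes/sec"},
]
common, rest = factor_common(datums)
print(common)
print(rest)
```

The units are stated once and each datum keeps only what varies. A
shared time range built from several values would not fall out of this
approach, which is the limitation noted above.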
>>
>>
>>    
>>
>>>The second comment is: Choice of units.
>>>
>>>After a chat with Jeff on this, I can list out two options
>>>
>>>Jeff suggested that: Service uses the units that the data is already 
>>>in (for ex. in rrd tool, data is in octets per second.
>>>Hence service continues to provide data in octets per second) and 
>>>continues to return data in the same units. However, the units in use 
>>>should be clearly specified using any of the above suitable methods.
>>>
>>>      
>>>
>>Data in RRDTool is not necessarily in octets per second, BTW. 
>> Interface utilization data is generally fetched from an 
>>octet counter (just to be pedantic.)
>>
>>
>>    
>>
>>>An option that I would like to propose is usage of units used in 
>>>common practice. For example, bandwidth, as known to me, is more 
>>>commonly expressed in bps (and their factors).  A service should hence 
>>>*reasonably* strive to use the units that are in common practice. 
>>>Either way, specification of units using any of the suitable methods 
>>>mentioned previously is absolutely necessary.
>>>
>>>      
>>>
>>I agree that reasonably trying to use units in common 
>>practice is a good goal.  Many discussions over many years 
>>have led me to believe that in many cases, common practices 
>>aren't so common.
>>For instance, bandwidth is in bits per second when talking 
>>about link capacity or sending bogus test data, but is often 
>>in bytes per second when an application is using the data.  
>>I think that trying to mandate a "best practice" is a 
>>slippery slope. I vote for unambiguous specification and 
>>easy translation.
>>
>>
>>    
>>
>>>Nevertheless, if a service returns capacity and utilization in the 
>>>same message, it would be nice to have them both in the same units 
>>>(unlike the current case with Perfsonar prototype where capacity is 
>>>bps and utilization is octets per second)
>>>
>>>Question here is: Which option is ideal? Should we provide capacity 
>>>and utilization in the same units in our prototype?
>>>
>>>      
>>>
>>We can let everyone else weigh in, but if the units are 
>>specified (as they can easily be) then why not just divide 
>>and convert?  That seems easier to me than forcing one or 
>>the other into a less-than-natural format.
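
A small sketch of "divide and convert" in Python, assuming both values
arrive with explicit units. The unit strings `"bps"` and `"octets/sec"`
and the sample numbers are assumptions for illustration, not names or
data taken from the schema or the prototype.

```python
# Conversion factors to bits per second (1 octet = 8 bits).
BITS_PER_SECOND = {"bps": 1.0, "octets/sec": 8.0}


def to_bps(value, units):
    """Convert a rate to bits per second using its declared units."""
    return value * BITS_PER_SECOND[units]


def utilization_fraction(util, util_units, cap, cap_units):
    """Utilization as a fraction of capacity, whatever units each came in."""
    return to_bps(util, util_units) / to_bps(cap, cap_units)


# Capacity reported in bps, utilization in octets per second.
fraction = utilization_fraction(12_500_000, "octets/sec",
                                1_000_000_000, "bps")
print(f"{fraction:.0%}")
```

Because each value carries its units, the client can compare them
without the service having to return both in the same format.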
>>
>>martin
>>
>>    
>>
>
>  
>
