[nm-wg] Specifying units

Martin Swany swany at cis.udel.edu
Mon Sep 26 17:53:31 CDT 2005


Hi Loukik,


> It has come to our notice that the messages in v2 responses do not  
> specify units while giving back measurement data (ex: Bandwidth  
> Utilization and Capacity). Specifying such units is necessary and  
> messages should be enhanced to support this.
>

I think that what you're actually saying is that the PerfSONAR
prototype doesn't return units.  The NM-WG v2 schema dated
20050802 actually includes the units in almost every measurement
and this has been in the schema for a long time.

We've talked a lot about the way to do it.  The current examples
feature units in the datum element, but as you note, that wastes
space.  Some of the examples also depict it as part of the metadata
(which is what I've been mostly in favor of as the least offensive
current option.)  There are definitely issues with that -- mainly that
the way that the data is stored is really different from the way in
which it is collected.

So, it can go in parameters, but it might be nice if it could be
presented once in the data section rather than in every datum.
We've discussed this one as well, and thus far there have been
no good solutions (or none that we found generally workable.)


> we could specify it just once..maybe like this:
>
> <perfsonar:data dataUnits="bps">
>

This really requires the data element to be in a specific namespace
or to become an omnibus for all the things that might be common
in the enclosed datum elements.  There could be other numeric
values in the datum that require unit information and we'd have to
add support for each of them to data.

For example, from the current schema:
<iperf:datum interval="2.0-3.0 sec" numBytes="231"  
numBytesUnits="MBytes" value="1.94" valueUnits="MBytes/sec"/>

There are multiple numeric values that need unit qualification.
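
To make that concrete, here is a minimal Python sketch that pairs each
numeric attribute in the datum above with its units attribute.  The
"<name>Units" suffix convention is read off the example; the function
name and the namespace URI are placeholders assumed for illustration,
not part of the schema.

```python
import xml.etree.ElementTree as ET

# The namespace URI is a placeholder so that the "iperf" prefix resolves.
XML = (
    '<iperf:datum xmlns:iperf="http://example.org/iperf-placeholder" '
    'interval="2.0-3.0 sec" numBytes="231" numBytesUnits="MBytes" '
    'value="1.94" valueUnits="MBytes/sec"/>'
)

def values_with_units(datum):
    """Map each attribute that has a '<name>Units' partner to (value, units)."""
    attrs = datum.attrib
    return {
        name: (float(val), attrs[name + "Units"])
        for name, val in attrs.items()
        if name + "Units" in attrs
    }

pairs = values_with_units(ET.fromstring(XML))
```

Both numBytes and value come back with their own units, which is why a
single dataUnits attribute on data would not be enough here.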


> <perfsonar:data>
> <perfsonar:units dataUnits="bps"/>
>

This one is even more thorny.  We referred to this as the
"older sibling" model as it makes siblings dependent on
order, and we decided to avoid that so we could use things
like hashes.
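
A small sketch of why the order dependence is awkward.  It uses
un-namespaced element names and a hypothetical interpret() helper,
neither of which comes from the schema; the point is only that the same
children in a different order yield a different reading.

```python
import xml.etree.ElementTree as ET

def interpret(xml_text):
    """Apply each <units> element to the <datum> siblings that follow it."""
    current = None
    out = []
    for child in ET.fromstring(xml_text):
        if child.tag == "units":
            current = child.get("dataUnits")
        elif child.tag == "datum":
            out.append((child.get("value"), current))
    return out

in_order = ('<data><units dataUnits="bps"/>'
            '<datum value="10"/><datum value="20"/></data>')
shuffled = ('<data><datum value="10"/>'
            '<units dataUnits="bps"/><datum value="20"/></data>')

ordered_result = interpret(in_order)
shuffled_result = interpret(shuffled)
```

Once the children are stored in something unordered, the first datum in
the shuffled case has no units at all.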


> or something else...
>

Something else is really where we are now.  What we
have discussed is a general way to "factor out" common
attributes or elements from a set of datum elements.
CommonTime is an example of this, but a general
mechanism would be nice.  I proposed something
before where an enclosed element in a datum could
enclose a set of datum elements indicating that this
value was common.  It was greeted with a mixture of
animosity and indifference, often coexisting in the
same person's reaction.

Actually, I think that the newly-discussed "bag of parameters"
might be a partial solution to this problem but it still doesn't
help when the things common to a set of datums are complex
and not simple attributes (like a time range.)
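
As a rough sketch of the "factor out" idea for the simple-attribute
case only, the following Python treats each datum as a dict of
attributes and splits off whatever is identical across the set.  The
function name and the dict representation are assumptions made for
illustration; nothing like this is defined in the schema.

```python
def factor_common(datums):
    """Split a list of attribute dicts into (common, per_datum_remainders)."""
    if not datums:
        return {}, []
    first = datums[0]
    # An attribute is common if every other datum carries the same value.
    common = {
        k: v for k, v in first.items()
        if all(d.get(k) == v for d in datums[1:])
    }
    rest = [{k: v for k, v in d.items() if k not in common} for d in datums]
    return common, rest

datums = [
    {"value": "1.94", "valueUnits": "MBytes/sec"},
    {"value": "2.10", "valueUnits": "MBytes/sec"},
]
common, rest = factor_common(datums)
```

This covers flat attributes such as units.  It does nothing for the
complex case mentioned above, where the shared item is itself a
structure like a time range.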


> The second comment is: Choice of units.
>
> After a chat with Jeff on this, I can list out two options
>
> Jeff suggested that: Service uses the units that the data is  
> already in (for ex. in rrd tool, data is in octets per second.  
> Hence service continues to provide data in octets per second) and  
> continues to return data in the same units. However, the units in  
> use should be clearly specified using any of the above suitable  
> methods.
>

Data in RRDTool is not necessarily in octets per second, BTW.  Interface
utilization data is generally fetched from an octet counter (just to  
be pedantic.)
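
For illustration, turning two samples of such a counter into a rate
might look like the sketch below.  The function name is hypothetical,
and the 32-bit default width and the single-wrap handling are
assumptions, not something RRDTool or the schema prescribes.

```python
def octet_rate(prev_count, curr_count, interval_s, counter_bits=32):
    """Octets per second between two counter samples, allowing one wrap."""
    if interval_s <= 0:
        raise ValueError("interval must be positive")
    # Modular subtraction absorbs a single wrap of the counter.
    delta = (curr_count - prev_count) % (2 ** counter_bits)
    return delta / interval_s

steady = octet_rate(1000, 4000, 300)
wrapped = octet_rate(2 ** 32 - 100, 200, 10)
```

The stored number is therefore a derived rate, and its units depend on
how the collector chose to compute it.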


> An option that I would like to propose is usage of units used in  
> common practice. For example, bandwidth, as known to me, is more  
> commonly expressed in bps (and their factors).  A service should  
> hence *reasonably* strive to use the units that are in common  
> practice. Either way, specification of units using any of the  
> suitable methods mentioned previously is absolutely necessary.
>

I agree that reasonably trying to use units in common practice is a
good goal.  Many discussions over many years have led me to
believe that in many cases, common practices aren't so common.
For instance, bandwidth is in bits per second when talking about
link capacity or sending bogus test data, but is often in bytes per
second when an application is using the data.  I think that trying to
mandate a "best practice" is a slippery slope. I vote for unambiguous
specification and easy translation.


> Nevertheless, if a service returns capacity and utilization in the  
> same message, it would be nice to have them both in the same units  
> (unlike the current case with Perfsonar prototype where capacity is  
> bps and utilization is octets per second)
>
> Question here is: Which option is ideal? Should we provide capacity  
> and utilization in the same units in our prototype?
>

We can let everyone else weigh in, but if the units are specified (as they
can easily be) then why not just divide and convert?  That seems easier
to me than forcing one or the other into a less-than-natural format.
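
A minimal sketch of "divide and convert", assuming the message carries
a unit string alongside each value.  The unit strings, the table, and
the function names are illustrative only.

```python
# Conversion factors to bits per second; the unit strings are assumed.
TO_BPS = {"bps": 1.0, "octets/sec": 8.0}

def to_bps(value, units):
    """Convert a rate to bits per second using the declared units."""
    return value * TO_BPS[units]

def utilization_fraction(util, util_units, cap, cap_units):
    """Utilization as a fraction of capacity, whatever units each arrived in."""
    return to_bps(util, util_units) / to_bps(cap, cap_units)

# 1,250,000 octets/sec on a 100,000,000 bps link.
fraction = utilization_fraction(1_250_000, "octets/sec", 100_000_000, "bps")
```

An unknown unit string raises a KeyError here, which is the failure you
want if the specification is supposed to be unambiguous.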

martin





