[drmaa-wg] C Binding

Thu Jan 13 02:48:14 CST 2005

> In a previous e-mail, Daniel Templeton wrote:
>> I've started to work on the C binding doc.  
>> There's something I've been dying to do since I started with DRMAA, 
>> and now that I have my hands on the C spec, I think it's a good 
>> time to do it.
> 
> Alarming attitude.

That's me.  I'm a wild man.  I also rip the tags off of mattresses.

> It is my understanding that the C spec is being revised with the 
> intent of making it a stand-alone document.   Although I would
> expect this to entail vast changes to the document, the technical
> content should be unchanged.
> This will make revision comparisons very difficult.
> 
> Please continue to propose changes during your editing cycle
> so they can be discussed and potentially incorporated into
> the spec after the changes to make it a stand-alone document.

A very reasonable suggestion.  Sure.

>> That is namely to fix the drmaa_attr_*_t structures.  
>> They are currently unusable.  In order to do anything useful
>> with them, we need either a way to get the count of the elements
>> in the structure or to reset the cursor to the beginning of the
>> list or both.
>> 
>> 
>> Here are the 6 new functions I propose:
>> 
>> int drmaa_get_num_attr_names(drmaa_attr_names_t* values, int* count);
>> int drmaa_get_num_attr_values(drmaa_attr_values_t* values, int* count);
>> int drmaa_get_num_attr_ids(drmaa_attr_ids_t* values, int* count);
>> int drmaa_reset_attr_names(drmaa_attr_names_t* values);
>> int drmaa_reset_attr_values(drmaa_attr_values_t* values);
>> int drmaa_reset_attr_ids(drmaa_attr_ids_t* values);
>> 
>> I strongly recommend we at least add one set or the other.  My
>> preference would be both, but I think the first set is the more
>> important.
> 
> The drmaa_get_next_* routines provide a mechanism to obtain
> every element in the data structure (once).
> I would expect many, if not most, callers of these routines
> to insert the elements into a data structure which is suitable
> for their usage.
> I am of the opinion that the existing routines are sufficient
> for incrementally constructed data structures.
> Although, if the caller is building something like an array, 
> there is likely to be a resizing and/or memory copying overhead.
> 
>> Does anyone have anything against adding these functions?
> 
> I don't think the convenience of the proposed functions
> is sufficient to require their implementation.
> 
> The agreed-upon interface for the drmaa_get_next_* routines
> accepts a buffer for the primary output value.
>  For example:
>    int drmaa_get_next_attr_name(drmaa_attr_names_t* values,
>                                 char *value, size_t value_len);
> Thus, the drmaa implementation is not required to retain 
> information once it has been provided to the application.
> 
> If it is important to iterate over values more than once,
> then presumably the DRMAA implementation should retain the
> entire set of values.  I'd prefer to let the application
> decide when to retain values.

So, your belief is that an implementation may only fetch values as 
required so it doesn't have to store them all in memory?  I can see 
where that would be useful, but I don't see where that applies to DRMAA. 
  The things you get back as one of these structs are the job ids from a 
drmaa_run_bulk_jobs() call, attribute names from 
drmaa_get_attribute_names(), and values from 
drmaa_get_vector_attribute_value().  In all of these cases, the 
implementation has to store the full list of values internally anyway, 
unless it writes everything into a database so that it can read it out 
incrementally, and even then it still at some point had to have the 
whole thing in memory so it could write it out in the first place 
(except maybe the job ids...).  To me, this is a purely academic 
argument against a practical consideration.
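
To make the point concrete, here is a minimal sketch of how little the 
count and reset functions would cost an implementation that already 
holds the full list in memory.  Everything in it is an assumption for 
illustration: the struct layout, the sketch_ names, and the return 
codes are hypothetical and are not taken from any DRMAA header.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical return codes; a real binding would use its own. */
#define SKETCH_SUCCESS          0
#define SKETCH_INVALID_ARGUMENT 1

/* Hypothetical layout of an opaque list that keeps every element
 * in memory, plus a cursor for the get_next style of iteration. */
typedef struct {
    char **values;  /* all elements, already stored by the library */
    int    size;    /* number of elements in values */
    int    cursor;  /* index of the next element to hand out */
} sketch_attr_names_t;

/* Report the element count: a single field read. */
int sketch_get_num_attr_names(sketch_attr_names_t *values, int *count)
{
    if (values == NULL || count == NULL) {
        return SKETCH_INVALID_ARGUMENT;
    }
    *count = values->size;
    return SKETCH_SUCCESS;
}

/* Rewind the cursor so the list can be iterated again. */
int sketch_reset_attr_names(sketch_attr_names_t *values)
{
    if (values == NULL) {
        return SKETCH_INVALID_ARGUMENT;
    }
    values->cursor = 0;
    return SKETCH_SUCCESS;
}
```

An implementation that streams values instead of storing them would 
need more than this, which is the case the objection above is about.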

I am in the peculiar position of both developing an implementation and 
using the implementation in applications.  I have now written two "real" 
DRMAA apps, and both times I ran into a problem with not knowing how 
many elements were being returned to me.  In the first app, I cheated 
and loaded the header that defined the structs so I could do the math 
myself.  In the second app, I just declared a really big array, which is 
both a memory waste and dangerous.  The correct answer would be to use a 
char** and re-malloc it if I get too many results, but that particular 
idiom is very error prone.  I don't see why a developer should be forced 
to do dangerous and redundant things if it's avoidable.  Besides, 
knowing how easy it would be for the implementation to return the 
number of elements, it just seems sadistic to make developers jump 
through hoops for what should be a simple operation.  C is complicated 
enough without adding thorny interfaces.
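
For reference, the char** and re-malloc idiom looks roughly like the 
sketch below.  It does not call a real DRMAA library: mock_list_t and 
mock_get_next() are hypothetical stand-ins that only mimic the 
buffer-filling shape of the drmaa_get_next_* routines, and the 
256-byte buffer and doubling growth policy are arbitrary choices.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an opaque DRMAA list; not the real type. */
typedef struct {
    const char **items;
    size_t size;
    size_t pos;
} mock_list_t;

/* Mimics the shape of a get_next call: copies the next element into
 * the caller's buffer; returns 0 on success, 1 when exhausted. */
int mock_get_next(mock_list_t *list, char *value, size_t value_len)
{
    if (list->pos >= list->size || value_len == 0) {
        return 1;
    }
    strncpy(value, list->items[list->pos++], value_len - 1);
    value[value_len - 1] = '\0';
    return 0;
}

/* Drains the list into a heap array without knowing its length in
 * advance, doubling the capacity whenever it fills up.  Returns NULL
 * on allocation failure; otherwise *count holds the element count. */
char **collect_all(mock_list_t *list, size_t *count)
{
    size_t cap = 4, n = 0;
    char buf[256];
    char **out;

    assert(list != NULL && count != NULL);
    out = malloc(cap * sizeof *out);
    if (out == NULL) {
        return NULL;
    }
    while (mock_get_next(list, buf, sizeof buf) == 0) {
        if (n == cap) {
            /* Keep the old pointer until realloc is known to succeed. */
            char **tmp = realloc(out, 2 * cap * sizeof *out);
            if (tmp == NULL) {
                goto fail;
            }
            out = tmp;
            cap *= 2;
        }
        out[n] = malloc(strlen(buf) + 1);
        if (out[n] == NULL) {
            goto fail;
        }
        strcpy(out[n], buf);
        n++;
    }
    *count = n;
    return out;

fail:
    while (n > 0) {
        free(out[--n]);
    }
    free(out);
    return NULL;
}
```

Each of the failure paths and the capacity bookkeeping is a place to 
get it wrong; with a count available up front, the whole loop 
collapses to one malloc of the right size.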

To be fair, the number of ids returned from drmaa_run_bulk_jobs() can be 
calculated from start, end, and incr, but I see it as needlessly 
dangerous to make the developer do math to guess what the API is going 
to do, especially when that math is related to memory allocation.  The 
less thinking the developer has to do, the fewer array overflows he's 
going to have.
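
The math in question is small but easy to get off by one.  A sketch, 
assuming the job indices run start, start+incr, ... without exceeding 
end, and that start >= 1 and incr >= 1 (bulk_job_count is a 
hypothetical helper, not part of the API):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of indices in the sequence
 * start, start+incr, ... that do not exceed end.
 * Returns 0 for ranges outside the assumed preconditions. */
size_t bulk_job_count(int start, int end, int incr)
{
    if (start < 1 || incr < 1 || start > end) {
        return 0;
    }
    /* Integer division drops the partial step; +1 counts start itself. */
    return (size_t)((end - start) / incr) + 1;
}
```

For start=1, end=10, incr=3 the indices are 1, 4, 7, 10, so the 
helper returns 4; forgetting the +1 would under-allocate by one id.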

This would all be different if we were talking about Java or C++ where 
there's a linked list implementation already there, just waiting to be 
used, but we're not.  This is C, and when you're developing with sticks 
and stones, even the slightest gimme can mean a great deal.  Remember, 
adoption is important to us.

Daniel