[saga-rg] monitoring, again...
Andre Merzky
andre at merzky.net
Sun Nov 6 19:48:50 CST 2005
Hi,
as has been discussed recently, we want to have a simple way
to add callbacks to SAGA. The main use case for this would
be for notification on status changes for tasks and jobs.
Attached is the resurrected monitoring package as a
proposal to achieve this. Please note that the package
mentions steering in various places (mostly commented out).
Steering is not yet part of the proposal, but should be
seen as a possible path for future extension.
For simplicity, a code example is included. More feedback
on the topic is welcome. I Cc the GridRPC folx, as they
are probably interested in this.
Note that the example monitors not the task status, but some
other metric provided by that implementation (bytes written).
Cheers, Andre.
+-------------------------------------------------------------+
Examples:
monitoring example: monitor a write task
----------------------------------------
short version:
--------------
class write_metric_cb : public saga::metric::callback
{
  public:
    void callback (saga::metric & m)
    {
      std::cout << atoi ( m.get_attribute ("value") ) << " ";
    }
};

int main (int argc, char** argv)
{
  ssize_t     len = 0;
  std::string str ("Hello SAGA\n");
  std::string url (argv[1]);

  saga::file   f (url);
  saga::task   t = f.task.write (str, &len);
  saga::metric m = t.get_metric ("Written");

  write_metric_cb cb;
  m.add_callback (cb);

  t.wait ();
}
monitoring example: monitor a write task
----------------------------------------
long annotated version:
-----------------------
// this example shows how monitoring a task can be
// implemented
class write_metric_cb : public saga::metric::callback
{
  private:
    saga::task t;

  public:
    // keep a handle on the task, so the callback can query its status
    write_metric_cb (const saga::task & _t) : t (_t) { }

    void callback (saga::metric & m)
    {
      int len = atoi ( m.get_attribute ("value") );
      std::cout << "bytes written: " << len << std::endl;
      std::cout << "task status: " << t.get_status () << std::endl;
    }
};

int main (int argc, char** argv)
{
  ssize_t     len = 0;
  std::string str ("Hello SAGA\n");
  std::string url (argv[1]);

  saga::file   f (url);
  saga::task   t = f.task.write (str, &len);
  saga::metric m = t.get_metric ("Written");
  // assume that this m is a discrete metric indicating
  // the number of bytes already written. In general,
  // the list of metric names has to be searched for an
  // interesting metric if the name is not known.

  // add the callback; it needs the task to report its status
  write_metric_cb cb (t);
  m.add_callback (cb);

  // wait until the task is done, and give cb a chance to get
  // called a couple of times
  t.wait ();
}
+-------------------------------------------------------------+
--
+-----------------------------------------------------------------+
| Andre Merzky | phon: +31 - 20 - 598 - 7759 |
| Vrije Universiteit Amsterdam (VU) | fax : +31 - 20 - 598 - 7653 |
| Dept. of Computer Science | mail: merzky at cs.vu.nl |
| De Boelelaan 1083a | www: http://www.merzky.net |
| 1081 HV Amsterdam, Netherlands | |
+-----------------------------------------------------------------+
-------------- next part --------------
+-------------------------------------------------------------+
Monitoring & Steering
+-------------------------------------------------------------+
Summary:
========
The ability to query Grid entities about state is requested
in several SAGA use cases. Also, the SAGA Task model
incorporates a certain amount of task monitoring.
This package definition approaches the problem space of
monitoring to unify the various usage patterns (see
details), and to transparently incorporate SAGA task
monitoring.
A closely related topic is Steering, which cannot really be
treated independently of Monitoring: in the SAGA approach, a
future Steering extension may add to Monitoring the ability
to push values back to the Monitoring source.
+-------------------------------------------------------------+
Specification:
==============
package SAGA version 0.1 {
  package Monitoring {
    class Metric implements-all Attribute {
      // must be derived from to implement callbacks
      class CallBack {
        void callback (in Metric metric);
      }
      // add a callback, which gets invoked whenever
      // the metric changes
      addCallBack    (in  CallBack cb,
                      out int      cookie);
      // remove the callback identified by the cookie
      // which addCallBack returned
      removeCallBack (in  int      cookie);
      // steering: update the metric value
      // update      (in  string   value);
    }
    interface Monitorable {
      // introspection
      getNames  (out array<String,1> names);
      // get hook for monitoring/steering
      getMetric (in  String name,
                 out Metric metric);
      // set hook for steering
      addMetric (in  Metric metric);
    }
  }
}
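
To illustrate the cookie handling of addCallBack/removeCallBack,
here is a minimal, self-contained C++ sketch. It is not SAGA code:
the classes Metric, CallBack and Recorder, and the methods
add_callback, remove_callback, set_value and value are stand-ins
invented for this illustration, and a real implementation may well
look different.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

class Metric;

// Stand-in for the proposed CallBack class: users derive from it.
class CallBack {
  public:
    virtual ~CallBack() {}
    virtual void callback(Metric& metric) = 0;
};

// Stand-in for the proposed Metric class (assumption, not SAGA API).
class Metric {
  public:
    Metric() : next_cookie_(1) {}

    // Register a callback; the returned cookie identifies it later.
    int add_callback(CallBack& cb) {
        int cookie = next_cookie_++;
        callbacks_[cookie] = &cb;
        return cookie;
    }

    // Unregister by cookie; returns false if the cookie is unknown.
    bool remove_callback(int cookie) {
        return callbacks_.erase(cookie) > 0;
    }

    // Would be called by the implementation when the observable changes.
    void set_value(const std::string& v) {
        value_ = v;
        // Iterate over a copy, so a callback may unregister itself.
        std::map<int, CallBack*> snapshot = callbacks_;
        for (auto& entry : snapshot) entry.second->callback(*this);
    }

    const std::string& value() const { return value_; }

  private:
    std::map<int, CallBack*> callbacks_;
    int next_cookie_;
    std::string value_;
};

// Records every value seen while it is registered.
class Recorder : public CallBack {
  public:
    std::vector<std::string> seen;
    void callback(Metric& metric) { seen.push_back(metric.value()); }
};

// Registers a callback, changes the value, removes the callback,
// and changes the value again.
std::vector<std::string> observe_two_updates() {
    Metric m;
    Recorder rec;
    int cookie = m.add_callback(rec);
    m.set_value("10");          // rec is notified
    m.remove_callback(cookie);
    m.set_value("20");          // nobody is registered any more
    return rec.seen;
}
```

In this sketch only the first update reaches the callback, as the
cookie is used to remove it before the second one.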
+-------------------------------------------------------------+
#ifndef SHORT
Details:
========
class Metric
The fundamental object introduced in this package is a
Metric. A metric represents a discrete or continuous
observable, which can be readable, writable, or both.
A readable observable corresponds to classical monitoring,
a writable observable corresponds to steering.
The approach is severely limited by the use of SAGA
attributes for the description of a Metric, as these are
only defined in terms of string keys and values. Extending
the attribute definition with typed and complex values
would greatly improve the usability of this package, but
would also challenge its semantic simplicity.
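
As an illustration of what string-only attributes mean for a
consumer, the following self-contained C++ sketch stores a metric
description as a plain string map and converts the value to a number
only if the type attribute allows it. The typedef Attributes and the
functions example_progress_metric and numeric_value are made up for
this sketch and are not part of the proposal; the attribute values
are the example values given in this section.

```cpp
#include <cassert>
#include <cstdlib>
#include <map>
#include <string>

// Assumption: attributes are plain string key/value pairs.
typedef std::map<std::string, std::string> Attributes;

// Builds an attribute set from this section's example values.
Attributes example_progress_metric() {
    Attributes a;
    a["Name"]        = "file.transfer.progress";
    a["Description"] = "This metric gives the status of an ongoing "
                       "file transfer as percent completed.";
    a["Freq"]        = "CONTINUOUS";
    a["Mode"]        = "Read";
    a["Unit"]        = "percent (%)";
    a["Type"]        = "Float";
    a["Value"]       = "20.5";
    return a;
}

// Interprets "Value" as a number if "Type" is "Int" or "Float".
// Sets ok to false if the type is not numeric or parsing fails.
double numeric_value(const Attributes& a, bool& ok) {
    ok = false;
    Attributes::const_iterator type  = a.find("Type");
    Attributes::const_iterator value = a.find("Value");
    if (type == a.end() || value == a.end()) return 0.0;
    if (type->second != "Int" && type->second != "Float") return 0.0;

    const char* begin = value->second.c_str();
    char* end = 0;
    double d = std::strtod(begin, &end);
    ok = (end != begin && *end == '\0');  // whole string must be consumed
    return ok ? d : 0.0;
}
```

Every consumer has to repeat this kind of conversion and error
handling, which is the usability cost of untyped attributes.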
The metric MUST provide access to the following attributes:
"Name"       : short human readable name
               e.g. "file.transfer.progress"
"Description": extensive human readable description
               e.g. "This metric gives the status of an
                     ongoing file transfer as percent
                     completed."
"Freq"       : Discrete or Continuous
               e.g. "CONTINUOUS"
"Mode"       : "Read", "Write", "ReadWrite" or "Final"
               e.g. "Read"
"Unit"       : unit of values
               e.g. "percent (%)"
"Type"       : "String", "Int", "Float" etc.
               e.g. "Float"
"Value"      : value of the metric
               e.g. "20.5"
The name of the metric must be unique, as the get_metric
call needs to be able to identify the metric to return.
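
The uniqueness requirement can be sketched as follows. This is a
self-contained C++ illustration under the assumption that a
monitorable object keeps its metrics in a map keyed by name; the
types NamedMetric and Monitorable and their methods are hypothetical
and only loosely follow the interface in the specification.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Reduced stand-in for a metric: just a name and a value.
struct NamedMetric {
    std::string name;
    std::string value;
};

// Hypothetical monitorable object; metrics are keyed by name.
class Monitorable {
  public:
    // Refuses a second metric with an already known name.
    bool add_metric(const NamedMetric& m) {
        return metrics_.insert(std::make_pair(m.name, m)).second;
    }

    // Introspection: list the names of all known metrics.
    std::vector<std::string> get_names() const {
        std::vector<std::string> names;
        for (const auto& entry : metrics_) names.push_back(entry.first);
        return names;
    }

    // Lookup by name; returns a null pointer for unknown names.
    const NamedMetric* get_metric(const std::string& name) const {
        std::map<std::string, NamedMetric>::const_iterator it =
            metrics_.find(name);
        return it == metrics_.end() ? 0 : &it->second;
    }

  private:
    std::map<std::string, NamedMetric> metrics_;
};
```

With unique names, a lookup either returns exactly one metric or
none, so the caller never has to disambiguate.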
A writable metric can be updated, and hence provides
remote steering capabilities. However, that is currently
not part of the spec, but noted here as a path for future
extension of the interface.
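
How such an update could respect the "Mode" attribute is sketched
below in self-contained C++. Since steering is not part of the spec,
everything here is an assumption made for illustration: the class
SteerableMetric, its update and get_attribute methods, and the choice
to report a refused update through a boolean return value.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical metric which keeps its description as string attributes.
class SteerableMetric {
  public:
    SteerableMetric(const std::string& mode, const std::string& value) {
        attrs_["Mode"]  = mode;
        attrs_["Value"] = value;
    }

    // Changes the value only if the metric is writable.
    // Returns false for "Read" and "Final" metrics.
    bool update(const std::string& value) {
        const std::string mode = get_attribute("Mode");
        if (mode != "Write" && mode != "ReadWrite") return false;
        attrs_["Value"] = value;
        return true;
    }

    // Returns the attribute value, or an empty string if unknown.
    std::string get_attribute(const std::string& key) const {
        std::map<std::string, std::string>::const_iterator it =
            attrs_.find(key);
        return it == attrs_.end() ? std::string() : it->second;
    }

  private:
    std::map<std::string, std::string> attrs_;
};
```

A metric created with mode "ReadWrite" accepts the new value, while
one created with mode "Read" keeps its old value and reports the
refusal to the caller.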
+-------------------------------------------------------------+
Examples:
=========
monitoring example: monitor a write task
----------------------------------------
// this example shows how monitoring a task can be
// implemented
class write_metric_cb : public saga::metric::callback
{
  private:
    saga::task t;

  public:
    // keep a handle on the task, so the callback can query its status
    write_metric_cb (const saga::task & _t) : t (_t) { }

    void callback (saga::metric & m)
    {
      int len = atoi ( m.get_attribute ("value") );
      std::cout << "bytes written: " << len << std::endl;
      std::cout << "task status: " << t.get_status () << std::endl;
    }
};

int main (int argc, char** argv)
{
  ssize_t     len = 0;
  std::string str ("Hello SAGA\n");
  std::string url (argv[1]);

  saga::file   f (url);
  saga::task   t = f.task.write (str, &len);
  saga::metric m = t.get_metric ("Progress");
  // assume that this m is a discrete metric indicating
  // the number of bytes already written. In general,
  // the list of metric names has to be searched for an
  // interesting metric.

  // add the callback; it needs the task to report its status
  write_metric_cb cb (t);
  m.add_callback (cb);

  // wait until the task is done, and give cb a chance to get
  // called a couple of times
  t.wait ();
}
steering example: steer a remote job
------------------------------------
// this example is concerned with steering (metric is writable).
// However, Steering is currently NOT part of the SAGA spec.
class application_observer_cb : public saga::metric::callback
{
  private:
    saga::task t;

  public:
    void callback (saga::metric & m)
    {
      int val = atoi ( m.get_attribute ("value") );
      std::cout << val << " is the new value." << std::endl;
    }
};

int main (int argc, char** argv)
{
  saga::job_service js;
  saga::job j = js.run ("remote.host.net",
                        "my_remote_application");
  saga::metric m = j.get_metric ("param_1");
  // assume that m is a discrete metric representing
  // an integer parameter of the remote application.

  application_observer_cb cb;
  m.add_callback (cb);

  for ( int i = 0; i < 10; i++ )
  {
    // metric values are strings, so convert the integer first
    m.update (std::to_string (i));
    // callback should get called NOW + latency
    // if param_1 is read only, update would return an error.
    sleep (1);
  }
}
steering example: BE a steerable job
------------------------------------
// this example assumes that a job representing THIS application
// (self) allows adding a metric. However, that is currently
// NOT part of the SAGA spec.
class application_observable_cb : public saga::metric::callback
{
  private:
    int & i;

  public:
    // a reference member must be bound in the initializer list
    application_observable_cb (int & _i) : i (_i) { }

    void callback (saga::metric & m)
    {
      int val = atoi ( m.get_attribute ("value") );
      i = val;
      std::cout << "new value: " << i << std::endl;
    }
};

int main (int argc, char** argv)
{
  int i = 0; // application parameter to be steered

  saga::job j = theSession.get_self ();
  saga::metric m ("TestMetric",            // name
                  "This is a test metric", // description
                  "Discrete",              // frequency
                  "ReadWrite",             // mode
                  "",                      // no unit
                  "Int",                   // type
                  "0");                    // initial value

  // update i whenever the metric gets written
  application_observable_cb cb (i);
  m.add_callback (cb);
  j.add_metric (m);

  // remote jobs can steer us for 100 seconds
  sleep (100);
}
+---------------------------------------------------------------+
Notes:
======
- possible deviation: allow only one CB per metric:
no add/remove, but set/reset CB
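
For comparison, the single-callback variant could look roughly like
the following self-contained C++ sketch. All names (SingleCbMetric,
SingleCallBack, set_callback, reset_callback, set_value, Counter) are
invented for this illustration; the point is only that no cookie is
needed if setting a callback replaces the previous one.

```cpp
#include <cassert>
#include <string>

class SingleCbMetric;

// Stand-in callback base class for the single-callback variant.
class SingleCallBack {
  public:
    virtual ~SingleCallBack() {}
    virtual void callback(SingleCbMetric& metric) = 0;
};

// Hypothetical metric which holds at most one callback.
class SingleCbMetric {
  public:
    SingleCbMetric() : cb_(0) {}

    // Replaces any previously set callback.
    void set_callback(SingleCallBack& cb) { cb_ = &cb; }

    // Removes the callback, if there is one.
    void reset_callback() { cb_ = 0; }

    // Would be called by the implementation when the observable changes.
    void set_value(const std::string& v) {
        value_ = v;
        if (cb_) cb_->callback(*this);
    }

    const std::string& value() const { return value_; }

  private:
    SingleCallBack* cb_;
    std::string value_;
};

// Counts how often it gets invoked.
class Counter : public SingleCallBack {
  public:
    int calls;
    Counter() : calls(0) {}
    void callback(SingleCbMetric&) { ++calls; }
};

// Returns true if each of two successively set callbacks was
// invoked exactly once.
bool replacement_works() {
    SingleCbMetric m;
    Counter first, second;
    m.set_callback(first);
    m.set_value("1");        // notifies first
    m.set_callback(second);  // first is silently replaced
    m.set_value("2");        // notifies second only
    m.reset_callback();
    m.set_value("3");        // notifies nobody
    return first.calls == 1 && second.calls == 1;
}
```

The price for the simpler interface is that two independent
observers of the same metric have to be multiplexed by the
application itself.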
+-------------------------------------------------------------+
#endif // SHORT