[saga-rg] monitoring, again...

Andre Merzky andre at merzky.net
Sun Nov 6 19:48:50 CST 2005


Hi, 

as has been discussed recently, we want to have a simple way
to add callbacks to SAGA.  The main use case for this would
be for notification on status changes for tasks and jobs.

Attached is the resurrected monitoring package as a
proposal to achieve this.  Please note that the package
mentions steering in various places (mostly commented out).
Steering is not yet part of the proposal, but should be
seen as a possible path for future extension.

For simplicity, a code example is included.  Further feedback
on the topic is welcome.  I am Cc'ing the GridRPC folks, as
they are probably interested in this.

Note that the example does not monitor the task status, but
another metric provided by the implementation (bytes written).

Cheers, Andre.




+-------------------------------------------------------------+
Examples:

  monitoring example: monitor a write task 
  ----------------------------------------
  short version:
  --------------

    class write_metric_cb : public saga::metric::callback
    {
     public:
       void callback (saga::metric & m)
       {
         std::cout << atoi ( m.get_attribute ("value") ) << " ";
       }
    };
    
    int main (int argc, char** argv)
    {
      ssize_t     len = 0;
      std::string str ("Hello SAGA\n");
      std::string url (argv[1]);
    
      saga::file   f (url);
      saga::task   t = f.task.write (str, &len);
      saga::metric m = t.get_metric ("Written");

      write_metric_cb cb;
      m.add_callback (cb);

      t.wait ();
    }

  monitoring example: monitor a write task
  ----------------------------------------
  long annotated version:
  -----------------------

    // this example shows how monitoring a task can be
    // implemented
    class write_metric_cb : public saga::metric::callback
    {
     private:
       saga::task t;

     public:
       write_metric_cb (const saga::task & _t) { t = _t; }

       void callback   (saga::metric & m)
       {
         int len = atoi ( m.get_attribute ("value") );
         std::cout << "bytes written: " << len             << std::endl;
         std::cout << "task status:   " << t.get_status () << std::endl;
       }
    };
    
    int main (int argc, char** argv)
    {
      ssize_t     len = 0;
      std::string str ("Hello SAGA\n");
      std::string url (argv[1]);
    
      saga::file   f (url);
      saga::task   t = f.task.write (str, &len);
      saga::metric m = t.get_metric ("Written");

      // assume that this m is a discrete metric indicating
      // the number of bytes already written.  In general,
      // the list of metric names has to be searched for an
      // interesting metric if the name is not known.
    
      // add the callback
      write_metric_cb cb (t);
      m.add_callback (cb);

      // wait until task is done, and give cb chance to get
      // called a couple of times
      t.wait ();
    }
+-------------------------------------------------------------+

-- 
+-----------------------------------------------------------------+
| Andre Merzky                      | phon: +31 - 20 - 598 - 7759 |
| Vrije Universiteit Amsterdam (VU) | fax : +31 - 20 - 598 - 7653 |
| Dept. of Computer Science         | mail: merzky at cs.vu.nl       |
| De Boelelaan 1083a                | www:  http://www.merzky.net |
| 1081 HV Amsterdam, Netherlands    |                             |
+-----------------------------------------------------------------+
-------------- next part --------------


+-------------------------------------------------------------+

 #     #                                                       
 ##   ##  ###   #    #  # #####  ###   ####   #  #    #   ###
 # # # # #   #  ##   #  #   #   #   #  #   #  #  ##   #  #   #
 #  #  # #   #  # #  #  #   #   #   #  #   #  #  # #  #  #
 #     # #   #  #  # #  #   #   #   #  ####   #  #  # #  # ###
 #     # #   #  #   ##  #   #   #   #  #  #   #  #   ##  #   #
 #     #  ###   #    #  #   #    ###   #   #  #  #    #   ###

                             ##
                            #  #
                             ##
                            ###
                           #   # #
                           #    #
                            ###  #
   #####
  #     #  #####  ######  ######  #####   #  #    #   ####
  #          #    #       #       #    #  #  ##   #  #    #
   #####     #    #####   #####   #    #  #  # #  #  #
        #    #    #       #       #####   #  #  # #  #  ###
  #     #    #    #       #       #   #   #  #   ##  #    #
   #####     #    ######  ######  #    #  #  #    #   ####

+-------------------------------------------------------------+


Summary:
========

   The ability to query Grid entities about state is requested
   in several SAGA use cases.  Also, the SAGA Task model
   incorporates a certain amount of task monitoring.

   This package definition approaches the problem space of
   monitoring to unify the various usage patterns (see
   details), and to transparently incorporate SAGA task
   monitoring.

   A closely related topic is Steering, which is not seen as
   independent from Monitoring: in the SAGA approach, a future
   Steering extension may add to Monitoring the ability to push
   values back to the Monitoring source.

+-------------------------------------------------------------+


Specification:
==============

  package SAGA version 0.1 {
 
    package Monitoring {

      class Metric implements-all Attribute {
        
        // needs to be derived from to implement callbacks
        class CallBack {
          void callback (in Metric metric);
        }

        // add a callback, which is invoked whenever
        // the metric changes; the returned cookie
        // identifies the callback for later removal
        addCallBack      (in  CallBack         cb, 
                          out int              cookie);
        removeCallBack   (in  int              cookie);

        // steering: update the metric value
        // update        (in  string           value);
      }
        
      interface Monitorable { 
        // introspection
        getNames         (out array<String,1>  names);

        // get hook for monitoring/steering
        getMetric        (in  String           name,
                          out Metric           metric); 

        // set hook for steering
        addMetric        (in  Metric           metric); 
      }
    }
  } 
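
The cookie mechanism of addCallBack/removeCallBack can be sketched in
C++ roughly as follows.  This is only a minimal in-memory sketch, not
SAGA API: all names (Metric, set_attribute, add_callback,
remove_callback, CountingCallBack, count_notifications) are
hypothetical, and it assumes that callbacks fire whenever the "Value"
attribute is set.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for the proposed Metric class; not SAGA API.
class Metric {
 public:
  // Callbacks derive from this class, as in the specification.
  class CallBack {
   public:
    virtual ~CallBack() {}
    virtual void callback(Metric& metric) = 0;
  };

  // Attributes are plain string keys and string values.
  void set_attribute(const std::string& key, const std::string& value) {
    attributes_[key] = value;
    if (key == "Value") notify();  // assumption: only "Value" triggers callbacks
  }

  std::string get_attribute(const std::string& key) const {
    std::map<std::string, std::string>::const_iterator it = attributes_.find(key);
    return it == attributes_.end() ? std::string() : it->second;
  }

  // Registers cb and returns the cookie identifying this registration.
  int add_callback(CallBack& cb) {
    int cookie = next_cookie_++;
    callbacks_[cookie] = &cb;
    return cookie;
  }

  // Removes the registration identified by cookie (no-op if unknown).
  void remove_callback(int cookie) { callbacks_.erase(cookie); }

 private:
  void notify() {
    // Iterate over a copy, so that callbacks may add or remove
    // registrations while being notified.
    std::map<int, CallBack*> snapshot = callbacks_;
    for (auto& entry : snapshot) entry.second->callback(*this);
  }

  std::map<std::string, std::string> attributes_;
  std::map<int, CallBack*> callbacks_;
  int next_cookie_ = 0;
};

// Example callback that counts how often it was invoked.
class CountingCallBack : public Metric::CallBack {
 public:
  int calls = 0;
  void callback(Metric&) override { ++calls; }
};

// Two updates arrive while registered, one after removal.
int count_notifications() {
  Metric m;
  CountingCallBack cb;
  int cookie = m.add_callback(cb);
  m.set_attribute("Value", "10");
  m.set_attribute("Value", "20");
  m.remove_callback(cookie);
  m.set_attribute("Value", "30");  // cb is no longer notified
  return cb.calls;
}
```

Returning a cookie instead of comparing callback objects keeps removal
unambiguous even if the same callback instance is registered twice.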

+-------------------------------------------------------------+

#ifndef SHORT

Details:
========

    class Metric

      The fundamental object introduced in this package is a
      Metric.  A metric represents a discrete or continuous
      observable, which can be readable, writable, or both.
      A readable observable corresponds to classical monitoring,
      a writable observable corresponds to steering.

      The approach is severely limited by the use of SAGA
      attributes for the description of a Metric, as these are
      only defined in terms of string keys and values.  Extending
      the attribute definition to typed and complex values would
      greatly improve the usability of this package, but would
      also challenge its semantic simplicity.

      The metric MUST provide access to the following attributes:

      "Name" :       short human readable name
                       "file.transfer.progress"
      "Description": extensive human readable description
                       "This metric gives the status of an
                        ongoing file transfer as percent
                        completed."
      "Freq":        "Discrete" or "Continuous"
                       "Continuous"
      "Mode":        "Read", "Write", "ReadWrite" or "Final"
                       "Read"
      "Unit":        Unit of values
                       "percent (%)"
      "Type":        "String", "Int", "Float" etc.
                       "Float"
      "Value":       value of the metric
                       "20.5"

      The name of the metric must be unique, as the get_metric
      call needs to be able to identify the metric to return.
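
Since all attributes are strings, a consumer has to interpret "Value"
according to "Type" itself.  A minimal sketch of that, assuming the
attributes are held in a plain string map (MetricAttributes,
make_progress_metric, has_required_attributes and value_as_float are
hypothetical names, not part of the package):

```cpp
#include <cstdlib>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical representation: string keys and string values only.
typedef std::map<std::string, std::string> MetricAttributes;

// Builds the example metric from the attribute list.
MetricAttributes make_progress_metric() {
  MetricAttributes a;
  a["Name"] = "file.transfer.progress";
  a["Description"] =
      "This metric gives the status of an ongoing file transfer "
      "as percent completed.";
  a["Freq"] = "Continuous";
  a["Mode"] = "Read";
  a["Unit"] = "percent (%)";
  a["Type"] = "Float";
  a["Value"] = "20.5";
  return a;
}

// Checks that all seven mandatory attributes are present.
bool has_required_attributes(const MetricAttributes& a) {
  static const char* const required[] = {"Name", "Description", "Freq",
                                         "Mode", "Unit", "Type", "Value"};
  for (const char* key : required) {
    if (a.find(key) == a.end()) return false;
  }
  return true;
}

// Converts "Value" to a double, but only for metrics of type "Float".
// Throws std::out_of_range if "Type" or "Value" is missing.
double value_as_float(const MetricAttributes& a) {
  if (a.at("Type") != "Float") {
    throw std::runtime_error("metric is not of type Float");
  }
  return std::strtod(a.at("Value").c_str(), nullptr);
}
```

Typed attribute values would make a helper like value_as_float
unnecessary, which is the usability gain mentioned above.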

      A writable metric can be updated, and hence provides
      remote steering capabilities.  However, that is currently
      not part of the spec, but is noted here as a path for future
      extension of the interface.
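
How an update could respect the "Mode" attribute might look roughly
like this.  It is a sketch only: update_metric and the two demo
functions are hypothetical names, and treating every mode other than
"Write" and "ReadWrite" (including "Final") as not writable is an
assumption of this sketch, not something the package defines.

```cpp
#include <map>
#include <string>

// Hypothetical representation: string keys and string values only.
typedef std::map<std::string, std::string> MetricAttributes;

// Returns true if the value was updated, false if "Mode" is missing
// or does not allow writing.
bool update_metric(MetricAttributes& metric, const std::string& value) {
  MetricAttributes::const_iterator mode = metric.find("Mode");
  if (mode == metric.end()) return false;
  if (mode->second != "Write" && mode->second != "ReadWrite") return false;
  metric["Value"] = value;
  return true;
}

// A "ReadWrite" metric accepts the update and stores the new value.
bool writable_accepts_update() {
  MetricAttributes m;
  m["Mode"] = "ReadWrite";
  m["Value"] = "0";
  return update_metric(m, "7") && m["Value"] == "7";
}

// A "Read" metric rejects the update and keeps its old value.
bool read_only_rejects_update() {
  MetricAttributes m;
  m["Mode"] = "Read";
  m["Value"] = "0";
  return !update_metric(m, "7") && m["Value"] == "0";
}
```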


+-------------------------------------------------------------+


Examples:
=========

  monitoring example: monitor a write task
  ----------------------------------------

    // this example shows how monitoring a task can be
    // implemented
    class write_metric_cb : public saga::metric::callback
    {
     private:
       saga::task t;

     public:
       write_metric_cb (const saga::task & _t) { t = _t; }

       void callback   (saga::metric & m)
       {
         int len = atoi ( m.get_attribute ("value") );
         std::cout << "bytes written: " << len             << std::endl;
         std::cout << "task status:   " << t.get_status () << std::endl;
       }
    };
    
    int main (int argc, char** argv)
    {
      ssize_t     len = 0;
      std::string str ("Hello SAGA\n");
      std::string url (argv[1]);
    
      saga::file   f (url);
      saga::task   t = f.task.write (str, &len);
      saga::metric m = t.get_metric ("Progress");

      // assume that this m is a discrete metric indicating
      // the number of bytes already written.  In general,
      // the list of metric names has to be searched for an
      // interesting metric.
    
      // add the callback
      write_metric_cb cb (t);
      m.add_callback (cb);

      // wait until task is done, and give cb chance to get
      // called a couple of times
      t.wait ();
    }
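
The comment in the example above notes that, in general, the list of
metric names has to be searched.  A minimal sketch of such a search,
assuming the names returned by getNames are available as a vector of
strings (find_metric_name, find_metric_demo and the metric names used
here are hypothetical):

```cpp
#include <string>
#include <vector>

// Returns the first metric name containing keyword, or an empty
// string if no name matches.
std::string find_metric_name(const std::vector<std::string>& names,
                             const std::string& keyword) {
  for (const std::string& name : names) {
    if (name.find(keyword) != std::string::npos) return name;
  }
  return std::string();
}

// Searches a made-up name list for a metric about written bytes.
std::string find_metric_demo() {
  std::vector<std::string> names;
  names.push_back("task.state");
  names.push_back("file.bytes.written");
  return find_metric_name(names, "written");
}
```

The name found this way would then be passed to get_metric.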
    

  steering example: steer a remote job
  ------------------------------------

    // this example is concerned with steering (the metric is writable).
    // However, Steering is currently NOT part of the SAGA spec.
    class application_observer_cb : public saga::metric::callback
    {
     private:
       saga::task t;

     public:
       void callback   (saga::metric & m)
       {
          int val = atoi ( m.get_attribute ("value") );
          std::cout << val << " is the new value." << std::endl;
       }
    };
    
    
    int main (int argc, char** argv)
    {
      saga::job_service js;

      saga::job j = js.run ("remote.host.net",
                            "my_remote_application");

      saga::metric m = j.get_metric ("param_1");
    
      // assume that m is a discrete metric representing
      // an integer parameter for the remote application.
      application_observer_cb cb;
      m.add_callback (cb);

      for ( int i = 0; i < 10; i++ )
      {
        m.update (std::to_string (i));
        // callback should get called NOW + latency
        // if param_1 is read only, update would return an error.
        sleep (1);
      }
    }
    


  steering example: BE a steerable job
  ------------------------------------

    // this example assumes that a job representing THIS application
    // (self) allows adding a metric.  However, that is currently
    // NOT part of the SAGA spec.
    class application_observable_cb : public saga::metric::callback
    {
     private:
       int & i;

     public:
       // a reference member must be bound in the initializer list
       application_observable_cb (int & _i) : i (_i) { }
       void callback   (saga::metric & m)
       {
         int val = atoi ( m.get_attribute ("value") );
         i = val;
         std::cout << "new value: " << i << std::endl;
       }
    };
    
    int main (int argc, char** argv)
    {
      saga::job    j = theSession.get_self ();

      saga::metric m ("TestMetric", // name
                      "This is a test metric",
                                    // description
                      "Discrete",   // frequency
                      "ReadWrite",  // mode
                      "",           // no unit
                      "Int",        // type
                      "0");         // initial value

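      j.add_metric (m);

      // register the callback, so that steering updates of
      // the metric are applied to the local parameter
      int param = 0;
      application_observable_cb cb (param);
      m.add_callback (cb);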

      // remote jobs can steer us for 100 seconds
      sleep (100);
    }
    

+---------------------------------------------------------------+

Notes:
======

  - possible deviation: allow only one CB per metric:
    no add/remove, but set/reset CB
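
    A minimal sketch of that set/reset variant, for comparison with
    the add/remove design.  All names (SingleCallbackMetric,
    set_callback, reset_callback, set_value, Recorder,
    single_callback_demo) are hypothetical and not part of the
    package.

```cpp
#include <string>

// Hypothetical metric that holds at most one callback at a time.
class SingleCallbackMetric {
 public:
  class CallBack {
   public:
    virtual ~CallBack() {}
    virtual void callback(SingleCallbackMetric& metric) = 0;
  };

  // Installs cb, silently replacing any previously set callback.
  void set_callback(CallBack& cb) { callback_ = &cb; }

  // Removes the current callback, if any.
  void reset_callback() { callback_ = nullptr; }

  // Stores the new value and notifies the callback, if one is set.
  void set_value(const std::string& value) {
    value_ = value;
    if (callback_ != nullptr) callback_->callback(*this);
  }

  const std::string& value() const { return value_; }

 private:
  CallBack* callback_ = nullptr;
  std::string value_;
};

// Example callback that records every value it is notified about.
class Recorder : public SingleCallbackMetric::CallBack {
 public:
  std::string seen;
  void callback(SingleCallbackMetric& metric) override {
    seen += metric.value() + ";";
  }
};

// Shows that a second set_callback replaces the first callback.
std::string single_callback_demo() {
  SingleCallbackMetric m;
  Recorder first, second;
  m.set_callback(first);
  m.set_value("1");        // only first is notified
  m.set_callback(second);  // replaces first
  m.set_value("2");        // only second is notified
  m.reset_callback();
  m.set_value("3");        // nobody is notified
  return first.seen + "|" + second.seen;
}
```

    No cookie is needed in this variant, but independent observers of
    the same metric would have to share a single callback.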

+-------------------------------------------------------------+

#endif // SHORT


