Subject: Re: [Fwd: [saga-rg] monitoring, again...]

Shantenu Jha s.jha at ucl.ac.uk
Mon Nov 7 12:00:31 CST 2005


From: <rsirvent at ac.upc.edu>
To: saga-rg at ggf.org
Cc: "Rosa M. Badia" <rosab at ac.upc.edu>
Subject: Re: [Fwd: [saga-rg] monitoring, again...]

References: <436F1D90.9020305 at ac.upc.edu>

Hi. I would like to give our opinion on this monitoring API, from the
point of view of GRID superscalar as a SAGA use case.

The way it should be used is almost clear to me (it is very similar to 
the one defined in GAT). What I find missing is a way of waiting for 
callbacks to arrive without having to wait for the task to finish. Of 
course, I am thinking of the Globus style here (calls like globus_poll
and globus_poll_blocking). I remember something similar in GAT, named
ServiceActions, I think. Its purpose was to hand the CPU to the GAT
environment so that it could do this kind of thing (executing the
functions that handle the callbacks, for instance). Is there something
similar in SAGA?
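
To make the question concrete, here is a minimal sketch of the kind of
call I mean. All names (event_queue, post, poll) are made up for
illustration and are not part of SAGA, GAT, or Globus: the middleware
queues notifications, and the application hands over the CPU with a
non-blocking poll() that runs whatever handlers are pending.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>

// Hypothetical sketch, not SAGA/GAT/Globus API: pending notifications are
// queued and only run when the application explicitly calls poll().
class event_queue
{
 public:
  // called by the (imaginary) middleware when something changed
  void post (std::function <void ()> handler)
  {
    pending_.push (std::move (handler));
  }

  // non-blocking: run all handlers queued so far, return how many ran
  std::size_t poll ()
  {
    std::size_t n = 0;
    while ( ! pending_.empty () )
    {
      std::function <void ()> h = std::move (pending_.front ());
      pending_.pop ();
      h ();
      ++n;
    }
    return n;
  }

 private:
  std::queue <std::function <void ()>> pending_;
};

// usage: two notifications arrive, the application polls once between
// other work, then polls again and finds nothing pending
int poll_demo ()
{
  event_queue q;
  int bytes_written = 0;

  q.post ([&bytes_written] { bytes_written = 5;  });
  q.post ([&bytes_written] { bytes_written = 11; });

  std::size_t ran = q.poll ();   // both handlers run here
  assert (ran == 2);
  assert (q.poll () == 0);       // nothing left to do

  return bytes_written;
}
```

A blocking variant would simply wait until at least one handler is
pending before running the same loop.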

Cheers. Raul.

Rosa M. Badia wrote:

> Raul,
>
> I think this has to do with the callbacks. Can you take a look at it?
>
> Rosa
>
>
> ------------------------------------------------------------------------
>
> Subject:
> [saga-rg] monitoring, again...
> From:
> Andre Merzky <andre at merzky.net>
> Date:
> Sun, 6 Nov 2005 19:48:50 -0600
> To:
> Simple API for Grid Applications WG <saga-rg at ggf.org>
>
>
>
>Hi, 
>
>as has been discussed recently, we want to have a simple way
>to add callbacks to SAGA.  The main use case for this would
>be for notification on status changes for tasks and jobs.
>
>Attached is the resurrected monitoring package as a
>proposal to achieve this.  Please note that the package
>mentions steering in various places (mostly commented out).
>Steering is not yet part of the proposal, but should be
>seen as a possible path for future extension.
>
>For simplicity, a code example is included.  More feedback
>on the topic is welcome.  I am Cc'ing the GridRPC folks, as
>they are probably interested in this.
>
>Note that the example does not monitor the task status, but
>another metric provided by the implementation (bytes written).
>
>Cheers, Andre.
>
>
>
>
>+-------------------------------------------------------------+
>Examples:
>
>  monitoring example: monitor a write task 
>  ----------------------------------------
>  short version:
>  --------------
>
>    class write_metric_cb : public saga::metric::callback
>    {
>     public:
>       void callback (saga::metric & m)
>       {
>         std::cout << atoi ( m.get_attribute ("value") ) << " ";
>       }
>    };
>    
>    int main (int argc, char** argv)
>    {
>      ssize_t     len = 0;
>      std::string str ("Hello SAGA\n");
>      std::string url (argv[1]);
>    
>      saga::file   f (url);
>      saga::task   t = f.task.write (str, &len);
>      saga::metric m = t.get_metric ("Written");
>
>      write_metric_cb cb;
>      m.add_callback (cb);
>
>      t.wait ();
>    }
>
>  monitoring example: monitor a write task
>  ----------------------------------------
>  long annotated version:
>  -----------------------
>
>    // this example shows how monitoring a task can be
>    // implemented
>    class write_metric_cb : public saga::metric::callback
>    {
>     private:
>       saga::task t;
>
>     public:
>       write_metric_cb (const saga::task & _t) { t = _t; }
>
>       void callback   (saga::metric & m)
>       {
>         int len = atoi ( m.get_attribute ("value") );
>         std::cout << "bytes written: " << len             << std::endl;
>         std::cout << "task status:   " << t.get_status () << std::endl;
>       }
>    };
>    
>    int main (int argc, char** argv)
>    {
>      ssize_t     len = 0;
>      std::string str ("Hello SAGA\n");
>      std::string url (argv[1]);
>    
>      saga::file   f (url);
>      saga::task   t = f.task.write (str, &len);
>      saga::metric m = t.get_metric ("Written");
>
>      // assume that this m is a discrete metric indicating 
>      // the number of bytes already written.  In general, 
>      // the list of metric names has to be searched for an 
>      // interesting metric if the name is not known.
>    
>      // add the callback (the constructor needs the task)
>      write_metric_cb cb (t);
>      m.add_callback (cb);
>
>      // wait until task is done, and give cb chance to get
>      // called a couple of times
>      t.wait ();
>    }
>+-------------------------------------------------------------+
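
The comment in the annotated example says that the list of metric
names has to be searched when the name is not known in advance. A
sketch of such a search follows. It uses a stand-in fake_task (purely
hypothetical; the real class would come from a SAGA implementation)
whose get_names() mirrors getNames in the specification, and a made-up
helper find_metric_name.

```cpp
#include <string>
#include <vector>

// Stand-in for a monitorable SAGA object; hypothetical, for illustration.
struct fake_task
{
  std::vector <std::string> get_names () const
  {
    return { "Status", "Written", "Elapsed" };
  }
};

// return the first metric name containing 'hint', or "" if none matches
std::string find_metric_name (const fake_task & t, const std::string & hint)
{
  for ( const std::string & name : t.get_names () )
  {
    if ( name.find (hint) != std::string::npos )
      return name;
  }
  return "";
}
```

The returned name would then be passed to get_metric as in the example.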
>
>  
>
>------------------------------------------------------------------------
>
>
>
>+-------------------------------------------------------------+
>
> #     #                                                       
> ##   ##  ###   #    #  # #####  ###   ####   #  #    #   ###
> # # # # #   #  ##   #  #   #   #   #  #   #  #  ##   #  #   #
> #  #  # #   #  # #  #  #   #   #   #  #   #  #  # #  #  #
> #     # #   #  #  # #  #   #   #   #  ####   #  #  # #  # ###
> #     # #   #  #   ##  #   #   #   #  #  #   #  #   ##  #   #
> #     #  ###   #    #  #   #    ###   #   #  #  #    #   ###
>
>                             ##
>                            #  #
>                             ##
>                            ###
>                           #   # #
>                           #    #
>                            ###  #
>   #####
>  #     #  #####  ######  ######  #####   #  #    #   ####
>  #          #    #       #       #    #  #  ##   #  #    #
>   #####     #    #####   #####   #    #  #  # #  #  #
>        #    #    #       #       #####   #  #  # #  #  ###
>  #     #    #    #       #       #   #   #  #   ##  #    #
>   #####     #    ######  ######  #    #  #  #    #   ####
>
>+-------------------------------------------------------------+
>
>
>Summary:
>========
>
>   The ability to query Grid entities about state is requested
>   in several SAGA use cases.  Also, the SAGA Task model
>   incorporates a certain amount of task monitoring.
>
>   This package definition approaches the problem space of
>   monitoring to unify the various usage patterns (see
>   details), and to transparently incorporate SAGA task
>   monitoring.
>
>   A closely related topic is Steering, which cannot really be
>   treated independently of Monitoring: in the SAGA approach, a
>   future Steering extension may add to Monitoring the ability to
>   push values back to the monitoring source.
>
>+-------------------------------------------------------------+
>
>
>Specification:
>==============
>
>  package SAGA version 0.1 {
> 
>    package Monitoring {
>
>      class Metric implements-all Attribute {
>        
>        // callbacks need to be derived from this class
>        class CallBack {
>          void callback (in Metric metric);
>        }
>
>        // add a callback, which is invoked whenever 
>        // the metric changes; the returned cookie
>        // identifies the callback for later removal
>        addCallBack      (in  CallBack         cb, 
>                          out int              cookie);
>        removeCallBack   (in  int              cookie);
>
>        // steering: update the metric value
>        // update        (in  string           value);
>      }
>        
>      interface Monitorable { 
>        // introspection
>        getNames         (out array<String,1>  names);
>
>        // get hook for monitoring/steering
>        getMetric        (in  String           name,
>                          out Metric           metric); 
>
>        // set hook for steering
>        addMetric        (in  Metric           metric); 
>      }
>    }
>  } 
>
>+-------------------------------------------------------------+
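
One possible C++ reading of the Metric callback interface above, as a
sketch only: addCallBack hands back an integer cookie, and
removeCallBack takes that cookie. The names metric_sketch and
set_value, and the use of std::function instead of a CallBack base
class, are assumptions made for brevity and are not part of the
proposal.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>

// Sketch only: a metric holding one string value and a set of callbacks
// identified by integer cookies.
class metric_sketch
{
 public:
  typedef std::function <void (const metric_sketch &)> callback;

  // register a callback, return the cookie needed to remove it again
  int add_callback (callback cb)
  {
    int cookie = next_cookie_++;
    callbacks_[cookie] = std::move (cb);
    return cookie;
  }

  // returns false if the cookie is unknown
  bool remove_callback (int cookie)
  {
    return callbacks_.erase (cookie) == 1;
  }

  // stands in for the implementation noticing a change of the observable
  void set_value (const std::string & v)
  {
    value_ = v;
    // iterate over a copy so a callback may remove itself safely
    std::map <int, callback> snapshot = callbacks_;
    for ( const auto & entry : snapshot )
      entry.second (*this);
  }

  const std::string & value () const { return value_; }

 private:
  int                      next_cookie_ = 0;
  std::map <int, callback> callbacks_;
  std::string              value_;
};
```

Using a cookie rather than the callback object itself means the same
callback can be registered twice and removed independently.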
>
>#ifndef SHORT
>
>Details:
>========
>
>    class Metric
>
>      The fundamental object introduced in this package is a
>      Metric.  A metric represents a discrete or continuous
>      observable, which can be readable, writable, or both.
>      A readable observable corresponds to classical monitoring,
>      a writable observable corresponds to steering.
>
>      The approach is severely limited by the use of saga
>      attributes for the description of a Metric, as these are
>      only defined in terms of string keys and values.  An
>      extension of the attribute definition by typed and complex
>      values will greatly improve the usability of this package,
>      but will also challenge its semantic simplicity.
>
>      The metric MUST provide access to the following attributes:
>
>      "Name" :       short human readable name
>                       "file.transfer.progress"
>      "Description": extensive human readable description
>                       "This metric gives the status of an
>                        ongoing file transfer as percent
>                        completed."
>      "Freq":        Discrete or Continuous
>                       "CONTINUOUS"
>      "Mode":        "Read", "Write", "ReadWrite" or "Final"
>                       "Read"
>      "Unit":        Unit of values
>                       "percent (%)"
>      "Type":        "String", "Int", "Float" etc.
>                       "Float"
>      "Value":       value of the metric
>                       "20.5"
>
>      The name of the metric must be unique, as the get_metric
>      call needs to be able to identify the metric to return.
>
>      A writable metric can be updated, and hence provides
>      remote steering capabilities.  However, that is currently
>      not part of the spec; it is noted here as a path for future
>      extension of the interface.
>
>
>+-------------------------------------------------------------+
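
Because all attributes are strings, a consumer has to look at "Type"
before it can interpret "Value". A sketch of that step follows, using
a plain std::map as a stand-in for the attribute interface. The helper
names value_as_double and example_metric are made up for illustration.

```cpp
#include <map>
#include <stdexcept>
#include <string>

typedef std::map <std::string, std::string> attributes;

// interpret "Value" as a number, guided by the "Type" attribute;
// throws if an attribute is missing, the metric is not numeric,
// or the value does not parse
double value_as_double (const attributes & a)
{
  const std::string & type = a.at ("Type");

  if ( type == "Int" )
    return static_cast <double> (std::stol (a.at ("Value")));
  if ( type == "Float" )
    return std::stod (a.at ("Value"));

  throw std::invalid_argument ("metric type is not numeric: " + type);
}

// the example metric from the attribute list above
attributes example_metric ()
{
  return {
    { "Name",        "file.transfer.progress" },
    { "Description", "This metric gives the status of an ongoing "
                     "file transfer as percent completed." },
    { "Freq",        "CONTINUOUS"  },
    { "Mode",        "Read"        },
    { "Unit",        "percent (%)" },
    { "Type",        "Float"       },
    { "Value",       "20.5"        }
  };
}
```

Typed attribute values, as mentioned above, would make this parsing
step unnecessary.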
>
>
>Examples:
>=========
>
>  monitoring example: monitor a write task
>  ----------------------------------------
>
>    // this example shows how monitoring a task can be
>    // implemented
>    class write_metric_cb : public saga::metric::callback
>    {
>     private:
>       saga::task t;
>
>     public:
>       write_metric_cb (const saga::task & _t) { t = _t; }
>
>       void callback   (saga::metric & m)
>       {
>         int len = atoi ( m.get_attribute ("value") );
>         std::cout << "bytes written: " << len             << std::endl;
>         std::cout << "task status:   " << t.get_status () << std::endl;
>       }
>    };
>    
>    int main (int argc, char** argv)
>    {
>      ssize_t     len = 0;
>      std::string str ("Hello SAGA\n");
>      std::string url (argv[1]);
>    
>      saga::file   f (url);
>      saga::task   t = f.task.write (str, &len);
>      saga::metric m = t.get_metric ("Written");
>
>      // assume that this m is a discrete metric indicating 
>      // the number of bytes already written.  In general, 
>      // the list of metric names has to be searched for an 
>      // interesting metric.
>    
>      // add the callback (class name as declared above)
>      write_metric_cb cb (t);
>      m.add_callback (cb);
>
>      // wait until task is done, and give cb chance to get
>      // called a couple of times
>      t.wait ();
>    }
>    
>
>  steering example: steer a remote job
>  ------------------------------------
>
>    // this example is concerned with steering (metric is writable).
>    // However, Steering is currently NOT part of the SAGA spec.
>    class application_observer_cb : public saga::metric::callback
>    {
>     private:
>       saga::task t;
>
>     public:
>       void callback   (saga::metric & m)
>       {
>          int val = atoi ( m.get_attribute ("value") );
>          std::cout << val << " is the new value." << std::endl;
>       }
>    };
>    
>    
>    int main (int argc, char** argv)
>    {
>      saga::job_service js;
>
>      saga::job j = js.run ("remote.host.net",
>                            "my_remote_application");
>
>      saga::metric m = j.get_metric ("param_1");
>    
>      // assume that m is a discrete metric representing
>      // an integer parameter for the remote application.
>      application_observer_cb cb;
>      m.add_callback (cb);
>
>      for ( int i = 0; i < 10; i++ )
>      {
>        m.update (std::to_string (i));
>        // callback should get called NOW + latency
>        // if param_1 is read only, update would return an error.
>        sleep (1);
>      }
>    }
>    
>
>
>  steering example: BE a steerable job
>  ------------------------------------
>
>    // this example assumes that a job representing THIS application
>    // (self) allows a metric to be added.  However, that is currently
>    // NOT part of the SAGA spec.
>    class application_observable_cb : public saga::metric::callback
>    {
>     private:
>       int & i;
>
>     public:
>       // a reference member must be bound in the initializer list
>       application_observable_cb (int & _i) : i (_i) { }
>       void callback   (saga::metric & m)
>       {
>         int val = atoi ( m.get_attribute ("value") );
>         i = val;
>         std::cout << "new value: " << i << std::endl;
>       }
>    };
>    
>    int main (int argc, char** argv)
>    {
>      saga::job    j = theSession.get_self ();
>
>      saga::metric m ("TestMetric", // name
>                      "This is a test metric",
>                                    // description
>                      "Discrete",   // frequency
>                      "ReadWrite",  // mode
>                      "",           // no unit
>                      "Int",        // type
>                      "0");         // initial value
>
>      j.add_metric (m);
>
>      // remote jobs can steer us for 100 seconds
>      sleep (100);
>    }
>    
>
>+---------------------------------------------------------------+
>
>Notes:
>======
>
>  - possible deviation: allow only one CB per metric:
>    no add/remove, but set/reset CB
>
>+-------------------------------------------------------------+
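
The deviation noted above, sketched with made-up names
(single_cb_metric, set_callback, reset_callback, notify): with at most
one callback per metric there is nothing to identify, so no cookie is
needed.

```cpp
#include <functional>
#include <string>
#include <utility>

// Sketch only: a metric that holds at most one callback.
class single_cb_metric
{
 public:
  typedef std::function <void (const std::string &)> callback;

  // installing a new callback silently replaces the previous one
  void set_callback   (callback cb) { cb_ = std::move (cb); }
  void reset_callback ()            { cb_ = nullptr;        }

  // stands in for the implementation reporting a new value
  void notify (const std::string & value) const
  {
    if ( cb_ )
      cb_ (value);
  }

 private:
  callback cb_;
};
```

The price of the simpler interface is that two independent observers
of the same metric would have to share one callback.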
>
>#endif // SHORT
>
>  
>






