[gridrpc-wg] questions about function handles
Laurent Philippe
philippe at lifc.univ-fcomte.fr
Tue Jun 7 11:17:19 CDT 2005
Here is a short response to Hidemoto's questions:
Hidemoto Nakada wrote:
>>There are two ways to initialize a function handle.
>>
>>- first, we can do it with grpc_function_handle_init. In this case, the
>> server is explicitly given by the client. As data location is known
>> (client, server or data repository), there is no problem binding the data
>> to the server. All data management can be done by the client:
>> placement, transfers or removal. The location will be given in the data
>> management functions.
>>
>>- second, the function handle could be initialized by
>> grpc_function_handle_default. In that case, the GridRPC API document
>> says: "This default could be a pre-determined server or it could be a
>> server that is dynamically chosen by the resource discovery mechanisms
>> of the underlying GridRPC implementation". Does that mean, the function
>> handle will contain a server reference in it after
>> grpc_function_handle_default call ? Or does that mean, the function
>> handle will reference a default discovery (or GridRPC server) while the
>> computational server will be chosen during grpc_call ?
>
>
> It is totally implementation dependent, I believe.
> In theory, you have a better chance of choosing a 'better' server
> if you delay the selection of the server until the actual invocation time,
> since at that time you can get more information on the invocation,
> such as the size of data to be transferred.
Yes, you get a better chance of choosing a good server. But that means
that after calling grpc_function_handle_default, you do not have a
reference to the server: it will be chosen later. You only have a
reference to the platform, and you cannot place any data before calling
grpc_call since the server has not been selected. Please see the
kmc/povray example further down.
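To make the distinction concrete, here is a minimal C sketch, separate
from the DIET example given later in this message. It is only an
illustration under stated assumptions: the sketch_* names, the
sketch_handle_t structure and the size-based discovery rule are all
hypothetical and are not part of the GridRPC API or of any
implementation. It models a handle whose server field stays empty until
the call is issued, which is why the client cannot place data earlier.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a function handle; NOT the GridRPC type. */
typedef struct {
    const char *func_name;
    const char *server;   /* NULL until a server has been selected */
} sketch_handle_t;

/* Explicit initialization: the client names the server up front. */
static void sketch_handle_init(sketch_handle_t *h, const char *server,
                               const char *func_name)
{
    h->func_name = func_name;
    h->server = server;
}

/* Default initialization with late binding: no server is chosen yet. */
static void sketch_handle_default(sketch_handle_t *h, const char *func_name)
{
    h->func_name = func_name;
    h->server = NULL;
}

/* Stand-in for resource discovery (assumed rule): pick by input size. */
static const char *sketch_discover(size_t input_bytes)
{
    return input_bytes > 1024 ? "cluster1" : "cluster2";
}

/* Call: binds the server now if the handle is still unbound. */
static const char *sketch_call(sketch_handle_t *h, size_t input_bytes)
{
    if (h->server == NULL)
        h->server = sketch_discover(input_bytes);
    return h->server;
}

/* Returns 1 if the client can place data for this handle before the call. */
static int sketch_can_place_data(const sketch_handle_t *h)
{
    return h->server != NULL;
}

/* Returns 1 if the handle is currently bound to the named server. */
static int sketch_bound_to(const sketch_handle_t *h, const char *name)
{
    return h->server != NULL && strcmp(h->server, name) == 0;
}
```

With a handle set up by sketch_handle_default, sketch_can_place_data
returns 0 until sketch_call has run; with sketch_handle_init it returns
1 immediately. That is the whole difference I am concerned about.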
>> If the function handle contains a reference to a server, then the data
>> management can be done in the same way as for
>> grpc_function_handle_init. If the function handle does not reference
>> the computational server, there is no way to know where to place data
>> before issuing grpc_call. This is the way function handles are
>> implemented in Diet and Netsolve (2.0, any changes ?)
>> GridRPC interfaces.
>>
>>However, we should provide a way to dynamically choose a server...
>>Any comments ?
>
>
> I cannot understand your concern. Could you explain it giving
> examples ?
Actually, the problem is to decide whether or not we always know where
the computation will take place. If we always know it, then we can use
a standard copy function to put the data on the server before computing
(the client is then able to manage its data on its own; it does not
need more platform support than data handle management). If we do not
know where the computation will take place, then we need platform
support: we need a way to tell the platform that we want to leave this
data inside it, somewhere. This could be a persistency flag or a bind
function, it does not matter which, but we need it. After the
computation, the server needs to know what to do with the data: send it
back to the client or leave it on its host?
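As a rough illustration of the persistency-flag variant, here is a
minimal C sketch. Everything in it is an assumption made for the sake
of the illustration: the sketch_mode_t values, the sketch_data_t
structure and the function names are hypothetical and do not come from
the GridRPC API, DIET or Netsolve. It only shows the decision the
server has to take once the computation is finished.

```c
#include <string.h>

/* Hypothetical persistence modes attached to a piece of data. */
typedef enum {
    SKETCH_VOLATILE,    /* send the data back to the client */
    SKETCH_PERSISTENT   /* leave the data inside the platform */
} sketch_mode_t;

/* Hypothetical data descriptor; NOT a GridRPC data handle. */
typedef struct {
    sketch_mode_t mode;
    const char *location;  /* "client" or the host holding the data */
} sketch_data_t;

/* What the server does with an output once the computation is done. */
static void sketch_finish_output(sketch_data_t *out, const char *server_host)
{
    if (out->mode == SKETCH_PERSISTENT)
        out->location = server_host;  /* stays on the server's host */
    else
        out->location = "client";     /* returned to the client */
}

/* Returns 1 if the data currently sits at the named location. */
static int sketch_is_on(const sketch_data_t *d, const char *where)
{
    return strcmp(d->location, where) == 0;
}
```

The point is that the flag travels with the data, so the client does
not need to know the server name when it sets it.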
The following example is taken from an application running under DIET.
This application (kmc) simulates the deposition of atoms on a substrate.
To better see the result of the simulation, the data computed by the
simulation program are sent to a ray tracing service (povray). To
optimize performance, we plan to deploy both applications on two
different PC clusters (1 and 2) managed by DIET.
The GRPC client will do:
grpc_initialize();
grpc_function_handle_default( handleKmc, "kmc" );
// data preparation
grpc_call( handleKmc, data, &result );
// At the time of the grpc_call, the client does not know on which
// cluster kmc will execute. However, this is not very important here,
// as kmc only takes input data from the client.
grpc_function_handle_default( handlePovray, "povray" );
grpc_call( handlePovray, result, &image );
When the client calls povray, it will not know where its image will
be computed, i.e. which povray server will be used, on cluster 1 or 2.
If the client gets the result back, there is no problem because it will
pass result in the call. But if we want to avoid useless transfers of
result, we need to leave the result data (persistent) inside the
platform and transfer it, once the povray server has been chosen, only
if the povray computation is not done on the same cluster. In that
case, we need a way to indicate to the kmc server that the result data
must be left on the server and not returned to the client. However,
before calling (grpc_call) the kmc application, we do not know which
cluster will be used, so there is no way to inform it. It is not
possible to bind the data to this server; we can only bind it to the
platform.
Is this example clearer for you?
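The platform-side behaviour I have in mind can be sketched in C as
follows. This is only a sketch under assumptions: the
sketch_platform_data_t structure and the two functions are hypothetical
names invented for this message, and they do not describe how DIET or
any other GridRPC implementation actually works. The data is bound to
the platform, and a transfer happens only if the server chosen for the
next call is on a different cluster.

```c
#include <string.h>

/* Hypothetical record for data bound to the platform, not to a server. */
typedef struct {
    const char *cluster;  /* cluster currently holding the data */
    int transfers;        /* inter-cluster moves performed so far */
} sketch_platform_data_t;

/* The producer (e.g. kmc) finished on `cluster`; the result stays there. */
static void sketch_store_result(sketch_platform_data_t *d, const char *cluster)
{
    d->cluster = cluster;
    d->transfers = 0;
}

/* The platform has chosen where the consumer (e.g. povray) runs:
 * move the data only if it is not already on that cluster. */
static void sketch_stage_for(sketch_platform_data_t *d,
                             const char *target_cluster)
{
    if (strcmp(d->cluster, target_cluster) != 0) {
        d->cluster = target_cluster;
        d->transfers++;
    }
}
```

If kmc and povray end up on the same cluster, the transfer count stays
at zero; otherwise there is exactly one cluster-to-cluster move and the
client is never involved.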
Laurent
--
Laurent PHILIPPE http://lifc.univ-fcomte.fr/~philippe
philippe at lifc.univ-fcomte.fr Laboratoire d'Informatique (LIFC)
tel: (33) 03 81 66 66 54 route de Gray
fax: (33) 03 81 66 64 50 25030 Besancon Cedex - FRANCE