[Pgi-wg] OGF OGSA-BES - Requirements for an improved Basic Execution Service

Etienne URBAH urbah at lal.in2p3.fr
Fri Sep 16 06:32:46 CDT 2011


Bernd and Andre,

Concerning the document named 'Requirements for an improved Basic 
Execution Service (BES)' and available at 
http://forge.gridforum.org/sf/go/doc16306 :

Thank you very much for your comments.


BES Client interface and BES operational requirements
-----------------------------------------------------
On Thu, 15/09/2011 20:43, Andre Merzky wrote:
> "The discussion about BES interface specification should be separated
>  from the discussion about interoperation of BES services."
I agree, and I will create a separate chapter gathering all BES 
operational requirements having NO direct impact on the BES Client 
interface (this will take a little time).


BES requirements which seem to be specific to gLite
---------------------------------------------------
If a requirement corresponds to NO real functional or operational need, 
then we should simply remove it.

If a requirement corresponds to a real functional or operational need, 
but is expressed in a way that is too specific to gLite, then we should 
reformulate it in a more generic way.  Please provide suggestions.


RFC-3820-compliant X509 Proxies
-------------------------------
The RFC-3820-compliant X509 proxies are fully supported by the jLite 
library written in Java by Oleg SUKHOROSLOV and available at 
http://code.google.com/p/jlite/
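
As an illustration only (this is NOT jLite code, and all function and 
variable names below are hypothetical), here is a minimal Python sketch 
of two structural rules of RFC 3820 :  a proxy certificate carries the 
ProxyCertInfo extension (OID 1.3.6.1.5.5.7.1.14) marked critical, and 
its subject is the subject of its issuer followed by exactly one CN 
component.  The sketch assumes that the certificate has already been 
parsed into plain Python values, and it performs NO signature or path 
validation.

```python
# Illustrative sketch only: structural checks from RFC 3820, no cryptography.
# All names are hypothetical; certificate parsing is assumed to be done elsewhere.

PROXY_CERT_INFO_OID = "1.3.6.1.5.5.7.1.14"  # id-pe-proxyCertInfo (RFC 3820)


def looks_like_rfc3820_proxy(subject, issuer, extensions):
    """Return True if the parsed fields match the RFC 3820 proxy layout.

    subject, issuer: lists of (attribute, value) pairs, most general first.
    extensions: dict mapping extension OID -> True if marked critical.
    """
    # The ProxyCertInfo extension must be present and marked critical.
    if extensions.get(PROXY_CERT_INFO_OID) is not True:
        return False
    # The subject must be the issuer's subject plus exactly one component...
    if len(subject) != len(issuer) + 1 or subject[:-1] != issuer:
        return False
    # ...and that extra component must be a Common Name.
    return subject[-1][0] == "CN"


# Made-up example values, for illustration only.
USER_DN = [("O", "Example Grid"), ("CN", "Jane Doe")]
PROXY_DN = USER_DN + [("CN", "1234567890")]
IS_PROXY = looks_like_rfc3820_proxy(PROXY_DN, USER_DN, {PROXY_CERT_INFO_OID: True})
```

A real implementation would of course ALSO have to validate the whole 
certificate path and the signatures, which this sketch deliberately 
leaves out.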


Dependency of BES on other grid software components / operational issues
------------------------------------------------------------------------
It is very good that we now agree on GLUE 2.0 as the base for the 
Information System.  Otherwise, we could NOT agree on the way to express 
references to grid entities in the Job Description document.

Your comments about chapter 6.5 confirm that the specification of the 
BES Client interface DEPENDS on whether BES supports X509 proxies for 
delegation of Security credentials or NOT.

Since delegation of Security credentials is a MUST for BES, my 
conclusion is that we MUST agree on SECURITY issues (even if some are 
operational issues) BEFORE trying to write down BES requirements.

In the same way, we know that Clients need to perform complex queries on 
Jobs.  The BES Client interface DEPENDS on whether such queries are 
accepted by BES itself, or by a separate Logging and Bookkeeping 
service.  So, I think that we have to agree on the existence or absence 
of a separate Logging and Bookkeeping service BEFORE trying to write 
down BES requirements about queries.

In summary :
-  Some BES requirements are quite independent of other grid 
components, and we can discuss them immediately.
-  But some other BES requirements DEPEND on foundational grid 
components or operational issues, in particular Information System, 
Security, Logging and Bookkeeping, ...
-  Therefore, we have to agree on these other grid components or 
operational issues FIRST.
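
To make this ordering concrete, here is a small Python sketch (Python 
3.9 or later, standard library only).  The topic names and their 
dependencies are only my simplified reading of the points above, NOT an 
agreed list.  The sketch only shows that a topological order places the 
foundational agreements BEFORE the requirements which depend on them.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: topic -> topics that must be agreed first.
DEPENDS_ON = {
    "Information System (GLUE 2.0)": set(),
    "Security (credential delegation)": set(),
    "Logging and Bookkeeping": set(),
    "Job Description references": {"Information System (GLUE 2.0)"},
    "BES Client interface": {
        "Security (credential delegation)",
        "Job Description references",
    },
    "BES requirements about queries": {"Logging and Bookkeeping"},
}


def discussion_order(depends_on):
    """Return the topics in an order where prerequisites always come first."""
    return list(TopologicalSorter(depends_on).static_order())


ORDER = discussion_order(DEPENDS_ON)
for position, topic in enumerate(ORDER, start=1):
    print(position, topic)
```

Several valid orders exist;  the only guarantee is that each topic 
appears after everything it depends on.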

This is a critical issue, and I propose that we discuss it at OGF33.


I will answer your other comments later.

Best regards.

-----------------------------------------------------
Etienne URBAH         LAL, Univ Paris-Sud, IN2P3/CNRS
                       Bat 200   91898 ORSAY    France
Tel: +33 1 64 46 84 87      Skype: etienne.urbah
Mob: +33 6 22 30 53 27      mailto:urbah at lal.in2p3.fr
-----------------------------------------------------


On Thu, 15/09/2011 22:44, Bernd Schuller wrote:
> Hi Etienne,
>
> thanks for the clarifications. So indeed your document is aimed at both:
>
> 1) providing requirements for the actual BES specification ("client
> interface" in your terminology)
> 2) the operation and deployment issues that have to be solved for
> interoperability on an infrastructure level (say EGI and EDGI).
>
> It would be very beneficial for further progress if these two distinct
> concerns could be separated, at least CLEARLY marked in your document.
>
> I have added some more comments inline.
>
> On Thu, 2011-09-15 at 19:27 +0200, Etienne URBAH wrote:
>
>> Concerning the document named 'Requirements for an improved Basic
>> Execution Service (BES)' and available at
>> http://forge.gridforum.org/sf/go/doc16306 :
>>
>> THANK YOU VERY MUCH for having taken the time to read this document, and
>> for having taken the time to provide comments :
>>
>> Such comments are very useful for the improvement of documents, permit
>> convergence and prepare later agreement.
>> [...]
>> On Thu, 15/09/2011 13:00, Bernd Schuller wrote:
>>>
>>> [...]
>>
>>> 1.4 Methodology
>>>
>>>    * ref 4: glite user guide ->   out of scope, gLite is too complex and
>>> too specific in its architecture (if there is such a thing) to be useful
>>> as a base for BES
>>     Yes, this is a very large user guide for specific usage of gLite by
>> users, and it does NOT provide a clear description of the architecture
>> and of the functionalities.
>>     But it is NOT so complex.  As soon as I finished reading this guide,
>> it was easy for me to perform reverse engineering and extract the
>> effective architecture (SOA with internal interfaces) and the implied
>> functionalities (which are to be improved).
>
> Indeed it appears that you try to impose gLite specifics (like a logging
> &  bookkeeping service or proxy certificates on the transport level) as
> requirements. This would severely limit the BES specification effort, and
> will not be accepted (I hope) by other stakeholders.
>
>>     In the text, I have prepended a few words to explain that.
>
> Basically you imply that the "architecture and functionalities" of the
> gLite execution system (together with the PGI work) is somehow the
> guideline to be followed, which I fully disagree with.
>
>>>
>>> 2.4 Collaboration with other services
>>>
>>> While this is important for interoperability, it is unimportant
>>> for the specification of a BES. The BES spec should NOT try to specify
>>> all the interaction with the rest of the world. This is the task of a
>>> "grid architecture specification" like OGSA.
>>     My document is NOT targeted only to the specification of the BES
>> Client interface, but to the clear and consistent description of BES
>> context and functional + operational requirements which are really
>> necessary for interoperability.
>>     As far as I know, OGSA does NOT take into account GLUE 2.0 yet.
>> Therefore, an up to date 'grid architecture specification' is absolutely
>> necessary.
>
> Glue2 is just an information model, not necessarily a perfect one nor
> the only one. However, I agree an information model has to be adopted
> for BES and any associated information systems.
>
>>     If OGSA members consider chapters 2.3 and 2.4 of my document as a
>> 'grid architecture specification' which updates and improves OGSA, I
>> thank them.  If they consider that this 'grid architecture
>> specification' does NOT comply with OGSA and competes with it, then I
>> assert that it obsoletes OGSA.
>
> I can't really say, the OGSA group stopped its work a long time ago and
> it's been a long time since I looked at the documents.
>
>>>
>>> Specifically, the interactions with security, monitoring, accounting and
>>> logging framework are OPERATIONAL concerns that MUST NOT be a mandatory
>>> part of a BES specification.
>>     FAILURE of practical operations is often caused by LACK of early care
>> about operational concerns during specification phase.
>
> Agreed.
>
>>   As GIN-GC has
>> proven and documented, this is even more true for interoperability on
>> the field (as opposed to theoretical interoperability at the WSDL level).
>>     I confirm that care about operational concerns is REQUIRED for real
>> operations and for practical interoperability on the field.  Although
>> operational concerns are NOT part of the BES Client interface, they are
>> REQUIRED for the overall specifications of BES in its context.
>>     In the text, I have stressed that the document DOES take into account
>> operational concerns.
>>
>>>
>>> 4. BES non-functional requirements
>>>
>>> 4.1.2 Traceability - should be SHOULD not MUST
>>     This is an operational concern :  Would you really take the risk that
>> the whole EGI becomes a spambot or a scambot ?
>
> Isn't it already, powered by gLite and used by the wlcg botnet (just
> kidding of course) :-)
>
>>     No traceability -->  No post mortem analysis after attack -->  Large
>> infection -->  Panic -->  Abrupt and very long shutdown of all services.
>>     I fully confirm that traceability is a MUST.
>
> It is an internal detail which any good implementation will provide.
> If BES-A is much easier and more secure to operate than BES-B, admins
> can choose which to install.
>
>>>
>>> 4.1.3 Security
>>>    - how can a specification "implement" a policy? You probably
>>>      meant "BES implementations SHOULD ..."
>>     The text now is 'BES specifications MUST fully take into account the
>> Security Policies ...'
>
> Still no understanding here... let's take traceability.
> The relevant EGI policy
> <https://documents.egi.eu/public/ShowDocument?docid=81>  says
>
> "[...] software deployed in the Grid MUST include the
> ability to produce sufficient and relevant logging [...]
> The level of the logging MUST be configured by all service providers,
> including but not limited to the Sites, to produce the required
> information which MUST be retained for a minimum of 90 days."
>
> For example all UNICORE services can be made to comply with this
> by configuration of the logging library we use (Apache Log4j), and by
> not deleting log files for 90 days.
> So this is a feature of the implementation and the administrator in
> charge, not the specification. Thus, your sentence should read "BES
> implementations SHOULD ..." (It is MUST of course only if they want to
> be deployed in EGI)
> One does not try to specify implementation details, at least not in any
> specification I've ever seen (e.g. does the HTTP specification say
> anything about server logging or accepted CAs?).
>
>
>
>> [...]
>>>
>>> 4.2
>>>    all of this is out of scope. For example the UNICORE service
>>> container hosts a number of services including an execution service.
>>> Probably you mean that the execution service SPECIFICATION should be
>>> limited to the execution service and MUST NOT specify accounting,
>>> security etc.
>>     'Well defined and narrow scope' is a general engineering requirement.
>>    It is a fundamental concern crossing requirements, design,
>> specifications, fabrication, tests, operations, user experience and
>> product maintenance for all types of products, even outside software
>> engineering.
>
> exactly.
>
>>     I confirm that 'Well defined and narrow scope' is absolutely REQUIRED
>> for sound software design, implementation, deployment and maintainability.
>>     From your comment, I assume that the Execution Service of UNICORE
>> does have a 'Well defined and narrow scope', does have precise
>> interfaces with other UNICORE services, and minimizes overlaps with them.
>
> of course. And UNICORE does not include a L&B service :-)
>
>
>> [...]
>>> 6.1
>>>
>>>    * "SSL certificates MUST be signed by a CA..." this is an operational
>>> decision, and has nothing to do with the BES spec.
>>> For example, a site may run an inhouse deployment of BES using an
>>> in-house CA. This requirement should be deleted.
>>     This operational concern is REQUIRED for practical interoperability
>> on the field.  I have prepended :
>>     * Authentication of Servers :  The Execution Service SHOULD permit
>> Clients to authenticate it.  If an Execution Service authenticates
>> itself to clients, it MUST permit Clients to really perform this
>> authentication.
>
> This sentence makes no sense to me, sorry.
> Maybe "Server and client SHOULD communicate via a secure channel
> (SSL/TLS)". Even this may not be true in the future, though it is for
> all(?) the Grid systems currently.
>
>>>
>>> 6.3
>>>
>>> * "For Client authentication, the Execution Service MUST accept all
>>> following authentication methods:  Full X509,  RFC-3820-compliant X509
>>> Proxy"
>>>
>>> This requirement is invalid. I agree that it would be nice to be able to
>>> specify authentication methods, but it is impossible. For example
>>> Shibboleth, Username/password, OpenID, OAuth (e.g. for a REST interface
>>> over plain HTTP), or even NOTHING (e.g. in an inhouse grid) all can be
>>> valid authentication methods in some circumstances.
>>     There are 2 separate requirements :
>>     - 1 'MUST' for Full X509 and RFC-3820-compliant X509 Proxy
>>     - 1 'MAY'  for all other ones.
>>>
>>> Furthermore, making proxies a MUST implies that nonstandard
>>> authentication libraries instead of TLS/SSL must be used, making the BES
>>> implementation insecure. For some implementors (like UNICORE) proxies on
>>> the transport level are very much a no-go.
>>     I had clearly specified RFC-3820-compliant X509 Proxy, which ARE
>> standard.
>>     Your criticisms are valid for GSI proxies, which I have taken care NOT
>> to mention.
>
> by "standard" I did not mean that it is an RFC, but software support.
> As opposed to standard SSL/TLS, proxies are almost not supported by
> industry standard tools, for example Apache httpd or the Java JDK.
> One has to rely on custom code, which is notoriously buggy and error
> prone.
>
> Since one important non-functional requirement (for me at least) is to
> be able to use standard (off-the-shelf) open source software, having to
> support proxies is a big limitation.
>
>>> 6.4.
>>>
>>> "This authorization mechanism MUST be consistent across all instances of
>>> the Execution Service"
>>>
>>> This violates the autonomy of a site. Site administrators often wish to
>>> stay in control of their resources, and do not accept external
>>> authorisation decision points. And anyway, who cares? Since the AuthZ
>>> mechanism is internal to the BES, it cannot be specified in the
>>> BES spec as such.
>>     Interoperability requires a federation of independent administrative
>> domains to agree on common functionalities, interfaces and operations.
>>     This DOES sometimes violate the autonomy of each individual site.
>>     The requirement is NOT that the AUTHZ decision point is external to
>> any site, but that all participating sites MUST accept to install inside
>> their site an instance of a commonly validated software implementing the
>> decision point.
>
> No. Each site may choose their own authz decision point, IMO.
>
>>     The AUTHZ mechanism MUST NOT be internal to the BES :
>
> Maybe I was not clear. The authz mechanism is invisible for outside
> parties (like clients). It can be an external component, an internal
> component, whatever, it is up to the BES implementor and the site admin.
>
>> For example,
>> in UNICORE atomic services, the 'de.fzj.unicore.uas.security' package
>> is described as 'The security subsystem of UNICORE/X', and is NOT
>> internal to 'de.fzj.unicore.uas.impl.job'.
>
> In UNICORE site admins can choose what attribute sources and XACML
> decision points they want to use, but the clients (including other
> services) do not need to know this. That is what I meant by "internal".
>
>>>
>>> 6.5
>>>
>>> These are requirements on the security layer (or framework) and should
>>> not be used as requirements on BES.
>>     These security requirements DO have impacts on the BES Client
>> interface and on the Job Description document.
>>     In the text, I have made it clear.
>
> indeed while preparing the EMI-ES specification, we came to the
> following conclusions
> 1) when using proxies for delegation, it is necessary to map each data
> staging item to a delegated credential (you can check the EMI-ES job
> description for details)
> 2) the delegation operations are separate from the job management
> operations, so they do not necessarily have to be part of the BES client
> interface.
>
> Also, there are existing implementations (UNICORE and Genesis come to my
> mind) that do not need this at all, because they do delegation without
> proxies.
>
> So I disagree, 6.5 mostly describes features of the particular security
> framework that is used.
>
>>>
>>> 8 BES requirements related to "Application Repositories"
>>>
>>> While I agree that BES should understand the notion of an "Application"
>>> (see e.g. JSDL ApplicationName), I don't agree that the BES should
>>> use these for Scheduling. Rather, this is the job of a broker.
>>     The text is now :
>>     * Resource selection :  The Execution Service MUST use, among others,
>> these references to 'Installed Applications' in order to select the most
>> adequate computing resource for the Job.
>>>
>>> 9 BES requirements applying to Accounting
>>>
>>> As a "MUST", these are out of scope, and should be made "SHOULD".
>>     No Accounting -->  No precise reporting to funding agencies -->  No
>> funding -->  Abrupt and very long shutdown of all services.
>>     I fully confirm that Accounting is a MUST.
>>
>
> ... operational
>
>>>
>>> 10 Logging/Bookkeeping
>>>
>>> Same as 9.
>>     Same as 'Traceability' :
>>     This is an operational concern :  Would you really take the risk that
>> the whole EGI becomes a spambot or a scambot ?
>>     No Logging and Bookkeeping -->  No post mortem analysis after attack
>> -->  Large infection -->  Panic -->  Abrupt and very long shutdown of all
>> services.
>>     I fully confirm that Logging and Bookkeeping is a MUST.
>>
>
> an operational MUST maybe for some infrastructures, not all.
>
> E.g. a typical HPC site has its own accounting, its own logging systems
> independent of the (Grid) software used to submit jobs to it.
>
>>>
>>> 12 Jobs
>>>
>>> 12.1 Types of job
>>>
>>> Support for parallel jobs: it should be "MUST" :-)
>>     The text is now :
>>     - The concept of 'Single Job' includes Jobs running
>> massively-parallel processes using MPI on one large-scale HPC System.
>> The Execution Service MUST understand instructions for usage of MPI
>> inside the Job Description document.  The Execution Service SHOULD
>> transmit these instructions to the Batch System, or return an explicit
>> error message if not supported.
>
> OK
>
>
>
> Best regards,
> Bernd.
>
>
> --
> Dr. Bernd Schuller
> Federated Systems and Data
> Juelich Supercomputing Centre, http://www.fz-juelich.de/jsc
> Phone: +49 246161-8736 (fax -8556)
