[Pgi-wg] OGF OGSA-BES - Requirements for an improved Basic Execution Service
Bernd Schuller
b.schuller at fz-juelich.de
Fri Sep 16 08:26:30 CDT 2011
hi Etienne,
On Fri, 2011-09-16 at 13:32 +0200, Etienne URBAH wrote:
[...]
>
> RFC-3820-compliant X509 Proxies
> -------------------------------
> The RFC-3820-compliant X509 proxies are fully supported by the jLite
> library written in Java by Oleg SUKHOROSLOV and available at
> http://code.google.com/p/jlite/
>
You must be joking. I was talking about open source software like Apache
httpd and Java JDK. jLite just wraps the cog-globus libraries and adds
some gLite access APIs. No, thanks.
>
> Dependency of BES on other grid software components / operational issues
> ------------------------------------------------------------------------
> It is very good that we now agree on GLUE 2.0 as the base for the
> Information System. Otherwise, we could NOT agree on the way to express
> references to grid entities in the Job Description document.
>
> Your comments about chapter 6.5 confirm that the specifications of the
> BES Client interface DEPENDS on whether BES supports X509 proxies for
> delegation of Security credentials or NOT.
>
At least this was the EMI-ES v1.0 conclusion, which need not be the
final word. The only dependency (in EMI-ES) is the specification of which
delegated credential is to be used for which data staging item.
Delegation can be performed FULLY TRANSPARENTLY to the BES (for example on
a message level as in UNICORE), and the BES interface specification is
not dependent on it at all.
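
To illustrate that single dependency, here is a minimal sketch in Python. It is
not taken from the EMI-ES schema: the class names, the field `delegation_id`
and the helper `credentials_for_staging` are all hypothetical. It only shows a
job description in which every data staging item names the delegated
credential it should use, while the delegation itself happens elsewhere.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DataStagingItem:
    """One stage-in or stage-out entry of a job description (hypothetical model)."""
    name: str
    uri: str
    delegation_id: str  # reference to a credential that was delegated beforehand


@dataclass
class JobDescription:
    executable: str
    staging: List[DataStagingItem] = field(default_factory=list)


def credentials_for_staging(job: JobDescription,
                            delegated: Dict[str, str]) -> Dict[str, str]:
    """Map each staging item name to the credential it refers to.

    `delegated` stands in for whatever store the security layer fills;
    the execution service only looks credentials up, it never creates them.
    Raises KeyError if an item references an unknown delegation id.
    """
    resolved = {}
    for item in job.staging:
        if item.delegation_id not in delegated:
            raise KeyError(
                f"no delegated credential '{item.delegation_id}' for '{item.name}'"
            )
        resolved[item.name] = delegated[item.delegation_id]
    return resolved
```

With message-level delegation, as in UNICORE, the `delegation_id` field would
simply not be needed, and the job description would carry no security
information at all.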
> Since delegation of Security credentials is a MUST for BES, my
> conclusion is that we MUST agree on SECURITY issues (even if some are
> operational issues) BEFORE trying to write down BES requirements.
>
> In the same way, we know that Clients need to perform complex queries on
> Jobs. The BES Client interface DEPENDS on whether such queries are
> accepted by BES itself, or by a separate Logging and Bookkeeping
> service. So, I think that we have to agree on the existence or absence
> of a separate Logging and Bookkeeping service BEFORE trying to write
> down BES requirements about queries.
The BES client interface does not necessarily need to support complex
queries; these should be part of a separate interface.
> As a summary :
> - Some BES requirements are quite independent from other grid
> components, and we can discuss them immediately.
> - But some other BES requirements are DEPENDENT on foundational grid
> components or operational issues, in particular Information System,
> Security, Logging and Bookkeeping, ...
> - Therefore, we have to agree on these other grid components or
> operational issues FIRST.
>
> This is a critical issue, and I propose that we discuss it at OGF33.
Unfortunately I won't be in Lyon; I can only join via phone.
Summarising, my main points for the BES interface specification :
* specifications must be narrowly scoped and composable (which is the
JSDL/BES model anyway).
* do not try to specify specific authentication methods. Do not try to
specify specific delegation methods. Security is a cross cutting concern
and should be dealt with separately.
* do not assume a special environment where a BES instance will run.
Interactions with other services (except for data access) are optional
and should be treated as such.
* leave operational aspects to operations. Recommendations for BES
implementors may be given, of course.
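
As a rough sketch of what "narrowly scoped and composable" could look like in
practice (Python, all names hypothetical, not modelled on any existing BES
implementation): the execution service exposes job operations only, and
authentication is a separate layer wrapped around it, so the check can be
swapped without touching the service interface.

```python
from typing import Callable, Dict


class ExecutionService:
    """Narrowly scoped: job operations only, no knowledge of authentication."""

    def __init__(self) -> None:
        self._jobs: Dict[str, str] = {}

    def create_activity(self, job_description: str) -> str:
        # Assign a simple sequential id and record the initial state.
        job_id = f"job-{len(self._jobs) + 1}"
        self._jobs[job_id] = "PENDING"
        return job_id

    def get_status(self, job_id: str) -> str:
        return self._jobs[job_id]


class SecuredService:
    """Security as a cross-cutting layer composed around the service.

    `authenticate` is any callable taking a request context and returning
    True or False (X509, username/password, or nothing at all in-house).
    """

    def __init__(self, service: ExecutionService,
                 authenticate: Callable[[dict], bool]) -> None:
        self._service = service
        self._authenticate = authenticate

    def call(self, request_context: dict, operation: str, *args):
        if not self._authenticate(request_context):
            raise PermissionError("authentication failed")
        # Forward to the unmodified execution service operation.
        return getattr(self._service, operation)(*args)
```

A site could pass `lambda ctx: True` for an in-house deployment and a
certificate check for a public one; the `ExecutionService` class stays the
same in both cases.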
Best regards,
Bernd.
>
> I will answer your other comments later.
>
> Best regards.
>
> -----------------------------------------------------
> Etienne URBAH LAL, Univ Paris-Sud, IN2P3/CNRS
> Bat 200 91898 ORSAY France
> Tel: +33 1 64 46 84 87 Skype: etienne.urbah
> Mob: +33 6 22 30 53 27 mailto:urbah at lal.in2p3.fr
> -----------------------------------------------------
>
>
> On Thu, 15/09/2011 22:44, Bernd Schuller wrote:
> > Hi Etienne,
> >
> > thanks for the clarifications. So indeed your document is aimed at both:
> >
> > 1) providing requirements for the actual BES specification ("client
> > interface" in your terminology)
> > 2) the operation and deployment issues that have to be solved for
> > interoperability on an infrastructure level (say EGI and EDGI).
> >
> > It would be very beneficial for further progress if these two distinct
> > concerns could be separated, at least CLEARLY marked in your document.
> >
> > I have added some more comments inline.
> >
> > On Thu, 2011-09-15 at 19:27 +0200, Etienne URBAH wrote:
> >
> >> Concerning the document named 'Requirements for an improved Basic
> >> Execution Service (BES)' and available at
> >> http://forge.gridforum.org/sf/go/doc16306 :
> >>
> >> THANK YOU VERY MUCH for having taken the time to read this document, and
> >> for having taken the time to provide comments :
> >>
> >> Such comments are very useful for the improvement of documents, permit
> >> convergence and prepare later agreement.
> >> [...]
> >> On Thu, 15/09/2011 13:00, Bernd Schuller wrote:
> >>>
> >>> [...]
> >>
> >>> 1.4 Methodology
> >>>
> >>> * ref 4: glite user guide -> out of scope, gLite is too complex and
> >>> too specific in its architecture (if there is such a thing) to be useful
> >>> as a base for BES
> >> Yes, this is a very large user guide for specific usage of gLite by
> >> users, and it does NOT provide a clear description of the architecture
> >> and of the functionalities.
> >> But it is NOT so complex. As soon as I finished reading this guide,
> >> it was easy for me to perform reverse engineering and extract the
> >> effective architecture (SOA with internal interfaces) and the implied
> >> functionalities (which are to be improved).
> >
> > Indeed it appears that you try to impose gLite specifics (like a logging
> > & bookkeeping service or proxy certificates on the transport level) as
> > requirements. This would severely limit the BES specification effort, and
> > will not be accepted (I hope) by other stakeholders.
> >
> >> In the text, I have prepended a few words to explain that.
> >
> > Basically you imply that the "architecture and functionalities" of the
> > gLite execution system (together with the PGI work) is somehow the
> > guideline to be followed, which I fully disagree with.
> >
> >>>
> >>> 2.4 Collaboration with other services
> >>>
> >>> While this is important for interoperability, it is unimportant
> >>> for the specification of a BES. The BES spec should NOT try to specify
> >>> all the interaction with the rest of the world. This is the task of a
> >>> "grid architecture specification" like OGSA.
> >> My document is NOT targeted only to the specification of the BES
> >> Client interface, but to the clear and consistent description of BES
> >> context and functional + operational requirements which are really
> >> necessary for interoperability.
> >> As far as I know, OGSA does NOT take into account GLUE 2.0 yet.
> >> Therefore, an up to date 'grid architecture specification' is absolutely
> >> necessary.
> >
> > Glue2 is just an information model, not necessarily a perfect one nor
> > the only one. However, I agree an information model has to be adopted
> > for BES and any associated information systems.
> >
> >> If OGSA members consider chapters 2.3 and 2.4 of my document as a
> >> 'grid architecture specification' which updates and improves OGSA, I
> >> thank them. If they consider that this 'grid architecture
> >> specification' does NOT comply with OGSA and competes with it, then I
> >> assert that it obsoletes OGSA.
> >
> > I can't really say; the OGSA group stopped its work a long time ago and
> > it has been a long time since I looked at the documents.
> >
> >>>
> >>> Specifically, the interactions with security, monitoring, accounting and
> >>> logging framework are OPERATIONAL concerns that MUST NOT be a mandatory
> >>> part of a BES specification.
> >> FAILURE of practical operations is often caused by LACK of early care
> >> about operational concerns during specification phase.
> >
> > Agreed.
> >
> >> As GIN-GC has
> >> proven and documented, this is even more true for interoperability on
> >> the field (as opposed to theoretical interoperability at the WSDL level).
> >> I confirm that care about operational concerns is REQUIRED for real
> >> operations and for practical interoperability on the field. Although
> >> operational concerns are NOT part of the BES Client interface, they are
> >> REQUIRED for the overall specifications of BES in its context.
> >> In the text, I have stressed that the document DOES take into account
> >> operational concerns.
> >>
> >>>
> >>> 4. BES non-functional requirements
> >>>
> >>> 4.1.2 Traceability - should be SHOULD not MUST
> >> This is an operational concern : Would you really take the risk that
> >> the whole EGI becomes a spambot or a scambot ?
> >
> > Isn't it already, powered by gLite and used by the wlcg botnet (just
> > kidding of course) :-)
> >
> >> No traceability --> No post mortem analysis after attack --> Large
> >> infection --> Panic --> Abrupt and very long shutdown of all services.
> >> I fully confirm that traceability is a MUST.
> >
> > It is an internal detail which any good implementation will provide.
> > If BES-A is much easier and more secure to operate than BES-B, admins
> > can choose which to install.
> >
> >>>
> >>> 4.1.3 Security
> >>> - how can a specification "implement" a policy? You probably
> >>> meant "BES implementations SHOULD ..."
> >> The text now is 'BES specifications MUST fully take into account the
> >> Security Policies ...'
> >
> > Still no understanding here... let's take traceability.
> > The relevant EGI policy
> > <https://documents.egi.eu/public/ShowDocument?docid=81> says
> >
> > "[...] software deployed in the Grid MUST include the
> > ability to produce sufficient and relevant logging [...]
> > The level of the logging MUST be configured by all service providers,
> > including but not limited to the Sites, to produce the required
> > information which MUST be retained for a minimum of 90 days."
> >
> > For example all UNICORE services can be made to comply with this
> > by configuration of the logging library we use (Apache Log4j), and by
> > not deleting log files for 90 days.
> > So this is a feature of the implementation and the administrator in
> > charge, not the specification. Thus, your sentence should read "BES
> > implementations SHOULD ..." (It is MUST of course only if they want to
> > be deployed in EGI)
> > One does not try to specify implementation details, at least not in any
> > specification I've ever seen (e.g. does the HTTP specification say
> > anything about server logging or accepted CAs?).
> >
> >
> >
> >> [...]
> >>>
> >>> 4.2
> >>> all of this is out of scope. For example the UNICORE service
> >>> container hosts a number of services including an execution service.
> >>> Probably you mean that the execution service SPECIFICATION should be
> >>> limited to the execution service and MUST NOT specify accounting,
> >>> security etc.
> >> 'Well defined and narrow scope' is a general engineering requirement.
> >> It is a fundamental concern crossing requirements, design,
> >> specifications, fabrication, tests, operations, user experience and
> >> product maintenance for all types of products, even outside software
> >> engineering.
> >
> > exactly.
> >
> >> I confirm that 'Well defined and narrow scope' is absolutely REQUIRED
> >> for sound software design, implementation, deployment and maintainability.
> >> From your comment, I assume that the Execution Service of UNICORE
> >> does have a 'Well defined and narrow scope', does have precise
> >> interfaces with other UNICORE services, and minimizes overlaps with them.
> >
> > of course. And UNICORE does not include a L&B service :-)
> >
> >
> >> [...]
> >>> 6.1
> >>>
> >>> * "SSL certificates MUST be signed by a CA..." this is an operational
> >>> decision, and has nothing to do with the BES spec.
> >>> For example, a site may run an inhouse deployment of BES using an
> >>> in-house CA. This requirement should be deleted.
> >> This operational concern is REQUIRED for practical interoperability
> >> on the field. I have prepended :
> >> * Authentication of Servers : The Execution Service SHOULD permit
> >> Clients to authenticate it. If an Execution Service authenticates
> >> itself to clients, it MUST permit Clients to really perform this
> >> authentication.
> >
> > This sentence makes no sense to me, sorry.
> > Maybe "Server and client SHOULD communicate via a secure channel
> > (SSL/TLS)". Even this may not be true in the future, though it is for
> > all(?) the Grid systems currently.
> >
> >>>
> >>> 6.3
> >>>
> >>> * "For Client authentication, the Execution Service MUST accept all
> >>> following authentication methods: Full X509, RFC-3820-compliant X509
> >>> Proxy"
> >>>
> >>> This requirement is invalid. I agree that it would be nice to be able to
> >>> specify authentication methods, but it is impossible. For example
> >>> Shibboleth, Username/password, OpenID, OAuth (e.g. for a REST interface
> >>> over plain HTTP), or even NOTHING (e.g. in an inhouse grid) all can be
> >>> valid authentication methods in some circumstances.
> >> There are 2 separate requirements :
> >> - 1 'MUST' for Full X509 and RFC-3820-compliant X509 Proxy
> >> - 1 'MAY' for all other ones.
> >>>
> >>> Furthermore, making proxies a MUST implies that nonstandard
> >>> authentication libraries instead of TLS/SSL must be used, making the BES
> >>> implementation insecure. For some implementors (like UNICORE) proxies on
> >>> the transport level are very much a no-go.
> >> I had clearly specified RFC-3820-compliant X509 Proxy, which ARE
> >> standard.
> >> Your criticisms are valid for GSI proxies, which I have taken care NOT
> >> to mention.
> >
> > by "standard" I did not mean that it is an RFC, but software support.
> > As opposed to standard SSL/TLS, proxies are almost not supported by
> > industry standard tools, for example Apache httpd or the Java JDK.
> > One has to rely on custom code, which is notoriously buggy and error
> > prone.
> >
> > Since one important non-functional requirement (for me at least) is to
> > be able to use standard (off-the-shelf) open source software, having to
> > support proxies is a big limitation.
> >
> >>> 6.4.
> >>>
> >>> "This authorization mechanism MUST be consistent across all instances of
> >>> the Execution Service"
> >>>
> >>> This violates the autonomy of a site. Site administrators often wish to
> >>> stay in control of their resources, and do not accept external
> >>> authorisation decision points. And anyway, who cares? Since the AuthZ
> >>> mechanism is internal to the BES, it cannot be specified in the
> >>> BES spec as such.
> >> Interoperability requires a federation of independent administrative
> >> domains to agree on common functionalities, interfaces and operations.
> >> This DOES sometimes violate the autonomy of each individual site.
> >> The requirement is NOT that the AUTHZ decision point is external to
> >> any site, but that all participating sites MUST accept to install inside
> >> their site an instance of a commonly validated software implementing the
> >> decision point.
> >
> > No. Each site may choose their own authz decision point, IMO.
> >
> >> The AUTHZ mechanism MUST NOT be internal to the BES :
> >
> > Maybe I was not clear. The authz mechanism is invisible for outside
> > parties (like clients). It can be an external component, an internal
> > component, whatever, it is up to the BES implementor and the site admin.
> >
> >> For example,
> >> in UNICORE atomic services, the 'de.fzj.unicore.uas.security' package
> >> is described as 'The security subsystem of UNICORE/X', and is NOT
> >> internal to 'de.fzj.unicore.uas.impl.job'.
> >
> > In UNICORE site admins can choose what attribute sources and XACML
> > decision points they want to use, but the clients (including other
> > services) do not need to know this. That is what I meant by "internal".
> >
> >>>
> >>> 6.5
> >>>
> >>> These are requirements on the security layer (or framework) and should
> >>> not be used as requirements on BES.
> >> These security requirements DO have impacts on the BES Client
> >> interface and on the Job Description document.
> >> In the text, I have made it clear.
> >
> > indeed while preparing the EMI-ES specification, we came to the
> > following conclusions
> > 1) when using proxies for delegation, it is necessary to map each data
> > staging item to a delegated credential (you can check the EMI-ES job
> > description for details)
> > 2) the delegation operations are separate from the job management
> > operations, so they do not necessarily have to be part of the BES client
> > interface.
> >
> > Also, there are existing implementations (UNICORE and Genesis come to my
> > mind) that do not need this at all, because they do delegation without
> > proxies.
> >
> > So I disagree, 6.5 mostly describes features of the particular security
> > framework that is used.
> >
> >>>
> >>> 8 BES requirements related to "Application Repositories"
> >>>
> >>> While I agree that BES should understand the notion of an "Application"
> >>> (see e.g. JSDL ApplicationName), I don't agree that the BES should
> >>> use these for Scheduling. Rather, this is the job of a broker.
> >> The text is now :
> >> * Resource selection : The Execution Service MUST use, among others,
> >> these references to 'Installed Applications' in order to select the most
> >> adequate computing resource for the Job.
> >>>
> >>> 9 BES requirements applying to Accounting
> >>>
> >>> As a "MUST", these are out of scope, and should be made "SHOULD".
> >> No Accounting --> No precise reporting to funding agencies --> No
> >> funding --> Abrupt and very long shutdown of all services.
> >> I fully confirm that Accounting is a MUST.
> >>
> >
> > ... operational
> >
> >>>
> >>> 10 Logging/Bookkeeping
> >>>
> >>> Same as 9.
> >> Same as 'Traceability' :
> >> This is an operational concern : Would you really take the risk that
> >> the whole EGI becomes a spambot or a scambot ?
> >> No Logging and Bookkeeping --> No post mortem analysis after attack
> >> --> Large infection --> Panic --> Abrupt and very long shutdown of all
> >> services.
> >> I fully confirm that Logging and Bookkeeping is a MUST.
> >>
> >
> > an operational MUST maybe for some infrastructures, not all.
> >
> > E.g. a typical HPC site has its own accounting, its own logging systems
> > independent of the (Grid) software used to submit jobs to it.
> >
> >>>
> >>> 12 Jobs
> >>>
> >>> 12.1 Types of job
> >>>
> >>> Support for parallel jobs: it should be "MUST" :-)
> >> The text is now :
> >> - The concept of 'Single Job' includes Jobs running
> >> massively-parallel processes using MPI on one large-scale HPC System.
> >> The Execution Service MUST understand instructions for usage of MPI
> >> inside the Job Description document. The Execution Service SHOULD
> >> transmit these instructions to the Batch System, or return an explicit
> >> error message if not supported.
> >
> > OK
> >
> >
> >
> > Best regards,
> > Bernd.
> >
> >
> > --
> > Dr. Bernd Schuller
> > Federated Systems and Data
> > Juelich Supercomputing Centre, http://www.fz-juelich.de/jsc
> > Phone: +49 246161-8736 (fax -8556)
> >
> >
> >
> >
>
--
Dr. Bernd Schuller
Federated Systems and Data
Juelich Supercomputing Centre, http://www.fz-juelich.de/jsc
Phone: +49 246161-8736 (fax -8556)