[glue-wg] New Endpoint and Service types

Paul Millar paul.millar at desy.de
Thu Mar 6 06:34:13 EST 2014


Hi Stephen,

On 05/03/14 19:10, stephen.burke at stfc.ac.uk wrote:
> Paul Millar [mailto:paul.millar at desy.de] said:
>> Stephen, to you DPM, dCache and Castor provide the same
>> functionality, so you would be happy with instances of all three
>> published as Service.Type of 'storage' (or similar).
>
> Not entirely - the object names are prefixed with "Storage" anyway,
> so simply publishing a Type of "storage" would be redundant.

Yup, fair point.

> Also it seems to me that something like a standalone xrootd server or
> a "classic SE" as we used to have would reasonably be different types
> of storage service, even aside from the details of which protocols
> they support.

Possibly, but then perhaps DPM and dCache also provide sufficiently
different storage behaviour to qualify as different types.  It's
difficult to decide without a benchmark.

> However, we do have a family of SRM-based SEs which seem to me to
> represent a common type - indeed I thought that one of the goals of
> EMI for dcache, DPM and StoRM was precisely to make them
> interoperable!

I think you continue to have this fallacy that dCache is somehow
SRM-based.  Yes, one can read data using SRM, but one can easily read
the same data from the same dCache instance without using SRM: switch
off SRM and dCache still works perfectly well.

The interoperability has always been at the protocol level, not
the implementation level, so should appear in the Endpoint or
AccessProtocol.

> In the past I would have suggested "SRM" as a Type, but since we now
> seem to be making moves away from the use of SRM that may not be
> ideal as a name.

From my POV, 'SRM' was never a good name: SRM is a protocol, not a
storage system.  dCache, at least, has never been "based on" SRM.

> From a dcache POV, what do you see as providing commonality with DPM
> and StoRM? (Beyond all being storage systems.)

One commonality between dCache and DPM is the immutable nature of stored
data: once written, data may only be modified by replacing the old data
with completely new data.  I think StoRM also provides an immutable 
filesystem, but a StoRM person would need to confirm.

However, this immutable nature could change in the (not too distant) future.

Other than that, I don't think there's much that's similar: they're
rather different implementations, with different design choices.

>> Somebody who needs some unique characteristic provided by dCache
>> (or DPM, or ...) might want more detailed Type, specifically that
>> the service provides the dCache-like facilities (or DPM-like or
>> ...).
>
> If someone really wants to know the implementation they can look at
> the EndpointImplementationName or ManagerProductName - although of
> course it's undesirable to have anything which is
> implementation-specific.

Both are certainly true: they can look at Manager.ProductName, and
tying behaviour to implementation is undesirable.


> For me, to be a valid type it would have to be the case that a
> completely different vendor could potentially produce an independent
> product which could reasonably be described as "a DPM" or "a dcache"
> - even conceptually, can you see such a thing as being meaningful? If
> so, how would you define it? You use "dcache-like" above, but what
> does that mean (in terms of external interfaces)?


I think the problem with Type is in deciding the use-case for querying
it.  When would a client query Type rather than, say, Manager.ProductName?
AFAIK, we don't have concrete examples where this information is useful.

In terms of "dCache-like", there are any number of behavioural
characteristics that distinguish dCache from DPM; for example, hot-spot
detection and mitigation, overload protection, the ability to stage files
from tape, ...  A client may adjust its behaviour if it detects that the
storage system is "dCache-like" (or if it isn't dCache-like).

As you point out, this could be discovered through Manager.ProductName, 
so it goes back to the above point: what are the use-cases for querying 
StorageService.Type?
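
To make the question concrete, here is a minimal sketch of what a client
could already do today with Manager.ProductName alone.  The function name,
the dictionary layout and the published value 'dCache' are my assumptions
for illustration, not something taken from GLUE or from any client.

```python
def looks_dcache_like(manager_attrs):
    """Guess whether a storage service is dCache from its Manager attributes.

    `manager_attrs` is a hypothetical dict of published attributes,
    e.g. {"ProductName": "dCache"}.  The exact published string is an
    assumption, so the comparison ignores case and surrounding whitespace.
    """
    name = manager_attrs.get("ProductName", "")
    return name.strip().lower() == "dcache"


# Hypothetical published records for two storage services.
services = [
    {"ProductName": "dCache"},
    {"ProductName": "DPM"},
]
flags = [looks_dcache_like(s) for s in services]
```

If this is all a client needs, then StorageService.Type adds nothing; a
use-case for Type would have to be one that this kind of check cannot serve.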


>> For the xrootd protocol, dCache currently publishes
>>
>> Endpoint.URL: xroot://xrootd-door.example.org/
>> EndpointInterface.Name: xroot and StorageAccessProtocol.Type:
>> xrootd
>
> What protocol name do you recognise in e.g. a getTURL operation to
> return an xroot TURL?

Currently it's 'root://'

> Does it match what DPM and StoRM use?

I couldn't say: you would need to ask DPM and StoRM people.

> What about webdav?

dCache SRM will return a TURL that starts 'http://' or 'https://'.

>> For WebDAV, dCache is currently publishing as either 'http' or
>> 'https', depending on whether SSL/TLS tunnelling is enabled or
>> not.
>
> Bear in mind that the scheme name in the URL is not the same as the
> InterfaceName. I don't know a lot about webdav but my impression is
> that it's far from being identical with http as far as file access
> goes, so I would expect a different InterfaceName even if the URL is
> https:// (c.f. SRM vs. httpg://).

I'm pretty sure that, for uploading and downloading data, the HTTP and 
WebDAV requests *are* identical.

WebDAV is about adding the "missing file-system ideas", like the concept 
of directories.

>> When publishing an Endpoint object that describes an HTTP or a
>> WebDAV endpoint with unencrypted access then the URL SHOULD start
>> 'http://' and the InterfaceName SHOULD be 'http'.  If the endpoint
>> is encrypted then the URL SHOULD start 'https://' and the
>> InterfaceName SHOULD be 'https'. If the endpoint supports WebDAV
>> then a SupportedProfile of 'http://webdav.org/' SHOULD be
>> published.
>
> If it's necessary to make that distinction I think I would prefer to
> publish both http and webdav endpoints; doing it your way would seem
> likely to be error-prone.

Yes, but please bear in mind that there are many extensions that build 
on top of HTTP and that an endpoint may support many (into 
double-digits) of them concurrently.  Publishing an endpoint for each 
results in (excessive?) duplication.

While publishing multiple endpoints is possible, I was hoping we could 
come up with something better.
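
For reference, the rule I proposed (quoted above) can be sketched in a few
lines.  This is only an illustration: the function name, the dictionary
layout, the `extra_profiles` parameter and the example host name are
assumptions of mine; only the 'http'/'https' InterfaceName values and the
'http://webdav.org/' profile come from the proposed text.

```python
from urllib.parse import urlsplit

# Profile URI taken from the proposed text.
WEBDAV_PROFILE = "http://webdav.org/"


def describe_http_endpoint(url, supports_webdav=False, extra_profiles=()):
    """Build Endpoint attributes for an HTTP(S) door as one endpoint.

    The InterfaceName follows the URL scheme; WebDAV and any further
    HTTP extensions are listed as SupportedProfile values instead of
    being published as separate endpoints.
    """
    scheme = urlsplit(url).scheme.lower()
    if scheme not in ("http", "https"):
        raise ValueError(f"not an HTTP or HTTPS URL: {url!r}")

    profiles = list(extra_profiles)
    if supports_webdav:
        profiles.insert(0, WEBDAV_PROFILE)

    return {
        "URL": url,
        "InterfaceName": scheme,
        "SupportedProfile": profiles,
    }


# Hypothetical encrypted WebDAV door.
endpoint = describe_http_endpoint(
    "https://webdav-door.example.org/", supports_webdav=True
)
```

With this shape an endpoint that supports a dozen HTTP extensions is still
one Endpoint object with a dozen SupportedProfile values, rather than a
dozen near-identical Endpoint objects.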

Cheers,

Paul.


