[SAGA-RG] Fwd (andre at merzky.net): Re: More confusion

Andre Merzky andre at merzky.net
Tue Nov 27 00:45:19 CST 2007


Hi, 

Quoting [Thilo Kielmann] (Nov 26 2007):
> 
> All,
> 
> I did a global search for "wildcard" in the SAGA core spec.
> The result is that we are having three places using wildcards:
> 
> 1. attributes
> 2. logical directory (using both attribute and path wildcards)
> 3. namespace.directory, using path wildcards.
> 
> 
> Attribute wildcards don't pose a problem (at least to me, or until
> Ceriel will find one ;-)

Attributes will stay strings for the time being (i.e. until
we introduce properly typed attributes).  So, wildcards
should not be a problem for the moment.


> The path wildcards from namespace.directory, however, do bring a problem,
> in combination with URLs.
> 
> If I remember correctly, we switched from strings to URLs for a good reason.

Yes, one being to enforce parsing on the strings - which is
exactly where it bites us now :-P
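
To illustrate where parsing bites, here is a small sketch. It uses
Python's standard urllib.parse as a stand-in for a generic URL parser
(an assumption: this is not the SAGA URL class), and the host and
paths are made up. '*' and '[a-z]' stay inside the path component,
while '?' is taken as the start of the query part:

```python
from urllib.parse import urlsplit

# Hypothetical URLs, for illustration only.
star = urlsplit("file://host.example/data/*.bin")
brackets = urlsplit("file://host.example/data/data_[a-z].bin")
question = urlsplit("file://host.example/data/image.?pg")

print(star.path)      # '*' stays part of the path
print(brackets.path)  # so does '[a-z]'
print(question.path, question.query)  # '?' splits path from query
```

So a pattern such as image.?pg no longer arrives as a single path
once it has gone through a parser of this kind.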


> URLs, however, do not allow for wildcards, according to RFC1738.

Well, RFC1738 actually refers to wildcards explicitly, e.g. in
3.6. NEWS:

    If <newsgroup-name> is "*" (as in <URL:news:*>), it is
    used to refer to "all available news groups".

And here are two other options actually for dealing with
wildcards:

  - allow only *, not the full-blown shell wildcards

  - or use different characters for wildcards, e.g.

    data_[a-z].bin -> data_((a-z)).bin
    image.?pg      -> image01.#pg

I would find the second one slightly confusing, but an
option it is.


> And the here mentioned query parts of URLs are for http only, and not for
> files as we would need them here.

Well, http URLs can refer to files...


> If we define some "URL with wildcards" that would no longer be URLs, so this
> is no way to go.
> 
> 
> What do we want/need wildcards for?
> The core spec writes about "shell wildcards", so we want to apply a single
> operation to several namespace entries at a time.
> (e.g.: move, copy, find,...)

Right.


> This reminds me of bulk operations with SAGA tasks. But this also feels like
> "overkill" for the use case of file wildcards.

Well, it seemed sensible and easy back then when we had
strings.  Actually, wildcards are just an API optimization,
right?  It can always be done in user space... (ls + loop + filter).
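
A minimal sketch of the "ls + loop + filter" idea in user space,
assuming Python's fnmatch module for shell-style matching.  The
listing function and the file names are hypothetical stand-ins for a
real directory listing:

```python
from fnmatch import fnmatchcase


def list_entries():
    # Stand-in for a directory listing ("ls"); the names are made up.
    return ["data_a.bin", "data_b.bin", "data_1.bin", "image.jpg", "notes.txt"]


def matching_entries(pattern):
    # Loop over the listing and keep only the names matching the pattern.
    matches = []
    for name in list_entries():
        if fnmatchcase(name, pattern):
            matches.append(name)
    return matches


for name in matching_entries("data_[a-z].bin"):
    print("would copy", name)
```

The application pays for one listing plus a local filter, which is the
same work an implementation would do if the backend had no wildcard
support.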

So we thought that wildcards can, in the worst case (e.g. if
not supported by the backend), be provided by the
implementation, with no penalty if compared to application
level code.

Yes, we can use the bulk mechanism, but that puts the burden
of wildcard expansion back into user code.  Unless one
provides the expand method of course ;-)


> My suggestion is thus to follow Ceriel (version 2):
> 
> On Thu, Nov 22, 2007 at 11:12:58AM +0100, Ceriel Jacobs wrote:
> > 
> > Another approach would be to have an explicit method to do wildcard expansion.
> > For instance, in namespace.ns_directory:
> > 
> >        expand        (in string pattern,
> >                       out array<saga::url> urls);
> > 
> > Here, the pattern only specifies the "path" part, but with wildcards (the directory
> > implicitly specifies the rest of the url). I am not sure whether the resulting urls
> > should be resolved with respect to the directory or not. I think not.
> 
> I think we need to spend some good thoughts on getting the parameters to this
> call right (do we need a pattern to compose the URLs from the expanded 
> patterns???)

I am not sure what the last sentence means :-(  The returned
items _are_ URLs, so what other URLs do you want to compose?
Sorry for being thick...


> Besides this "expand" method, we would have to change the relevant
> namespace.directory methods to accept arrays of URLs instead of individual
> URLs.

That makes coding slightly awkward if you want to copy a
single file, as you'd need to create an array for that
single URL.  So we would need two calls (one with array, one
without) which would again bloat the API.  So I'd rather
vote for requiring the code to loop over the entries and to
use the normal (singular) calls...

> The other radical approach could be: remove file name wildcards altogether...
> 
> 
> More thoughts?
> 
> 
> Thilo

My favourite at the moment: 

  - allow * as wildcard in URLs
  - for all other wildcards ([a-z], ?, {one,two,three}) use
    expand(), and require user level loops over the result.
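
A rough sketch of how that could look from the application side.
Everything here is an assumption for illustration: expand() and copy()
are hypothetical Python stand-ins, not the SAGA API, the entry list
replaces a real backend listing, and the URLs are made up.  The sketch
returns URLs resolved against the directory, which is one of the two
options Ceriel mentions.  Note that fnmatch covers ?, * and [a-z] but
not {one,two,three}:

```python
from fnmatch import fnmatchcase
from urllib.parse import urljoin


def expand(directory_url, entries, pattern):
    # Hypothetical helper: 'entries' stands in for the directory listing
    # that a real implementation would obtain from the backend.
    return [urljoin(directory_url, name)
            for name in entries if fnmatchcase(name, pattern)]


def copy(source_url, target_url):
    # Stand-in for a singular copy call; it only reports what it would do.
    return f"copy {source_url} -> {target_url}"


directory = "file://host.example/data/"
entries = ["image.jpg", "image.png", "image.jpeg"]

# User-level loop over the expanded URLs, one singular call per entry.
log = [copy(url, "file://host.example/backup/")
       for url in expand(directory, entries, "image.?pg")]
print(log)
```

The copy call itself stays singular, so no array variants of the
namespace methods are needed.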

Cheers, Andre.

-- 
No trees were destroyed in the sending of this message, however,
a significant number of electrons were terribly inconvenienced.

