[SAGA-RG] Fwd (andre at merzky.net): Re: More confusion

Andre Merzky andre at merzky.net
Wed Nov 21 14:15:56 CST 2007


Hi all, 

Ceriel came upon an ugly problem with the current spec:  the
wildcards used in the namespace package collide with the
introduction of URLs, as several characters used for
wildcards lead to URLs that are not well-formed.  That
problem was not present back when we used strings instead
of the saga::url class.

Below is an email exchange describing the problem with
examples.  Opinions on how to solve that _nicely_ are very
welcome.

Thanks, Andre.


----- Forwarded message from Andre Merzky <andre at merzky.net> -----

> Quoting [Ceriel Jacobs] (Nov 20 2007):
> > 
> > Ceriel Jacobs wrote:
> >
> >> Ceriel Jacobs wrote:
> >>
> >>> Hi,
> >>> 
> >>> I am now looking at wildcard expansion, and am totally
> >>> confused as to where/when that should take place. The
> >>> ns_directory methods copy(), move(), link() and remove()
> >>> seem reasonable targets, but they all take a URL
> >>> parameter.  This is sort of OK with the '*' wildcard,
> >>> but using any of ?, [, ], {, } results in an invalid
> >>> URL. For instance,
> >>> ftp://ftp.cs.vu.nl/pub/ceriel/LLgen.?ar.gz is not a
> >>> valid URL.
> 
> Ah!  Well, the wildcard spec still assumes that the
> parameters are strings, not URLs :-(
> 
> 
> >> Ahum, this IS a valid URL, with a query part. Anyway, not
> >> what is intended.
> 
> Right.
> 
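To make the point above concrete, here is a minimal sketch of how a generic URL parser sees the example. It uses Python's standard urllib.parse rather than saga::url, so it only illustrates the generic URL syntax; the helper name split_path_and_query is made up for this illustration.

```python
from urllib.parse import urlsplit


def split_path_and_query(url: str) -> tuple[str, str]:
    """Return the (path, query) components a generic URL parser sees."""
    parts = urlsplit(url)
    return parts.path, parts.query


# The '?' intended as a single-character wildcard is read as the
# start of the query part, so the path is cut short.
path, query = split_path_and_query("ftp://ftp.cs.vu.nl/pub/ceriel/LLgen.?ar.gz")
print(path)   # /pub/ceriel/LLgen.
print(query)  # ar.gz
```

So the string is accepted, but the wildcard has silently turned into a query delimiter.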
> 
> > OK, it can be done with %-escapes: I now get a match for
> > 
> > ftp://ftp.cs.vu.nl/pub/ceriel/LLgen.%7Btar%2Cnoot%7D.gz
> 
> Yes, I guess that works, but that leaves the effort to write
> escaped characters to the end user -- probably not what we
> want.
> 
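As a rough illustration of what the %-escaping asks of the user, the following sketch builds the escaped URL from an unescaped wildcard pattern. It uses Python's urllib.parse.quote, not anything from the SAGA API, and the helper name escape_pattern is hypothetical.

```python
from urllib.parse import quote, unquote


def escape_pattern(base_url: str, pattern: str) -> str:
    """Append a wildcard pattern to base_url, %-escaping every reserved character."""
    # safe="" forces characters such as '{', ',' and '}' to be escaped too;
    # letters, digits and '_.-~' are never escaped by quote().
    return base_url + quote(pattern, safe="")


escaped = escape_pattern("ftp://ftp.cs.vu.nl/pub/ceriel/", "LLgen.{tar,noot}.gz")
print(escaped)  # ftp://ftp.cs.vu.nl/pub/ceriel/LLgen.%7Btar%2Cnoot%7D.gz

# The implementation has to undo the escaping before it can expand the pattern.
print(unquote("LLgen.%7Btar%2Cnoot%7D.gz"))  # LLgen.{tar,noot}.gz
```

Note that after unescaping, a '{' that was meant as a wildcard and a '{' that was meant literally look the same, which is the ambiguity discussed next.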
> So, the problem really is to distinguish between characters
> which the user added to describe wildcards, and characters
> the user added to describe legitimate URL parts.  That's
> impossible I'm afraid :-(
> 
> Ugh.
> 
> Just dumping random thoughts from here:
> 
> One option would be to forbid query parts etc on URLs.  But
> that would be a severe limitation.
> 
> Another option would be to 'mark' wildcard characters, e.g.
> to escape them with '\':
> 
>   ftp://ftp.cs.vu.nl/pub/ceriel/LLgen.\?ar.gz
> 
> and to transform it internally into one or more valid URLs.
> That would imply that the saga::url class would need to be
> aware of the escaping (so you cannot use a native URL
> class), and the user still has to do some work.
> 
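The backslash-marking idea could look roughly like the sketch below. It is only an assumption about how such marking might be interpreted, not something the spec defines: a '\' in front of '*' or '?' makes the character an active wildcard, while unmarked '*', '?' and '[' are treated as literal text. It works on a single name (e.g. the last path element) using Python's fnmatch, ignores '[...]' and '{...}' wildcards, and the function names are invented for the example.

```python
from fnmatch import fnmatchcase

MARKABLE = "*?"  # wildcards this sketch understands when preceded by '\'


def to_glob(marked: str) -> str:
    """Turn a backslash-marked name into an fnmatch pattern."""
    out = []
    i = 0
    while i < len(marked):
        ch = marked[i]
        if ch == "\\" and i + 1 < len(marked) and marked[i + 1] in MARKABLE:
            out.append(marked[i + 1])    # marked: keep as active wildcard
            i += 2
        elif ch in "*?[":
            out.append("[" + ch + "]")   # unmarked: match the character literally
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def match_names(marked: str, names: list[str]) -> list[str]:
    """Return the entries of names that match the marked pattern."""
    pattern = to_glob(marked)
    return [name for name in names if fnmatchcase(name, pattern)]


listing = ["LLgen.tar.gz", "LLgen.rar.gz", "LLgen.?ar.gz", "README"]
print(match_names(r"LLgen.\?ar.gz", listing))  # all three LLgen.* entries
print(match_names("LLgen.?ar.gz", listing))    # only the literal 'LLgen.?ar.gz'
```

Each matching name would then be turned into its own valid URL, which is where the URL class has to know about the marking.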
> Another option is to revert to strings.  Which removes your
> parsing error, but does not solve the problem semantically -
> at some point, the string needs to be converted into URLs.
> 
> And yet another option is not to use wildcards in the spec
> - which is not really an option at this stage I guess, and
> would be a pity as well.
> 
> 
> -----
> 
> So, I do not have a good answer at the moment, and will
> ponder some more on this.  Do you mind if I forward the
> question to Hartmut/Ole and to the list?
> 
> Cheers, Andre.
-- 
No trees were destroyed in the sending of this message, however,
a significant number of electrons were terribly inconvenienced.

