[SAGA-RG] URL class in SAGA?

Jon MacLaren maclaren at cct.lsu.edu
Thu Aug 30 05:51:13 CDT 2007


But there are problems with the Java URL Class - it has some very odd  
behavior.

When you create a URL, there is an attempt to look up the host name.
Two URL objects created from the identical address, e.g.

	https://scoop.cct.lsu.edu:9877/ocracoke

...but at different times (e.g. once when the host could not be
looked up, once when it could) can produce different hash values.  So
if you use these objects as keys in a HashMap, lookups on those keys
can sometimes fail.

Worse, when invoking equals on two URLs, the host names are resolved
again.  If your DNS server is down, the comparison can hang - this
depends on the underlying OS's behavior.  On Linux this used to hang
indefinitely until the server came back!
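
A minimal sketch of one way to sidestep both problems, assuming that
java.net.URI is an acceptable substitute (nobody in this thread
proposed it).  The JDK documentation describes URL.equals() as a
blocking operation because it resolves host names; URI.equals() and
URI.hashCode() work on the parsed components only and never touch
DNS.  The class and method names below are hypothetical.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class: uses java.net.URI (not java.net.URL) as a map key.
final class UriKeyDemo {

    // Builds a small registry keyed by URI. URI hashing and equality are
    // computed from the parsed components, with no host name resolution.
    static Map<URI, String> registry() {
        Map<URI, String> services = new HashMap<>();
        services.put(URI.create("https://scoop.cct.lsu.edu:9877/ocracoke"),
                     "ocracoke service");
        return services;
    }

    public static void main(String[] args) {
        // A second URI built from the same address is an equal key,
        // whether or not the host can currently be resolved.
        URI again = URI.create("https://scoop.cct.lsu.edu:9877/ocracoke");
        System.out.println(registry().get(again)); // prints: ocracoke service
    }
}
```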

I've discovered these features over a period of a couple of years.   
They are horrible.

You should seriously consider using strings in an underlying Java  
implementation - maybe creating a URL object just to do the format  
checking, then throwing it away.
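
The suggestion above could be sketched roughly as follows.  This is
only an illustration: the class and method names are made up, and it
assumes that the URL(String) constructor only parses its argument
(it is deprecated in recent JDKs but still available).

```java
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical helper: keeps addresses as plain strings and uses
// java.net.URL only as a throwaway format checker.
final class UrlStrings {

    // Returns the address unchanged if java.net.URL can parse it,
    // otherwise throws IllegalArgumentException.
    @SuppressWarnings("deprecation") // URL(String) is deprecated in newer JDKs
    static String checked(String address) {
        try {
            new URL(address); // parse only; the object is discarded
            return address;
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Not a valid URL: " + address, e);
        }
    }

    public static void main(String[] args) {
        // The validated string, not a URL object, is what gets stored,
        // so String.equals() and String.hashCode() decide key behavior.
        String key = checked("https://scoop.cct.lsu.edu:9877/ocracoke");
        System.out.println(key);
    }
}
```

Strings compare character by character, so two spellings of the same
address (different case in the host name, say) will be distinct keys
unless they are normalized first.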

Another nice feature is that Java, by default, will cache DNS
lookups forever, i.e. until the application finishes.  If this is a
Tomcat container, then you won't see legitimate DNS changes until the  
container is restarted.  That could be months at a time.  To change  
that, you have to edit java.security - which is a machine-wide change  
for all applications.

Nice.
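
A possible per-application alternative to editing java.security,
offered as a sketch rather than a tested recipe: the same
networkaddress.cache.ttl security property can be set from code via
java.security.Security.setProperty().  The assumptions here are that
the call runs before the JVM performs its first host lookup, and that
no security manager forbids it.  The class and method names are
hypothetical.

```java
import java.security.Security;

// Hypothetical helper: overrides the DNS cache lifetime for this JVM only.
final class DnsCachePolicy {

    // Sets how long successful lookups are cached, in seconds.
    // Assumed to be called early, before any host name is resolved,
    // since the JVM may read the property only once.
    static void limitPositiveTtl(int seconds) {
        Security.setProperty("networkaddress.cache.ttl",
                             Integer.toString(seconds));
    }

    public static void main(String[] args) {
        limitPositiveTtl(60);
        System.out.println(Security.getProperty("networkaddress.cache.ttl"));
    }
}
```

In a shared container such as Tomcat this still affects every web
application in that JVM, just not other processes on the machine.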

Jon.


On Aug 29, 2007, at 7:41 PM, Andre Merzky wrote:

> I seem to remember that the topic of a URL class was, once
> upon a time, discussed by the group, but I can't find any
> notes or mails about it.
>
> From what I remember, the consensus was that several
> languages do have a native URL/URI class (e.g. Java), and
> that the language bindings should then allow these classes
> to be used instead of, or in addition to, strings.
>
> However, Ole and Hartmut argued that the C++ bindings should
> have a URL class, too, as otherwise the task of generating
> and parsing the URLs would be left as a tedious exercise to
> the end user.
>
> Now, if we are going to specify URL classes for basically
> all bindings, then we may as well consider using a simple
> URL class in the spec.  OTOH, that might be overkill, and
> the bindings may be just the right place to deal with that
> topic.
>
> As I am blessed with a very poor memory, I really can't
> remember if that was discussed already, or if I, for example,
> was opposed to a URL class myself :-P  Anyway, I'd like to
> raise the question (again), and would love to hear your
> opinions.
>
> Thanks, Andre.
>
>
> PS.: _if_ we were to include a URL class in the spec, I'd
> insist on keeping it as simple and small as possible.  No
> need for fanciness here - setters and getters for the URL
> elements are basically all we need, right?
>
> -- 
> "XML is like violence: if it does not help, use more."
>
> --
>   saga-rg mailing list
>   saga-rg at ogf.org
>   http://www.ogf.org/mailman/listinfo/saga-rg
>


