Multi-homed hosts are treated specially: all available IP addresses returned from DNS are stored in the cache. Every time a request is made to the host, the time to connect is measured and a weight function is calculated to indicate how fast the IP address was. The weight function takes a parameter that sets the sensitivity of the function, together with the measured connect time. If an IP address is not reachable, a penalty of x seconds is added to its weight, where the penalty is a function of the error returned from the "connect" call. The next time a request is initiated to the remote host, the IP address with the smallest weight is used.
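The selection scheme for multi-homed hosts might be sketched roughly as follows. This is only an illustration, not the Library's code: the exact weight function is not reproduced here, so the sketch assumes an exponentially smoothed average of connect times as a stand-in. The names `HostEntry`, `ALPHA`, `PENALTY` and `DEFAULT_PENALTY`, and all penalty values, are hypothetical.

```python
import errno
import math

# Assumption: sensitivity constant; a larger ALPHA lets recent connect times dominate.
ALPHA = 0.5
# Assumption: hypothetical penalties in seconds, keyed by the errno from connect().
PENALTY = {
    errno.ECONNREFUSED: 5.0,
    errno.ETIMEDOUT: 30.0,
    errno.EHOSTUNREACH: 60.0,
}
DEFAULT_PENALTY = 10.0


class HostEntry:
    """Cache entry holding every address DNS returned for one host."""

    def __init__(self, addresses):
        # All addresses start with the same weight, so the first one is tried first.
        self.weights = {addr: 0.0 for addr in addresses}

    def best(self):
        """Return the address with the smallest weight."""
        return min(self.weights, key=self.weights.get)

    def record_success(self, addr, connect_time):
        # Stand-in smoothing (an assumption): blend the old weight with the new time.
        decay = math.exp(-ALPHA)
        self.weights[addr] = self.weights[addr] * decay + (1.0 - decay) * connect_time

    def record_failure(self, addr, err):
        # An unreachable address is penalized according to the connect() error.
        self.weights[addr] += PENALTY.get(err, DEFAULT_PENALTY)
```

A caller would ask `best()` for an address before connecting, then report the outcome with `record_success` or `record_failure`, so that a slow or unreachable address gradually loses out to a faster one.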
A problem with both the host name cache and the data object cache is detecting when two URLs are equivalent. The only way this can be done internally in the Library is to canonicalize the URLs before they are compared. For some time this has been done by examining the path segment of each URL and removing redundant information, so that URLs like the following are recognized as equivalent:
foo/./bar/ = foo/redundant/../bar/ = foo/bar/
The method has been optimized and extended so that host names are canonicalized as well. Hence the following URLs are all recognized to be identical:
http://www/ = http://www.w3.org:80/ = http://Www.W3.Org/ = http://www.w3.org./ = http://www.w3.org/
However, the canonicalization does not recognize alias host names, since that would require the alias information to be stored in the cache. To do this, a separate resolver library must be provided, as this information is normally not returned by the default resolver libraries. These libraries also do not support non-blocking sockets, and hence a delay cannot be avoided when resolving a host name. The solution is to write a resolver library that handles these features; this is under consideration.
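The path and host name canonicalization described above can be sketched as follows. This is a minimal illustration built on Python's standard library, not the Library's own routine. The function name `canonicalize` and its `local_domain` parameter (used to qualify a bare host name such as `www`) are assumptions made for the example, and only the default HTTP port is handled. As the text notes, alias host names are not resolved.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit

DEFAULT_PORTS = {"http": 80}  # assumption: only HTTP is handled in this sketch


def canonicalize(url, local_domain="w3.org"):
    """Return a canonical form of url, suitable as a cache comparison key.

    local_domain is a hypothetical parameter used to qualify bare host names.
    """
    parts = urlsplit(url)
    scheme = parts.scheme.lower()

    # urlsplit().hostname is already lower-cased; drop a trailing root dot.
    host = (parts.hostname or "").rstrip(".")
    # Qualify a bare name like "www" (IPv6 literals contain ":" and are left alone).
    if host and "." not in host and ":" not in host:
        host = host + "." + local_domain

    # Omit the port when it is the default for the scheme.
    port = parts.port
    if port is None or port == DEFAULT_PORTS.get(scheme):
        netloc = host
    else:
        netloc = f"{host}:{port}"

    # Collapse "." and ".." segments, keeping a trailing slash if one was present.
    path = parts.path or "/"
    clean = posixpath.normpath(path)
    if path.endswith("/") and not clean.endswith("/"):
        clean += "/"

    return urlunsplit((scheme, netloc, clean, parts.query, parts.fragment))
```

Two URLs can then be compared by comparing their canonical forms, for example `canonicalize(a) == canonicalize(b)`.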