Re: I'm a total push-over..

Jeremy Maitin-Shepard <jbms@xxxxxxx> · Fri, 25 Jan 2008 13:19:15 -0500

Johannes Schindelin <Johannes.Schindelin@xxxxxx> writes:

> On Fri, 25 Jan 2008, Jeremy Maitin-Shepard wrote:

>> But since multiple hash functions will be needed anyway to support 
>> different notions of case-insensitivity, if the warning is not enabled, 
>> there is no reason to use a case-insensitive hash function with a 
>> byte-exact comparison.

> No, only multiple compare functions will be needed.  The hash function can 
> be built in such a manner that it guarantees that file names being equal 
> with _any_ of the compare functions fall into the same bucket.

In theory, I agree that this is possible, but in practice it may not be
reasonable at all.  Consider two possible comparison functions:

1. compare file names as strings case-insensitively, assuming a Latin-1
encoding

2. compare file names as strings case-insensitively, assuming a UTF-8
encoding

Actually writing a hash function such that two strings hash to the same
value if either of these comparison functions says that the strings are
equal would appear to be rather difficult.
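
To make the difficulty concrete, here is a rough Python sketch (not git
code; the names eq_latin1 and eq_utf8 are made up for illustration, and
str.lower() is only a stand-in for whatever case folding a real
implementation would use).  It assumes that names which are not valid
UTF-8 fall back to a byte-exact comparison.  It shows three byte strings
that a shared hash function would have to put in the same bucket, even
though neither comparison function considers all three equal:

```python
def eq_latin1(a: bytes, b: bytes) -> bool:
    # Comparison 1: case-insensitive, interpreting the bytes as Latin-1.
    return a.decode("latin-1").lower() == b.decode("latin-1").lower()


def eq_utf8(a: bytes, b: bytes) -> bool:
    # Comparison 2: case-insensitive, interpreting the bytes as UTF-8.
    # Assumption: invalid UTF-8 falls back to a byte-exact comparison.
    try:
        return a.decode("utf-8").lower() == b.decode("utf-8").lower()
    except UnicodeDecodeError:
        return a == b


a = b"\xc3\x89"  # UTF-8: U+00C9 (E acute, upper); Latin-1: U+00C3 U+0089
b = b"\xc3\xa9"  # UTF-8: U+00E9 (e acute, lower); Latin-1: U+00C3 U+00A9
c = b"\xe3\x89"  # invalid UTF-8;                  Latin-1: U+00E3 U+0089

print(eq_utf8(a, b), eq_latin1(a, b))  # True False
print(eq_latin1(a, c), eq_utf8(a, c))  # True False
```

Since a must share a bucket with b (UTF-8 rule) and with c (Latin-1
rule), b and c end up in the same bucket as well, although neither rule
says they are equal.  The hash has to respect the transitive closure of
both relations, which is where it gets complicated.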

> The upside of such a hash function: less code to maintain.

A simple hash function that doesn't try to do anything regarding
case-insensitivity is extremely short and simple and therefore is hardly
a maintenance burden.

Although it is possible in some cases to "share" a hash function, doing
so makes little sense except for the "warning" purpose.  Using the
"case-insensitive" hash function when you intend to use an "exact"
comparison function just amounts to using a hash function that is
unequivocally worse: it is slower, more complicated, and has a higher
collision rate.
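
As an illustration only, here is a small Python sketch of the two kinds
of hash function.  It uses 32-bit FNV-1a as an example of a simple
byte-exact hash; that choice is an assumption for the sketch, not the
hash git uses.  The names exact_hash and folded_hash are hypothetical,
and the folding shown is ASCII-only:

```python
FNV_OFFSET = 2166136261  # 32-bit FNV-1a offset basis
FNV_PRIME = 16777619     # 32-bit FNV prime


def exact_hash(name: bytes) -> int:
    # Byte-exact hash: one XOR and one multiply per byte, nothing else.
    h = FNV_OFFSET
    for byte in name:
        h = ((h ^ byte) * FNV_PRIME) & 0xFFFFFFFF
    return h


def folded_hash(name: bytes) -> int:
    # "Case-insensitive" hash: an extra folding pass before hashing
    # (ASCII-only here; real Latin-1 or UTF-8 folding would be more work).
    return exact_hash(name.lower())


names = [b"Makefile", b"makefile"]
print([exact_hash(n) for n in names])   # two different values
print([folded_hash(n) for n in names])  # the same value twice
```

With a byte-exact comparison, "Makefile" and "makefile" are different
names, so the folded hash forces a collision between them that the
exact hash avoids, and it does extra work per name to get there.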

-- 
Jeremy Maitin-Shepard