Hi Mukund, Thanks for your comments below. I agree that some adjustments to the wording in the draft would be beneficial. Thanks, Donald =============================== Donald E. Eastlake 3rd +1-508-333-2270 (cell) 2386 Panoramic Circle, Apopka, FL 32703 USA d3e3e3@xxxxxxxxx On Tue, Oct 15, 2024 at 1:46 AM Mukund Sivaraman <muks@xxxxxxxxxx> wrote: > > Dear Mr. Eastlake, > > My minor review of the current draft: > > [snip] > > > The FNV Non-Cryptographic Hash Algorithm > > draft-eastlake-fnv-29 > > > Abstract > > > FNV (Fowler/Noll/Vo) is a fast, non-cryptographic hash algorithm with > > good dispersion that is referenced in a number of standards documents > > [snip] > > > 1. Introduction > > > FNV hashes are designed to be fast while maintaining a low collision > > rate. Their exceptional dispersion makes them particularly well- > > suited for hashing nearly identical strings, including URLs, > > hostnames, filenames, text, and IP addresses. Their speed allows one > > to quickly hash lots of data while maintaining a reasonably low > > collision rate. > > In the above, marketing words are used to describe this function such as > "good" and "exceptional" dispersion, "low collision rate," etc. But > there doesn't appear to be any analysis presented to justify these, > either in the draft or in the references. > > As this is the draft about FNV that is seeking to become an RFC, if a > reader in the future wonders "Why are the authors saying these things?" > he would want to read an analysis. I suggest including results of > fitness tests that justify the words "exceptional dispersion." > > [snip] > > > 7.1. Why is FNV Non-Cryptographic? > > > A full discussion of cryptographic hash requirements and strength is > > beyond the scope of this document. However, here are three > > characteristics of FNV that would generally be considered to make it > > non-cryptographic: > > > 1. Sticky State - A cryptographic hash should not have a state in > > which it can stick for a plausible input pattern. But, in the > > very unlikely event that the FNV hash variable becomes zero and > > the input is a sequence of zeros, the hash variable will remain > > at zero until there is a non-zero input byte and the final hash > > value will be unaffected by the length of that sequence of zero > > input bytes. Of course, for the common case of fixed length > > input, this would usually not be significant because the number > > of non-zero bytes would vary inversely with the number of zero > > bytes and for some types of input, runs of zeros do not occur. > > Furthermore, the use of a different offset_basis or the inclusion > > of even a little unpredictable input may be sufficient to stop an > > adversary from inducing a zero hash variable. > > I suggest including pseudo-code on how inducing a zero hash state can be > prevented by including unpredictable input on a per-input basis. Is it > the same unpredicable input to be added for all inputs, or does it have > to be changed per input? The above text appears to treat it casually, > i.e., less than rigorously. > > [snip] > > > 7.2. Inducing Collisions > > > While use of a cryptographic hash should be considered when active > > adversaries are a factor, the following attack can be made much more > > difficult with very minor changes in the use of FNV: > > > If FNV is being used in a known way for hash tables in a network > > server or the like, for example some part of a web server, an > > adversary could send requests calculated to cause hash table > > collisions and induce substantial processing delays. As mentioned in > > Section 2.2, use of an offset_basis not known by the adversary will > > substantially reduce or eliminate this problem. > > This may not be the only problem. > > If a service has a collection (e.g., a cache) that uses a hashtable > internally with 1<<N buckets, it may provide a way to inspect the > collection such as by allowing it to be dumped or browsed. The > implementation may not sort the collection when dumping or presenting > from a hashtable, and it would be naively dumped in the order it occurs > in the hash table. > > If such data can be inspected by an adversary, it may be possible to > recover the 32-bit or 64-bit offset basis from using the order of keys > in such a dump. > > For general purpose hashtable implementations (and in an off-the-shelf > library that a programmer would use as a black-box) where an adversary > can choose the keys that are inserted and later searched, this hash > function has weaknesses that the programmer would have to consider and > workaround, and for that reason it's generally unsuitable for use in a > hashtable esp. within a public facing network service. > > I suggest covering the above in the section. > > This draft's abstract states that its purpose is documenting this > function, for which it is fine. > > Even for non-cryptographic use I suggest noting what it is not good for > in "1.1 FNV Hash Uses." > > The phrase "The FNV hash is widely used" comes across as typical > marketing text to sell something. When one reads it, one may think "my > peers use it, so it should be a good choice," i.e., its popularity > influences the reader aside from its qualities. Popularity is no > indication of quality, as can be observed in some things in the world > today. The statement may not remain true in the future. I understand the > authors did not add it as marketing text. It's best to leave such text > out from an internet draft. > > Mukund -- last-call mailing list -- last-call@xxxxxxxx To unsubscribe send an email to last-call-leave@xxxxxxxx