Eric,

On 25.04.2017 at 22:58, Eric Biggers wrote:
> On Tue, Apr 25, 2017 at 09:22:16PM +0200, Richard Weinberger wrote:
>> Eric,
>>
>> On 25.04.2017 at 19:46, Eric Biggers wrote:
>>>> Sorry if this is a stupid question, but why do you have to compare hashes _and_
>>>> the last few bytes of the bigname?
>>>> A lookup via bigname gives you two 32-bit hash values, and there I'd assume that
>>>> this is sufficient for a collision-free lookup. Especially since a
>>>> resumed readdir()
>>>> with a 64-bit cookie has to work too on your filesystem.
>>>>
>>>
>>> Well, the problem is that hashes may not be sufficient to uniquely identify a
>>> name in all cases. f2fs uses only a 32-bit hash so it's trivial to create
>>> collisions on it, as I demonstrated. Even collisions of two 32-bit hashes, as
>>> used by ext4 and ubifs, are possible. And ext4 currently doesn't even compare
>>> the hashes during directory searches, beyond using them to find the correct
>>> directory block, since the hashes aren't stored in the directory entries.
>>
>> I agree that finding a collision in a 32-bit hash is easy, but for 64 bits it
>> is *much* harder.
>
> That's true for accidental collisions, but malicious users might create
> intentional collisions. In the case of UBIFS it looks like the first 32 bits of
> the cookie depend solely on the filename via key_r5_hash(), while the
> second 32 bits is random. So I imagine a collision in the full 64 bits could be
> generated by precomputing on average about 65536 filenames which collide in
> key_r5_hash(), then creating them all in the same directory.

Correct.
As I said, I'll think of a way to check the remaining bytes in the bigname case.

Thanks,
//richard
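
For reference, a rough sketch of the arithmetic behind the "about 65536" figure
quoted above. It assumes the second 32 bits of the cookie are uniformly random and
independent of the filename, and that all n names already share the same
key_r5_hash() value; the function names are made up for illustration and are not
part of UBIFS.

```python
import math

def expected_colliding_pairs(n, bits=32):
    """Expected number of pairs among n uniformly random `bits`-bit values
    that are equal (assumption: values are independent and uniform)."""
    return n * (n - 1) / 2.0 / 2 ** bits

def collision_probability(n, bits=32):
    """Birthday approximation of the chance that at least two of the
    n random `bits`-bit values are equal."""
    return 1.0 - math.exp(-expected_colliding_pairs(n, bits))

# n = number of filenames that already collide in the name-derived half.
for n in (1024, 65536, 262144):
    print(n, expected_colliding_pairs(n), collision_probability(n))
```

Under these assumptions 65536 names give roughly one half of an expected
colliding pair in the random half, so it is a birthday-bound order of magnitude
rather than an exact count; a few times as many names make a full 64-bit
collision very likely.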