Re: [PATCH 5/6] f2fs: switch to using fscrypt_match_name()

Richard Weinberger <richard@xxxxxx> · Tue, 25 Apr 2017 23:03:01 +0200

Eric,

On 25.04.2017 at 22:58, Eric Biggers wrote:
> On Tue, Apr 25, 2017 at 09:22:16PM +0200, Richard Weinberger wrote:
>> Eric,
>>
>> On 25.04.2017 at 19:46, Eric Biggers wrote:
>>>> Sorry if this is a stupid question, but why do you have to compare hashes _and_
>>>> the last few bytes of the bigname?
>>>> A lookup via bigname gives you two 32-bit hash values, and I'd assume that
>>>> this is sufficient for a collision-free lookup, especially since a resumed
>>>> readdir() with a 64-bit cookie has to work on your filesystem too.
>>>>
>>>
>>> Well, the problem is that hashes may not be sufficient to uniquely identify a
>>> name in all cases.  f2fs uses only a 32-bit hash so it's trivial to create
>>> collisions on it, as I demonstrated.  Even collisions of two 32-bit hashes, as
>>> used by ext4 and ubifs, are possible.  And ext4 currently doesn't even compare
>>> the hashes during directory searches, beyond using them to find the correct
>>> directory block, since the hashes aren't stored in the directory entries.
>>
>> I agree that finding a collision in a 32-bit hash is easy, but for 64 bits it
>> is *much* harder.
> 
> That's true for accidental collisions, but malicious users might create
> intentional collisions.  In the case of UBIFS it looks like the first 32 bits of
> the cookie depend solely on the filename via key_r5_hash(), while the
> second 32 bits are random.  So I imagine a collision in the full 64 bits could be
> generated by precomputing on average about 65536 filenames which collide in
> key_r5_hash(), then creating them all in the same directory.

Correct. As I said, I'll think of a way to check the remaining bytes in the bigname
case.
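
For reference, the figure of about 65536 filenames quoted above is the usual
birthday bound on the random 32-bit half of the cookie: once the names all
share the same key_r5_hash() value, only the random half has to collide.
A rough sketch of that arithmetic, assuming the random half is uniformly
distributed over 2^32 values (the function names here are made up for
illustration and are not part of UBIFS):

```python
import math


def collision_probability(n: int, space_bits: int = 32) -> float:
    """Approximate chance that at least two of n uniform random values
    drawn from a 2**space_bits space are equal (birthday approximation)."""
    space = 2 ** space_bits
    return 1.0 - math.exp(-n * (n - 1) / (2.0 * space))


def expected_draws_until_collision(space_bits: int = 32) -> float:
    """Approximate expected number of uniform draws before the first
    repeat, using the standard sqrt(pi * N / 2) estimate."""
    space = 2 ** space_bits
    return math.sqrt(math.pi * space / 2.0)


if __name__ == "__main__":
    # Assumption: every name already collides in the deterministic 32-bit
    # half, so only the random 32-bit half is modelled here.
    for n in (16384, 65536, 131072):
        print(f"{n:7d} names: P(collision) ~ {collision_probability(n):.3f}")
    print(f"expected names until first collision ~ "
          f"{expected_draws_until_collision():.0f}")
```

Under that assumption, 2^16 names give a collision chance of a bit under
40%, and the expected count until the first collision is somewhat above
2^16, so the estimate holds as an order of magnitude.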

Thanks,
//richard