Re: [RFC] mke2fs -E hash_alg=siphash: any interest?

> Now that the patches are available, it makes sense to run some
> directory-intensive benchmark to see whether the improved hash
> function actually shows improved performance.  The hash may be
> somewhat faster, but since this is only hashing the filename and
> not KB/MB of data, it isn't clear whether this is going to improve
> observable performance of directory operations.

That's basically my current task, and why my v1 is kind of a draft
just to introduce the idea and flush out any comments on my choice of
identifier names and stuff like that.

Personally, I just like the cleanliness of using a primitive designed for
the purpose, but I benchmarked it to ensure it wouldn't be any *slower*.

> I'm not sure what a suitable benchmark for this is, however.  It
> needs to be doing filename lookups to exercise the hashing, but
> in the workloads that I can think of there is always a lot more
> work after the name is looked up (e.g. open(), stat(), etc) on
> the filename.  Some possibilities include "ls -l" or "mv A/* B/".
> It may be the only way to see the difference is via oprofile.

It's worse than that.  The dcache has a great hit rate, and you have to
force misses.  But if you actually hit the disk a lot, that will dwarf
hashing performance into unmeasurability.

So it requires a very cleverly designed benchmark to highlight it.
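
One possible shape for such a benchmark, as a rough sketch only: look up
names that do not exist.  The assumption (not verified here) is that a
name never looked up before cannot be satisfied from the dcache, while the
directory blocks themselves stay in the page cache, so the filesystem's
own lookup path runs without touching the disk.  The function name and
parameters below are made up for illustration.

```python
import os
import tempfile
import time


def time_negative_lookups(directory, n_files=2000, n_probes=2000):
    """Populate `directory`, then time stat() on names that do not exist."""
    for i in range(n_files):
        # Create empty files so the directory has plenty of entries.
        with open(os.path.join(directory, f"f{i:08d}"), "w"):
            pass

    misses = 0
    start = time.perf_counter()
    for i in range(n_probes):
        try:
            # Every probe name is unique and absent, so (by assumption)
            # it has no cached dentry and must be looked up by the fs.
            os.stat(os.path.join(directory, f"missing{i:08d}"))
        except FileNotFoundError:
            misses += 1
    elapsed = time.perf_counter() - start
    return misses, elapsed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        misses, elapsed = time_negative_lookups(d)
        print(f"{misses} failed lookups in {elapsed:.6f} s")
```

Even then the timing includes syscall and path-walk overhead, so any
difference between hash functions may still be lost in the noise.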

> It also isn't clear whether the strength of siphash is significantly
> better than "halfmd4", which is already cryptographically-strong.
> Since the filename hash is also a function of the filesystem-unique
> s_hash_seed, mounting an "attack" on a directory needs to be specific
> to a particular filesystem, and isn't portable to other filesystems.
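
To make the seed argument concrete, here is a toy model.  It uses keyed
BLAKE2b from Python's hashlib purely as a stand-in keyed hash; it is
neither halfmd4 nor SipHash, and the seeds, bucket count, and function
names are hypothetical.  It brute-forces names that all fall into one
bucket under seed A, then shows where the same names land under seed B.

```python
import hashlib


def dir_hash(name: str, seed: bytes) -> int:
    """Stand-in keyed hash: 32 bits of keyed BLAKE2b (not halfmd4/SipHash)."""
    digest = hashlib.blake2b(name.encode(), key=seed, digest_size=4).digest()
    return int.from_bytes(digest, "little")


def colliding_names(seed: bytes, buckets: int, count: int) -> list:
    """Brute-force `count` names that all land in bucket 0 under `seed`."""
    found = []
    i = 0
    while len(found) < count:
        name = f"file{i}"
        if dir_hash(name, seed) % buckets == 0:
            found.append(name)
        i += 1
    return found


if __name__ == "__main__":
    seed_a = bytes(range(16))      # hypothetical seed of filesystem A
    seed_b = bytes(range(16, 32))  # hypothetical seed of filesystem B
    names = colliding_names(seed_a, buckets=16, count=8)
    print("buckets under A:", [dir_hash(n, seed_a) % 16 for n in names])
    print("buckets under B:", [dir_hash(n, seed_b) % 16 for n in names])
```

The set of names built against seed A is useless against seed B, which
is the sense in which such an attack is not portable.  It says nothing
about how hard the search is when the seed is unknown.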

There are two definitions of "stronger":

1) The unknowable truth, and
2) It has been subjected to a lot of analysis and appears to hold up well.

By criterion 2, SipHash *is* significantly stronger: it has been presented
at crypto conferences, studied, and widely used.

halfmd4 is a very ad-hoc primitive that I don't think anyone's looked at
seriously.

It's not obviously terrible, and it's possible that halfmd4 is more work
to break, but we won't know until someone with cryptanalytic skill takes
a swing at it.