Re: regressions due to 64-bit ext4 directory cookies

On 02/12/2013 10:00 PM, J. Bruce Fields wrote:
On Tue, Feb 12, 2013 at 09:56:41PM +0100, Bernd Schubert wrote:
On 02/12/2013 09:28 PM, J. Bruce Fields wrote:
06effdbb49af5f6c "nfsd: vfs_llseek() with 32 or 64 bit offsets (hashes)"
and previous patches solved problems with hash collisions in large
directories by using 64- instead of 32- bit directory hashes in some
cases.  But it caused problems for users who assume directory offsets
are "small".  Two cases we've run across:

	- older NFS clients: 64-bit cookies cause applications on many
	  older clients to fail.
	- gluster: gluster assumed that it could take the top bits of
	  the offset for its own use.

In both cases we could argue we're in the right: the nfs protocol
defines cookies to be 64 bits, so clients should be prepared to handle
them (remapping to smaller integers if necessary to placate applications
using older system interfaces).  And gluster was incorrect to assume
that the "offset" was really an "offset" as opposed to just an opaque
value.
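
To illustrate why borrowing the top bits of an offset breaks once the
filesystem uses the full 64 bits, here is a minimal sketch. The layout is
hypothetical and is not gluster's actual encoding: ID_BITS, pack() and
unpack() are names invented for this example, and the sample values are
arbitrary.

```python
# Hypothetical layout: the caller reserves the top ID_BITS bits of a
# 64-bit directory offset for its own subvolume id. Not gluster's real code.
ID_BITS = 8
OFF_BITS = 64 - ID_BITS
OFF_MASK = (1 << OFF_BITS) - 1


def pack(subvol_id, offset):
    """Place subvol_id in the top bits; offset bits above OFF_BITS are lost."""
    return (subvol_id << OFF_BITS) | (offset & OFF_MASK)


def unpack(cookie):
    """Split a packed cookie back into (subvol_id, offset)."""
    return cookie >> OFF_BITS, cookie & OFF_MASK


small_cookie = 0x7FFFFFFF          # fits in 31 bits, as older kernels returned
large_cookie = 0x9E3779B97F4A7C15  # arbitrary value that uses all 64 bits

# A small cookie survives the round trip unchanged.
small_roundtrip = unpack(pack(3, small_cookie))

# A full 64-bit cookie loses its top bits, so the value later handed back
# to the filesystem is not the one it issued.
large_roundtrip = unpack(pack(3, large_cookie))
```

With small offsets the round trip is lossless, which is why this kind of
scheme could work for a long time. It fails only when the underlying value
really is an opaque 64-bit quantity.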

But in practice things that worked fine for a long time break on a
kernel upgrade.

So at a minimum I think we owe people a workaround, and turning off
dir_index may not be practical for everyone.

A "no_64bit_cookies" export option would provide a workaround for NFS
servers with older NFS clients, but not for applications like gluster.

For that reason I'd rather have a way to turn this off on a given ext4
filesystem.  Is that practical?

I think Ted needs to answer if he would accept another mount option. But
before we go this way, what is gluster doing if there are hash
collisions?

They probably just haven't tested NFS with large enough directories.

Is it only related to NFS or generic readdir over gluster?

The birthday paradox says you'd need about 2^16 entries to have a 50-50
chance of hitting the problem.

We are frequently running into it with 50000 files per directory.
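
As a rough check of the numbers above, the standard birthday
approximation can be evaluated directly. This is a sketch that assumes
hashes are uniformly distributed over the full hash width; how many bits
a given kernel and client actually use for the cookie is not modelled
here. The function names are made up for this example.

```python
import math


def collision_probability(n_entries, hash_bits):
    """Approximate chance that at least two of n_entries share a hash value.

    Uses the birthday approximation 1 - exp(-n(n-1) / (2 * 2^bits)),
    assuming a uniformly distributed hash.
    """
    space = 2.0 ** hash_bits
    return -math.expm1(-n_entries * (n_entries - 1) / (2.0 * space))


def entries_for_probability(p, hash_bits):
    """Approximate number of entries at which the collision chance reaches p."""
    space = 2.0 ** hash_bits
    return math.sqrt(2.0 * space * math.log(1.0 / (1.0 - p)))


p_50000_32bit = collision_probability(50000, 32)
p_50000_64bit = collision_probability(50000, 64)
n_half_32bit = entries_for_probability(0.5, 32)
```

Under these assumptions a 32-bit hash gives roughly a one-in-four chance
of a collision in a single 50000-entry directory, and the 50-50 point is
a little above 2^16 entries, so "about 2^16" is right to within an order
of magnitude. With 64 bits the chance at 50000 entries is negligible.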


I don't know enough about ext4 directory performance.  But unfortunately
I suspect there's a range of directory sizes that are too small to have
a significant chance of hash collisions, but still large enough to need
dir_index?

Here is a link to the initial benchmark:
http://search.luky.org/linux-kernel.2001/msg00117.html


Cheers,
Bernd