Re: Readdir d_off encoding

On 01/07/2015 04:16 PM, J. Bruce Fields wrote:
On Mon, Dec 22, 2014 at 02:04:37PM -0500, J. Bruce Fields wrote:
It'd also be nice to see any proposals for a completely correct
solution, even if it's something that will take a while.  All I can
think of is protocol extensions, but that's just what I know.

I tried to think a little about this over the holidays: say we could
scrap NFS and start from scratch, what would we do?:

- larger NFS readdir cookies: if NFS cookies were 128 bits, then gluster
   could stick the filesystem's offset in the lower 64 bits and its own
   data in the upper 64 bits.

Dan mentioned the other day the idea of _negotiating_ the cookie size and then setting it, if this is being designed from scratch. I thought it was worth bringing up here, as it would be a good move.


   This doesn't work if anyone else does this, though: if we change to
   128 bits here then people may eventually want to do the same thing to
   filesystem and system call interfaces too and then we're back at square
   one.  If people want to be able to stack arbitrary readdir
   implementations then we can't really choose a fixed size limit any
   more.
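
For concreteness, the 128-bit layout described above could look roughly like the sketch below. This is only an illustration: the names pack_cookie, unpack_cookie and gluster_data are made up here and are not part of NFS or gluster. It also shows the stacking problem: once both halves are in use, a second layer has no bits left for itself.

```python
MASK64 = (1 << 64) - 1  # all ones in the low 64 bits


def pack_cookie(gluster_data: int, fs_offset: int) -> int:
    """Build a hypothetical 128-bit readdir cookie.

    Upper 64 bits: the stacking layer's own data (assumed name: gluster_data).
    Lower 64 bits: the backend filesystem's directory offset (d_off).
    """
    if not 0 <= gluster_data <= MASK64:
        raise ValueError("gluster_data does not fit in 64 bits")
    if not 0 <= fs_offset <= MASK64:
        raise ValueError("fs_offset does not fit in 64 bits")
    return (gluster_data << 64) | fs_offset


def unpack_cookie(cookie: int) -> tuple[int, int]:
    """Split a 128-bit cookie back into (gluster_data, fs_offset)."""
    if not 0 <= cookie < (1 << 128):
        raise ValueError("cookie does not fit in 128 bits")
    return cookie >> 64, cookie & MASK64
```

Because the filesystem offset keeps its full 64 bits, no hashing or truncation of d_off is needed in this layout; the cost is that the size limit has simply moved from 64 to 128 bits.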

- stateful readdir: make clients open the directory, read through it
   from start to finish, then close it.  That's all clients really want
   to do anyway--they don't need to seek back to offsets returned
   arbitrarily long ago.  However, they do need to be able to resend the
   last readdir request in case the reply was lost, and they do need to
   be able to resume reading a directory after a server reboot.

   So I think that would still leave gluster needing to keep a
   (persistent, on-disk) cache mapping the NFS cookies it hands out to
   the offsets in the backend directories.  The difference is just that
   it would only have to cache the small number of entries that are in
   use by current readdirs in progress instead of potentially having to
   keep them all forever.  I don't know, does that help much?
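
A rough sketch of what such a cache might look like, assuming the server keeps only the two most recent cookies per open directory (the newest one, plus the previous one in case the client has to resend its last request). Everything here is hypothetical: the CookieCache class, the handle/backend/backend_off names and the use of SQLite as the persistent store are illustrations, not anything gluster actually does.

```python
import sqlite3


class CookieCache:
    """Hypothetical on-disk map from NFS cookies to backend directory offsets,
    holding entries only for readdirs that are currently in progress."""

    def __init__(self, path=":memory:"):
        # Pass a real file path to survive a server reboot; ":memory:" is
        # only convenient for trying the sketch out.
        self.db = sqlite3.connect(path)
        with self.db:
            self.db.execute(
                "CREATE TABLE IF NOT EXISTS cookies ("
                " seq INTEGER PRIMARY KEY AUTOINCREMENT,"
                " handle INTEGER, cookie INTEGER,"
                " backend INTEGER, backend_off INTEGER)"
            )

    def record(self, handle, cookie, backend, backend_off):
        """Remember a cookie just handed out for an open directory."""
        with self.db:  # one transaction: insert, then trim old entries
            self.db.execute(
                "INSERT INTO cookies (handle, cookie, backend, backend_off)"
                " VALUES (?, ?, ?, ?)",
                (handle, cookie, backend, backend_off),
            )
            # Keep the two newest cookies for this handle, drop the rest.
            self.db.execute(
                "DELETE FROM cookies WHERE handle = ? AND seq NOT IN"
                " (SELECT seq FROM cookies WHERE handle = ?"
                "  ORDER BY seq DESC LIMIT 2)",
                (handle, handle),
            )

    def lookup(self, handle, cookie):
        """Return (backend, backend_off) for a cookie, or None if unknown."""
        row = self.db.execute(
            "SELECT backend, backend_off FROM cookies"
            " WHERE handle = ? AND cookie = ?",
            (handle, cookie),
        ).fetchone()
        return row

    def close(self, handle):
        """Forget every cookie for a directory once the client closes it."""
        with self.db:
            self.db.execute("DELETE FROM cookies WHERE handle = ?", (handle,))
```

With this shape the table size is bounded by the number of open directories rather than by the number of directory entries ever returned, which is the saving described above.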

--b.
