Re: [RFC PATCH 0/5] Union Mount: A Directory listing approach with lseek support

Dave Hansen <haveblue@xxxxxxxxxx> · Thu, 06 Dec 2007 09:54:43 -0800

On Thu, 2007-12-06 at 11:01 +0100, Jan Blunck wrote:
> > Rather than give each _dirent_ an offset, could we give each sub-mount
> > an offset?  Let's say we have three members comprising a union mount
> > directory.  The first has 100 dirents, the second 200, and the third
> > 10,000.  When the first readdir is done, we populate the table like
> > this:
> > 
> >       mount_offset[0] = 0;
> >       mount_offset[1] = 100;
> >       mount_offset[2] = 300;
> > 
> > If someone seeks back to 150, then we subtract mount[1]'s offset
> > (100), and realize that we want the 50th dirent from mount[1].
> 
> Yes, that is a nice idea and it is exactly what I have implemented in my patch
> series. But you forgot one thing: directories are not flat files. The dentry
> offset in a directory is a random cookie. Therefore it is not possible to have
> a linear mapping without allocating memory.

Is it truly a random cookie where the fs gets to put anything in there
that it can fit in a long?  Isn't that an advantage?  Being a random
cookie, we can encode anything we want in there.  We could encode the
"mount index" (or whatever you want to call it) in high or low bits and
have the rest store the fs-specific offset.  The only problem there
would be running out of storage space in long.
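
Roughly what I mean by that encoding, as a userspace sketch.  All the
names and the 8-bit/55-bit split are made up for illustration and are
not from the patch series.  It assumes a 64-bit position and leaves
the sign bit clear so the result stays a non-negative offset:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: bits 55..62 hold the mount index, bits 0..54
 * hold the fs-specific cookie, bit 63 (the sign bit) stays clear. */
#define UNION_INDEX_BITS 8
#define UNION_FS_BITS    55
#define UNION_FS_MASK    ((UINT64_C(1) << UNION_FS_BITS) - 1)
#define UNION_INDEX_MASK ((UINT64_C(1) << UNION_INDEX_BITS) - 1)

static_assert(UNION_INDEX_BITS + UNION_FS_BITS == 63,
              "index and cookie must fit below the sign bit");

/* Pack (index, fs_pos) into one position.  Returns 0 on success, or
 * -1 if either value does not fit, i.e. we ran out of room. */
static int union_pos_encode(unsigned int index, uint64_t fs_pos,
                            uint64_t *out)
{
    if (index > UNION_INDEX_MASK || fs_pos > UNION_FS_MASK)
        return -1;
    *out = ((uint64_t)index << UNION_FS_BITS) | fs_pos;
    return 0;
}

/* Which member of the union does this position belong to? */
static unsigned int union_pos_index(uint64_t pos)
{
    return (unsigned int)((pos >> UNION_FS_BITS) & UNION_INDEX_MASK);
}

/* The cookie to hand back to that member's filesystem. */
static uint64_t union_pos_fs(uint64_t pos)
{
    return pos & UNION_FS_MASK;
}
```

A filesystem that already uses more than 55 bits of its cookie makes
union_pos_encode() fail, which is exactly the "running out of storage
space" case.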

But, what do people expect when they have huge directories (or huge
directory offsets)?  Surely, if their fs is already pressing the fs's
directory data types to the limits, they can't simply union mount a
couple of those directories together and expect magic.  Union mounts
also can't be expected to compress these directory positions to fit.

So, I think it's reasonable behavior to readdir() until the sum of the
highest position seen on each mount would overflow the off_t that we
have to store it in.  
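
The check itself could look something like this.  Again only a sketch:
union_span_fits() is a hypothetical name, and I'm assuming a signed
64-bit off_t and that we track the highest position seen per mount:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum the highest position seen on each mount.  Returns 1 and stores
 * the sum in *total if it fits in a signed 64-bit offset, or 0 if
 * adding the next mount would overflow (or a position is negative). */
static int union_span_fits(const int64_t *highest, size_t n,
                           int64_t *total)
{
    int64_t sum = 0;
    size_t i;

    assert(highest != NULL || n == 0);

    for (i = 0; i < n; i++) {
        /* Test before adding so the addition never overflows. */
        if (highest[i] < 0 || highest[i] > INT64_MAX - sum)
            return 0;
        sum += highest[i];
    }
    *total = sum;
    return 1;
}
```

readdir() would keep going as long as this returns 1 and give up once
it returns 0.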

-- Dave
