On Mon, 14 Dec 2009 11:04:01 -0500
"J. Bruce Fields" <bfields@xxxxxxxxxxxx> wrote:

> On Mon, Dec 14, 2009 at 10:52:14AM -0500, Jeff Layton wrote:
> > On Mon, 14 Dec 2009 10:24:18 -0500
> > "J. Bruce Fields" <bfields@xxxxxxxxxxxx> wrote:
> >
> > > On Mon, Dec 14, 2009 at 08:38:43AM -0500, Jeff Layton wrote:
> > > > I looked at this problem recently based on a request by some of our
> > > > coreutils folks. A bit of the discussion is here:
> > > >
> > > > https://bugzilla.redhat.com/show_bug.cgi?id=533569
> > > >
> > > > ...and earlier:
> > > >
> > > > https://bugzilla.redhat.com/show_bug.cgi?id=501848
> > > >
> > > > Jim Meyering also brought this up on LKML:
> > > >
> > > > http://lkml.org/lkml/2009/11/4/451
> > > >
> > > > I'm a little leery of triggering a mount for any server-side mountpoint
> > > > that we just happen to have a peek at. That seems like it might get
> > > > expensive. Suppose you had 1000 filesystems mounted under the root
> > > > share here?
> > >
> > > For what it's worth, I'll admit that I ran across this just in
> > > artificial testing--I'm not claiming it was causing me a real problem.
> > >
> >
> > Understood. It's a bit of a dilemma...
> >
> > Clearly though, it's going to be a problem for some programs that need
> > to deal with mountpoints (stuff like backup programs in particular).
> > The problem though is that I don't think we want to trigger a bunch of
> > submounts just because someone does a "ls -l" in a directory that holds
> > a bunch of server-side mountpoints.
> >
> > The real problem I think is that we allocate new dev minor numbers at
> > mount time.
>
> So you're not saying that the minor number allocation is the expensive
> part, you're saying that it's cheap and something that we could do
> before we do the rest of the mount?
>

Yeah, minor number allocation is fairly cheap (it's just IDA hash calls
I think). If we wanted to try and preallocate them then we have to
consider how long to cache them too.
The hassle of doing that may outweigh the expense of triggering the
mount.

I am making an assumption that mounts are somewhat expensive to do.
Maybe you can convince me otherwise. :)

> > The ideal thing might be to have the client somehow
> > pre-determine what the dev number of that mount would be without
> > actually doing the mount. Then we could just present that device number
> > in the stat call.
>
> We also need the inode number, for example, which may require an rpc
> call.
>

We already have that, right? We've done a GETATTR (or equivalent) and
noticed that the inode has a different fsid. We even send back the
"real" inode number in the statbuf (but obviously w/o the right device
info since the mount hasn't been triggered yet).

> So what is the most expensive part of a mount?
>
> For a directory full of referral points, there's the problem that you
> don't want to have to wait on stat calls from a lot of different
> servers. But maybe that should be handled as a special case.
>

Good question. I suppose I was making an assumption here that
triggering a mount would mean at least some RPCs, and that might make
that "ls -l" stall for a while if we have to talk to a bunch of
different servers. Maybe I'm blowing the performance hit out of
proportion though?

That said, I'm not crazy about altering the generic VFS to fix this. I
wonder if there's another way to do it?

-- 
Jeff Layton <jlayton@xxxxxxxxxx>
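
For anyone following along, here is a rough sketch of why the device
number in the statbuf matters to userspace. Programs that care about
filesystem boundaries typically compare a directory's st_dev with its
parent's. The helper name below is hypothetical, and this is only the
conventional heuristic, not code taken from coreutils or the NFS client:

```python
import os


def looks_like_mountpoint(path):
    """Heuristic check: does `path` sit on a different device than its parent?

    Hypothetical helper for illustration only; it mirrors the usual
    st_dev comparison that userspace tools rely on.
    """
    # A symlink is never itself a mountpoint.
    if os.path.islink(path):
        return False

    st = os.lstat(path)
    parent = os.lstat(os.path.join(path, os.pardir))

    # Different device numbers: a filesystem boundary was crossed.
    if st.st_dev != parent.st_dev:
        return True

    # Same device and same inode: path and path/.. are the same
    # directory, which is the case for the root of the namespace.
    return st.st_ino == parent.st_ino
```

If the statbuf for a not-yet-triggered submount still carries the
parent's device number, as described above, a check like this would
presumably return False for that directory even though it is a
mountpoint on the server.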