Hi David-

On Fri, Aug 1, 2008 at 9:56 AM, David Woodhouse <dwmw2@xxxxxxxxxxxxx> wrote:
> On Fri, 2008-08-01 at 14:36 +0100, David Woodhouse wrote:
>> Those are needed by NFSv3 too -- and can be handled with a lookup_fh()
>> method in the file system which is guaranteed to be called from within
>> the filldir callback, and some support in the VFS for checking if it's a
>> mountpoint.
>>
>> NFSv4 introduces another problem though, which is that it seems to be
>> able to return the _full_ getattr() results for each object, and there's
>> no real way round the fact that we need to do the ->lookup() for that.
>>
>> If sane clients aren't expected to _ask_ for that, though, then perhaps
>> it would be OK to fall back to something like the existing
>> readdir-to-buffer hack for that case, while most normal clients won't
>> trigger it.

Sanity / insanity is probably not the right description for these
different types of clients...

NFSv4 is designed to work with clients that have a typical VFS (like an
in-kernel Unix client), or user-space clients, or clients on systems
that don't have typical VFS APIs (like Windows).  So, servers have to be
prepared for a wide gamut of different combinations of individual
operations via compound RPCs.

In practice, nearly all client implementations so far are VFS-based
in-kernel clients, and have roughly the same kind of readdir (i.e.,
driven by getdents(2) and allowing seek offsets like a byte stream).
But they are generally at different stages of implementation, and have
different ways of going about their work.

However, speaking generally, the advanced features of NFSv4, like
FS_LOCATIONS and pseudofs, often do require some special sauce that is
sometimes not terribly friendly to the server-side VFS.

I only mentioned that the Linux client doesn't use WORD0_FILEHANDLE to
caution against testing any server change with a Linux NFSv4 client --
it wouldn't necessarily exercise the server code in question.
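
To make that testing caveat concrete, here is a minimal sketch (not nfsd
code) of checking whether a READDIR request's attribute bitmap asks for
the filehandle attribute.  Assumptions: the bit position 19 is taken
from the attribute numbering in RFC 3530, and the function name
readdir_wants_filehandle() and the bare uint32_t bitmap array are
hypothetical, chosen only for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Attribute number of "filehandle" in the NFSv4 attribute list
 * (RFC 3530); it falls in the first 32-bit word of the bitmap. */
#define ATTR_FILEHANDLE   19u

/* Bit for that attribute in bitmap word 0.  The name follows the
 * WORD0_FILEHANDLE shorthand used above; it is defined locally here. */
#define WORD0_FILEHANDLE  (UINT32_C(1) << ATTR_FILEHANDLE)

/* Hypothetical helper: returns true if the client's requested attribute
 * bitmap (bmval, nwords words long) includes the filehandle attribute,
 * i.e. if this request would exercise the per-entry filehandle path. */
static bool readdir_wants_filehandle(const uint32_t *bmval, size_t nwords)
{
    return nwords > 0 && (bmval[0] & WORD0_FILEHANDLE) != 0;
}
```

A request whose word 0 has that bit clear never reaches the
filehandle-encoding path, so a test run driven only by such a client
says nothing about whether that path still works.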

Some NFSv3 clients don't support READDIRPLUS at all, while some can
disable it (like Linux, Mac OS, and FreeBSD), and others use it only in
certain cases (Linux).  I wouldn't describe any of these as saner or
more commonly encountered than another.

> Or maybe we could just mask the offending attrs out of ->rd_bmval for
> readdir calls, and say we don't support them? Would anyone scream if we
> did that?

I'm not an NFSv4 expert (hence my initial incorrect assertion about
NFSv4 not supporting readdirplus at all).  I defer to those who are
actually working on the standard and the Linux implementation (Bruce?).

But masking out these features could potentially cause severe
interoperability problems for certain client implementations.  We can
only know for sure after a lot of testing at multivendor events like
Connectathon; it's not something I would disable cavalierly.

I believe the server can also indicate to clients that NFSv3 READDIRPLUS
is not supported, and that wouldn't cause as much of a disaster for
clients.  It would even be feasible to disable READDIRPLUS only for
certain physical file systems that would have a problem with
lookup-during-filldir.

I rather prefer making NFSD do the right thing here -- it seems to
localize and document the issue and provide a solution that all file
systems can use with a minimum of real fuss.

I know that OCFS2 has this locking inversion issue as well, and we know
OCFS2 users do share data from it with NFS.  So Oracle is interested in
a good solution here too.

--
Chuck Lever
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html