On Fri, 1 Dec 2006, Trond Myklebust wrote:
> > > > Also, it's a tiring and trivial example, but even the 'ls -al' scenario
> > > > isn't ideally addressed by readdir()+statlite(), since statlite() might
> > > > return size/mtime from before 'ls -al' was executed by the user.
> > >
> > > stat() will do the same.
> >
> > It does with NFS, but only because NFS doesn't follow POSIX in that
> > regard. In general, stat() is supposed to return a value that's
> > accurate at the time of the call.
> >
> > (Although now I'm confused again. If you're assuming stat() can return
> > cached results, why do you think statlite() is useful?)
>
> Currently, you will never get anything other than weak consistency with
> NFS whether you are talking about stat(), access(), getacl(),
> lseek(SEEK_END), or append(). Your 'permitting it' only in statlite() is
> irrelevant to the facts on the ground: I am not changing the NFS client
> caching model in any way that would affect existing applications.

Clearly, if you cache attributes on the client and provide only weak
consistency, then readdirplus() doesn't change much. But _other_ non-NFS
filesystems may elect to provide POSIX semantics and strong consistency,
even though NFS doesn't. And the interface simply doesn't allow that to
be done efficiently in distributed environments, because applications
can't communicate their varying consistency needs. Instead, systems like
NFS weaken attribute consistency globally. That works well enough for
most people most of the time, but it's hardly ideal.
readdirplus() allows applications like 'ls -al' to distinguish themselves
from applications that want individually accurate stat() results. That in
turn allows distributed filesystems that are both strongly consistent
_and_ efficient at scale. In most cases, it'll trivially turn into a
readdir()+stat() in the VFS, but in some cases filesystems can exploit
that information for (often enormous) performance gain, while still
maintaining well-defined consistency semantics. readdir() already leaks
some inode information into its result (via d_type)... I'm not sure I
understand the resistance to providing more.
sage