On Tue, 1 Mar 2011 14:21:03 -0500
"J. Bruce Fields" <bfields@xxxxxxxxxxxx> wrote:

> On Tue, Mar 01, 2011 at 12:30:12PM -0600, Shirish Pargaonkar wrote:
> > On Tue, Mar 1, 2011 at 12:07 PM, J. Bruce Fields <bfields@xxxxxxxxxxxx> wrote:
> > > On Mon, Feb 28, 2011 at 10:28:08PM -0500, J. Bruce Fields wrote:
> > >> On Mon, Feb 28, 2011 at 08:33:08PM -0600, Steve French wrote:
> > >> > On Mon, Feb 28, 2011 at 5:59 PM, J. Bruce Fields <bfields@xxxxxxxxxxxx> wrote:
> > >> > > On Mon, Feb 28, 2011 at 05:52:09PM -0600, Steve French wrote:
> > >> > >> On Mon, Feb 28, 2011 at 5:42 PM, J. Bruce Fields <bfields@xxxxxxxxxxxx> wrote:
> > >> > >> > OK. Then as things stand we're stuck returning ESTALE to the client
> > >> > >> > unless we happen to have the inode they're looking for in our cache?
> > >> > >>
> > >> > >> Yes - that seems right and consistent with what I remember other file
> > >> > >> systems doing.
> > >> > >
> > >> > > No, other filesystems are able to look up the file on disk by inode
> > >> > > number (or by whatever identifier they use in the filehandle). They
> > >> > > don't depend on already having the inode in core.
> > >> >
> > >> > Grepping for ESTALE looks like there are dozens of places in various
> > >> > fs where ESTALE can get returned ...
> > >>
> > >> Certainly true.
> > >>
> > >> But they do have to be able to look up any inode, regardless of whether
> > >> it is currently in cache.
> > >>
> > >> Otherwise applications on the client would see ESTALE after any server
> > >> reboot, or any time an inode was forced out of cache (for whatever
> > >> reason).
> > >>
> > >> That would be quite painful.
> > >
> > > So if my understanding of the cifs behavior here is correct, then I
> > > don't believe nfs exports of cifs will be usable.
> > >
> > > In the long term perhaps it could be possible with changes to one or the
> > > other of the protocols: for example, perhaps future versions of the nfs
> > > protocol could be made less reliant on long-lived filehandles. But that
> > > would be a major change.
> > >
> > > --b.
> > >
> >
> > Bruce, I am not getting a picture of how NFS server would return ESTALE
> > error to nfs client and then on to the app for a filehandle fragment
> > that happens to be coded as uniqueid by cifs during encode_fh if inode
> > with that uniqueid does not happen to exist in vfs cache on the server box.
> >
> > When nfs passes down filehandle fragment (uniqueid) to cifs in fh_to_dentry,
> > if the inode does not exist in cache, cifs will pick a new inode and copy this
> > uniqueid as inode number.
>
> Oh, OK, that's not what I'd imagined.
>

Egads. Bruce is quite right and this is just plain not going to work.
Here's one way that this will go wrong and can never be fixed. I'm
fairly certain that there are others:

Suppose that we set up an exported cifs mount. The NFS client mounts it,
opens a file and starts writing data to the inode. So far so good -- the
client did a LOOKUP operation at some point, which got translated to a
QPathInfo by cifs, and a positive dentry ended up in cache. The client
then buffers up some writes.

Now, suppose that the server then reboots (or that the inode got pushed
out of the cache since it wasn't in use). The server redoes the cifs
mount and reexports it. The client then tries to flush the buffered
writes. The server gets a write request, but now the inode isn't in
cache and we can't match up the filehandle to anything. The server
returns -ESTALE.

At that point, you're screwed. There's virtually no way for the client
to recover. We can't guarantee that the NFS client will have a valid
path to the inode even if it did want to retry the lookup, which it
really shouldn't need to do anyway.
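
To make the failure sequence above concrete, here is a toy model in
Python. It is only a sketch under stated assumptions, not kernel code:
ToyServer, its methods, and the lookup_by_id flag are all hypothetical
names invented for illustration. The "backing" dict stands in for
whatever survives a server reboot, and "cache" stands in for in-core
inodes. A server that can resolve a filehandle only through its cache
fails the flush after a reboot, while one that can look the file up by
its identifier carries on.

```python
import errno


class ToyServer:
    """Hypothetical export: 'backing' survives reboots, 'cache' does not."""

    def __init__(self, backing, lookup_by_id):
        self.backing = backing            # inode number -> bytearray contents
        self.lookup_by_id = lookup_by_id  # can the backend resolve an id alone?
        self.cache = {}                   # in-core inodes, lost on reboot

    def lookup(self, ino):
        """Path-style LOOKUP: caches the inode and hands back a filehandle."""
        self.cache[ino] = self.backing[ino]
        return ino  # in this toy, the filehandle is just the inode number

    def reboot(self):
        """Drop all in-core state, as a reboot or cache eviction would."""
        self.cache.clear()

    def write(self, fh, data):
        """Resolve the filehandle, then append data to the file."""
        if fh not in self.cache:
            if not self.lookup_by_id or fh not in self.backing:
                # Nothing to match the filehandle against.
                raise OSError(errno.ESTALE, "Stale file handle")
            self.cache[fh] = self.backing[fh]  # refill the cache by id
        self.cache[fh].extend(data)


def flush_after_reboot(lookup_by_id):
    """Replay the scenario: LOOKUP, write, reboot, flush buffered write."""
    server = ToyServer({42: bytearray(b"old")}, lookup_by_id)
    fh = server.lookup(42)
    server.write(fh, b"+first")
    server.reboot()
    try:
        server.write(fh, b"+second")
    except OSError as exc:
        return exc.errno
    return 0
```

Calling flush_after_reboot(False) models the cache-only case and yields
errno.ESTALE; flush_after_reboot(True) models a filesystem that can find
the file by its identifier and yields 0.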

The fact that the CIFS protocol has no way to look up inodes by anything
like an inode number really makes this whole idea dead on arrival.

-- 
Jeff Layton <jlayton@xxxxxxxxxx>