Re: [PATCH] cifs: Allow nfsd over cifs

Steve French <smfrench@xxxxxxxxx> · Tue, 1 Mar 2011 14:11:32 -0600

I vaguely remember that the same problem can occur with nfsd on a
local file system, and that NFS clients have to be able to recover from
ESTALE (e.g. lookup/delete/create/lookup would fail with ESTALE
otherwise). Don't clients handle ESTALE by looking the file up again,
as described at the link below?

http://lwn.net/Articles/272684/
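
To make the question concrete, here is a rough userspace sketch of
"retry the lookup on ESTALE". It is only a toy model, not kernel or NFS
client code: ToyServer, invalidate_all() and getattr_with_retry() are
hypothetical names invented for illustration, and errno.ESTALE is
assumed to be available (it is on Linux).

```python
import errno


class ToyServer:
    """Hypothetical server whose handles can all become stale at once."""

    def __init__(self):
        self.handles = {}      # handle -> path
        self.next_handle = 1

    def lookup(self, path):
        # Hand out a fresh handle for the path.
        handle = self.next_handle
        self.next_handle += 1
        self.handles[handle] = path
        return handle

    def invalidate_all(self):
        # Stand-in for a reboot or a delete/create cycle on the server.
        self.handles.clear()

    def getattr(self, handle):
        if handle not in self.handles:
            raise OSError(errno.ESTALE, "stale file handle")
        return {"path": self.handles[handle]}


def getattr_with_retry(server, path, handle):
    """Try the cached handle; on ESTALE, redo the lookup by path once."""
    try:
        return handle, server.getattr(handle)
    except OSError as e:
        if e.errno != errno.ESTALE:
            raise
    handle = server.lookup(path)
    return handle, server.getattr(handle)
```

Note that the retry only works because the caller still holds a path to
look up again; an operation that has nothing but the handle cannot
recover this way.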

On Tue, Mar 1, 2011 at 2:09 PM, Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> On Tue, 1 Mar 2011 14:21:03 -0500
> "J. Bruce Fields" <bfields@xxxxxxxxxxxx> wrote:
>
>> On Tue, Mar 01, 2011 at 12:30:12PM -0600, Shirish Pargaonkar wrote:
>> > On Tue, Mar 1, 2011 at 12:07 PM, J. Bruce Fields <bfields@xxxxxxxxxxxx> wrote:
>> > > On Mon, Feb 28, 2011 at 10:28:08PM -0500, J. Bruce Fields wrote:
>> > >> On Mon, Feb 28, 2011 at 08:33:08PM -0600, Steve French wrote:
>> > >> > On Mon, Feb 28, 2011 at 5:59 PM, J. Bruce Fields <bfields@xxxxxxxxxxxx> wrote:
>> > >> > > On Mon, Feb 28, 2011 at 05:52:09PM -0600, Steve French wrote:
>> > >> > >> On Mon, Feb 28, 2011 at 5:42 PM, J. Bruce Fields <bfields@xxxxxxxxxxxx> wrote:
>> > >> > >> > OK.  Then as things stand we're stuck returning ESTALE to the client
>> > >> > >> > unless we happen to have the inode they're looking for in our cache?
>> > >> > >>
>> > >> > >> Yes - that seems right and consistent with what I remember other file
>> > >> > >> systems doing.
>> > >> > >
>> > >> > > No, other filesystems are able to look up the file on disk by inode
>> > >> > > number (or by whatever identifier they use in the filehandle).  They
>> > >> > > don't depend on already having the inode in core.
>> > >> >
>> > >> > Grepping for ESTALE looks like there are dozens of places in various
>> > >> > fs where ESTALE can get returned ...
>> > >>
>> > >> Certainly true.
>> > >>
>> > >> But they do have to be able to look up any inode, regardless of whether
>> > >> it is currently in cache.
>> > >>
>> > >> Otherwise applications on the client would see ESTALE after any server
>> > >> reboot, or any time an inode was forced out of cache (for whatever
>> > >> reason).
>> > >>
>> > >> That would be quite painful.
>> > >
>> > > So if my understanding of the cifs behavior here is correct, then I
>> > > don't believe nfs exports of cifs will be usable.
>> > >
>> > > In the long term perhaps it could be possible with changes to one or the
>> > > other of the protocols: for example, perhaps future versions of the nfs
>> > > protocol could be made less reliant on long-lived filehandles.  But that
>> > > would be a major change.
>> > >
>> > > --b.
>> > >
>> >
>> > Bruce, I am not getting a picture of how the NFS server would return an
>> > ESTALE error to the nfs client, and then on to the app, for a filehandle
>> > fragment that happens to be coded as uniqueid by cifs during encode_fh,
>> > if the inode with that uniqueid does not happen to exist in the vfs cache
>> > on the server box.
>> >
>> > When nfs passes down the filehandle fragment (uniqueid) to cifs in
>> > fh_to_dentry, if the inode does not exist in cache, cifs will pick a new
>> > inode and copy this uniqueid as the inode number.
>>
>> Oh, OK, that's not what I'd imagined.
>>
>
> Egads. Bruce is quite right and this is just plain not going to work.
> Here's one way that this will go wrong and can never be fixed. I'm
> fairly certain that there are others:
>
> Suppose that we set up an exported cifs mount. The NFS client mounts it,
> opens a file and starts writing data to the inode. So far so good --
> the client did a LOOKUP operation at some point, which got translated
> to a QPathInfo by cifs and a positive dentry ended up in cache.
>
> The client then buffers up some writes. Now, suppose that the server
> then reboots (or that the inode got pushed out of the cache since it
> wasn't in use). The server redoes the cifs mount and reexports it. The
> client then tries to flush the buffered writes. The server gets a write
> request but now the inode isn't in cache and we can't match up the
> filehandle to anything. Server returns -ESTALE.
>
> At that point, you're screwed. There's virtually no way for the client
> to recover. We can't guarantee that the NFS client will have a valid
> path to the inode even if it did want to retry the lookup, which it
> really shouldn't need to do anyway.
>
> The fact that the CIFS protocol has no way to look up inodes by
> anything like an inode number really makes this whole idea dead on
> arrival.
> --
> Jeff Layton <jlayton@xxxxxxxxxx>
>
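
The failure scenario quoted above can be sketched as a small userspace
toy model. This is not cifs or nfsd code: CacheOnlyExport and
IndexedExport are hypothetical classes, the dict scan is only a stand-in
for an on-disk lookup by inode number, and the model follows the
behaviour in Jeff's description (ESTALE on a cache miss).

```python
import errno


class CacheOnlyExport:
    """Toy export whose filehandle -> file mapping lives only in memory."""

    def __init__(self, backing):
        self.backing = backing   # path -> (fileid, data); reachable by path only
        self.icache = {}         # fileid -> path, filled in by lookup()

    def lookup(self, path):
        fileid, _ = self.backing[path]
        self.icache[fileid] = path
        return fileid            # the fileid doubles as the filehandle

    def write(self, fh, data):
        path = self.icache.get(fh)
        if path is None:
            # Nothing in cache and no way to search by fileid.
            raise OSError(errno.ESTALE, "stale file handle")
        fileid, old = self.backing[path]
        self.backing[path] = (fileid, old + data)

    def reboot(self):
        # A reboot (or cache pressure) drops the in-memory mapping.
        self.icache.clear()


class IndexedExport(CacheOnlyExport):
    """Toy export that can find a file by its id without the cache."""

    def write(self, fh, data):
        if fh not in self.icache:
            # Stand-in for looking the inode up on disk by number.
            for path, (fileid, _) in self.backing.items():
                if fileid == fh:
                    self.icache[fh] = path
                    break
        super().write(fh, data)
```

With these, a lookup followed by reboot() and then write() raises ESTALE
on CacheOnlyExport but succeeds on IndexedExport, which is the
difference Bruce and Jeff are pointing at.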

-- 
Thanks,

Steve