On Thu, Nov 20, 2008 at 8:39 AM, Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> On Thu, 20 Nov 2008 08:04:08 -0600
> "Steve French" <smfrench@xxxxxxxxx> wrote:
>
>> On Thu, Nov 20, 2008 at 7:02 AM, Jeff Layton <jlayton@xxxxxxxxxx> wrote:
>> > On Wed, 19 Nov 2008 23:24:47 -0600
>> > "Steve French" <smfrench@xxxxxxxxx> wrote:
>> >
>> >> On Wed, Nov 19, 2008 at 6:04 AM, Jeff Layton <jlayton@xxxxxxxxxx> wrote:
>> >> > On Tue, 18 Nov 2008 21:46:59 -0600
>> >> > "Steve French" <smfrench@xxxxxxxxx> wrote:
>> >> >
>> >> >> In hunting down why we could get EBADF returned on close in some
>> >> >> cases after reconnect, I found that cifs_close was checking whether
>> >> >> the share (mounted server export) was valid (didn't need reconnect
>> >> >> due to session crash/timeout), but we weren't checking whether the
>> >> >> handle was valid (i.e. the share was reconnected, but the file
>> >> >> handle was not reopened yet). The patch also adds some locking
>> >> >> around the updates/checks of the cifs_file->invalidHandle flag.
>> >> >
>> >> > Do we need a lock around this check for invalidHandle? Could this
>> >> > race with mark_open_files_invalid()?
>> >>
>> >> The attached patch may reduce the window of opportunity for the
>> >> race you describe. Do you think we need another flag? (One to keep
>> >> requests other than a write retry from using this handle, and one to
>> >> prevent reopen when the handle is about to be closed after we have
>> >> given up on write retries getting through?)
>> >
>> > So that I make sure I understand the problem...
>> >
>> > We have a file that is getting ready to be closed (closePend is set),
>> > but the tcon has been reconnected and the filehandle is now invalid.
>> > You only want to reopen the file in order to flush data out of the
>> > cache, but only if there are actually dirty pages to be flushed.
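As an aside, the invalidHandle race discussed above can be modeled in a few lines. This is a minimal user-space sketch, not the actual CIFS code: the struct layout and helper names here are invented for illustration, with a plain mutex standing in for whatever locking the patch adds, to show why the check and the decision based on it must sit under the same lock as mark_open_files_invalid()'s update.

```c
/* User-space sketch (NOT the real cifs_file / kernel locking) of guarding
 * an invalidHandle-style flag with a lock, so that a validity check in the
 * close path cannot race with mark_open_files_invalid() on reconnect. */
#include <pthread.h>
#include <stdbool.h>

struct cifs_file_sketch {
    pthread_mutex_t lock;   /* stands in for the locking the patch adds */
    bool invalid_handle;    /* handle is stale, needs reopen after reconnect */
    bool close_pend;        /* a close is pending on this handle */
};

/* On session reconnect every open file's server handle becomes stale. */
void mark_open_files_invalid(struct cifs_file_sketch *f)
{
    pthread_mutex_lock(&f->lock);
    f->invalid_handle = true;
    pthread_mutex_unlock(&f->lock);
}

/* True if the handle may be used: valid and not about to be closed.
 * Both flags are read under the same lock that the writer takes. */
bool handle_usable(struct cifs_file_sketch *f)
{
    pthread_mutex_lock(&f->lock);
    bool usable = !f->invalid_handle && !f->close_pend;
    pthread_mutex_unlock(&f->lock);
    return usable;
}
```

Without the lock, a reader could see invalid_handle clear, decide to send a request on the handle, and lose the race with a reconnect that invalidates it in between.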
>> I don't think we have to worry about the normal case of flushing dirty
>> pages; that happens already before we get to cifs_close (fput calls
>> flush/fsync). The case I was thinking about was a write on this handle
>> that has hung, reconnected, and we are waiting for this pending write
>> to complete.
>>
>> > If closePend is set then the userspace filehandle is already dead? No
>> > further pages can be dirtied, right?
>>
>> They could be dirtied from other handles, and writepages picks the
>> first handle that it can, since writepages does not specify which
>> handle to use. (writepages won't pick a handle that is close pending,
>> and it may be ok on retry because we look for a valid handle each time
>> we retry, so we shouldn't pick this one.)
>
> Right, I was assuming that the inode has no other open filehandles...
>
> Even if there are other open filehandles though, we still want to
> flush whatever dirty pages we have, correct? Or at least start
> writeback on them...
>
>> > Rather than a new flag, I suggest checking for whether there are dirty
>> > pages attached to the inode. If so, then we'll want to reopen the file
>> > and flush it before finally closing it.
>>
>> There shouldn't be dirty pages if this is the last handle on the inode
>> being closed.
>
> At the time that the "release" op is called (which is cifs_close in
> this case), there may still be dirty pages, even if this is the last
> filehandle, right?

I don't see how we could have dirty pages on that inode:
filemap_fdatawrite was called (by cifs_flush) before we got to release,
and writes on different handles would not have oplock (if there are any
other handles), so we would call filemap_fdatawrite on each of those
(non-cached) writes on another handle.

> If so then it seems reasonable to just check to see if there are any
> dirty pages, reopen the file and start writeback if so.
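The handle-selection behavior described above (writepages not targeting a particular handle, and a find_writable_file()-style search skipping close-pending and invalid handles on each retry) can be sketched as follows. This is a user-space model, not the kernel code; the struct and function names are illustrative only.

```c
/* User-space sketch (NOT actual kernel code) of find_writable_file()-style
 * selection: writepages does not name a handle, so the search returns the
 * first open handle that is writable, not close-pending, and not invalid.
 * Because the search reruns on every retry, a handle that has since gone
 * close-pending or invalid is naturally skipped the next time around. */
#include <stdbool.h>
#include <stddef.h>

struct open_handle {
    bool writable;        /* opened for write */
    bool close_pend;      /* close is pending: never pick this one */
    bool invalid_handle;  /* stale after reconnect: skip until reopened */
};

/* Return the index of the first usable writable handle, or -1 if none. */
int find_writable_handle(const struct open_handle *h, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (h[i].writable && !h[i].close_pend && !h[i].invalid_handle)
            return (int)i;
    }
    return -1;
}
```

The point of rerunning the search per retry, rather than caching a handle, is exactly the property relied on above: a retried write cannot end up on a handle whose close became pending in the meantime.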
>
> Alternately, I suppose we could consider skipping the reopen/writeback
> if there are other open filehandles for the inode. The idea would be
> that we could assume that the pages would get flushed when the last fh
> is closed. I'm not sure if this violates any close-to-open attribute
> semantics though.

I don't think it matters much. We only have the write pending flag when
we are actually using the file handle for write (find_writable_file
increments it) ... if we time out on a write to that handle, we would
use a different handle or fail.

-- 
Thanks,

Steve
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html