Re: [PATCH] coda: file count cannot be used to discover last close

Jan Harkes <jaharkes@xxxxxxxxxx> · Fri, 20 Jul 2007 10:26:51 -0400

On Fri, Jul 20, 2007 at 06:38:07AM +0100, Al Viro wrote:
> On Fri, Jul 20, 2007 at 12:10:00AM -0400, Jan Harkes wrote:
> > I will try to find a clean way to block the close syscall until fput
> > drops the last reference. However I realize that such an implementation
> > would not be acceptable for other file systems, and there are some
> > interesting unresolved details such as 'which close is going to be the
> > last close'.
> 
> Simply impossible.

As usual you are correct.

Originally Coda only used the CODA_CLOSE upcall, which was called from
fops->release (fput). The problem was that various errors never managed
to make it back to the user. So I split the operation into the part that
does the write back and is the source of most errors (CODA_STORE, called
from fops->flush) and the part that drops the last reference
(CODA_RELEASE, called from fops->release).
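The flush/release split can be sketched in userspace as below. This is a
minimal model, not kernel code: struct model_file and the model_* functions
are hypothetical names, and the refcount stands in for whatever holds a
reference to the open file (descriptors, mappings, in-flight I/O). It only
illustrates that flush runs on every close and can return an error, while
release runs once, possibly after the last close has already returned.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical userspace model; not the kernel's struct file. */
struct model_file {
    int refcount;        /* references held on the open file */
    int flush_calls;     /* how many times the flush hook ran */
    int release_calls;   /* how many times the release hook ran */
    int writeback_error; /* errno the write back would hit, 0 if none */
};

/* Runs on every close(); its return value reaches the caller. */
int model_flush(struct model_file *f)
{
    f->flush_calls++;
    return f->writeback_error ? -f->writeback_error : 0;
}

/* Runs once, when the last reference goes away; it has no caller
 * to hand an error to. */
void model_release(struct model_file *f)
{
    f->release_calls++;
}

/* Drop one reference; release only when it was the last one. */
void model_fput(struct model_file *f)
{
    assert(f->refcount > 0);
    if (--f->refcount == 0)
        model_release(f);
}

/* close(): flush first, then drop this descriptor's reference. */
int model_close(struct model_file *f)
{
    int err = model_flush(f);
    model_fput(f);
    return err;
}
```

With a refcount of 2 (say, one descriptor plus one other holder), a single
model_close() reports the flush error but does not reach release; release
only happens on a later model_fput(), when no close is left to report to.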

I've actually only used the _STORE/_RELEASE upcalls in a development
version, so the old code is not only still around, it is the variant
that is typically used.

I'll submit a patch that removes those upcalls; they will never work the
way I hoped.

> Why does CODA need special warranties in that area, anyway?

I don't think it really needs special warranties. The intent is to write
back data only when the last reference has been released. Ideally we
would like to return errors that occur during this writeback to the
application. Clearly the latter isn't possible.

> Related question: does fsync() force the writeback?

I think it should, but at the moment it does not. My guess on why the
implementation isn't there is that it doesn't mesh well with the
last-writer close model. The cache manager doesn't trigger any write
back operations until the owrite counter drops to 0, and this happens
only after all open-for-write descriptors for a file have been released.
As a result it was a bit of a hack to get the _STORE/_RELEASE variant to
work, because it couldn't rely on the counter dropping to 0. That is one
of the reasons why I never released a Coda client that actually uses
those new upcalls.
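The last-writer close model described above can be sketched as a simple
counter. This is a minimal illustration, not the cache manager's code:
struct model_cache_object and the model_* functions are hypothetical, and
only the owrite name is taken from the description above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a cached file in the cache manager. */
struct model_cache_object {
    int owrite;  /* descriptors currently open for writing */
    int stores;  /* write-back operations triggered so far */
};

/* A new open-for-write descriptor bumps the counter. */
void model_open_write(struct model_cache_object *o)
{
    o->owrite++;
}

/* Releasing a writer triggers a write back only when the counter
 * drops to 0. Returns true if this release triggered one. */
bool model_release_write(struct model_cache_object *o)
{
    assert(o->owrite > 0);
    if (--o->owrite > 0)
        return false;   /* other writers remain; nothing is stored */
    o->stores++;
    return true;
}
```

With two writers, the first release stores nothing and only the second one
triggers the write back, which is why a per-close store does not fit this
model without extra work.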

We do have an upcall that is sent when fsync is called and could use
that. For fsync we could ignore the owrite counter. I am not sure the
in-kernel implementation is correct. I would expect the operation to be
atomic wrt. other writers, so that we can guarantee that the resulting
file actually contains what we expect. Right now it looks like it does
grab the inode mutex when it syncs the file to disk, but releases the
lock before it sends the upcall to the cache manager.
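The ordering concern can be shown with a small deterministic model. It is
only a sketch under assumptions: struct model_container, the version
counters and both fsync_* functions are hypothetical, the "lock" is
represented by comments rather than a real mutex, and the racing writer is
simulated by a callback. It does not claim that holding the inode mutex
across an upcall is practical in the kernel.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the local container file behind a Coda file. */
struct model_container {
    int version;        /* bumped by every write */
    int synced_version; /* version synced to disk under the lock */
    int upcall_version; /* version the cache manager sees at upcall time */
};

typedef void (*writer_fn)(struct model_container *);

/* A concurrent writer modifying the file. */
void model_racing_write(struct model_container *c)
{
    c->version++;
}

/* Ordering as described: sync under the lock, upcall after unlocking.
 * Returns true if the cache manager saw exactly what was synced. */
bool fsync_upcall_after_unlock(struct model_container *c, writer_fn racer)
{
    /* lock taken */
    c->synced_version = c->version;
    /* lock dropped: a writer may slip in here */
    if (racer != NULL)
        racer(c);
    c->upcall_version = c->version;
    return c->upcall_version == c->synced_version;
}

/* Alternative ordering: the upcall happens before the lock is dropped,
 * so the writer can only run afterwards. */
bool fsync_upcall_under_lock(struct model_container *c, writer_fn racer)
{
    /* lock taken */
    c->synced_version = c->version;
    c->upcall_version = c->version;
    /* lock dropped */
    if (racer != NULL)
        racer(c);
    return c->upcall_version == c->synced_version;
}
```

In this model the first ordering lets the cache manager observe a newer
version than the one that was synced, while the second does not.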

Jan
