Re: Consolidate SHA1 object file close

Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> · Wed, 11 Jun 2008 10:25:28 -0700 (PDT)

On Wed, 11 Jun 2008, Pierre Habouzit wrote:
> > 
> > Hmm. Very interesting. That definitely sounds like a cache coherency 
> > issue (ie the "fsck" probably doesn't really _do_ anything, it just 
> > delays things and possibly causes memory pressure to throw some stuff out 
> > of the cache).
> > 
> > What clients, what server?
> 
>   Server uses NFSv3 kernel server from Debian's 2.6.18 etch (up to
> date).

Ok, it's almost impossible for that to be a server-side issue then - I could 
imagine it if you had some fancy cluster server or something, but in 
that kind of straightforward situation the only thing that is going to 
matter is the client-side caching.

I stopped using NFS so long ago that the only case I ever worried about 
and knew anything about was the traditional v2 issues. But iirc, v3 does 
nothing much to change the caching rules (v4, on the other hand, does add 
delegations and explicit caching support).

> Clients are various Ubuntu/Debian systems with at least 2.6.18 kernels, some 
> .22 .24 and .25.

I'll ask Trond if he has any comments on this from the NFS client side. We 
_did_ hit some other NFS client bug with git long ago, I forget what it 
was all about (pread/pwrite?).

What is quite odd, though, is that exactly because of how git works, I 
would normally expect each client to not even *try* to look up objects 
that are written by other clients until long long after they have been 
written.

IOW, access to new objects is not something a git client will do just 
because the object suddenly appears in a directory - after the file is 
written and closed, it will not just be moved to the right position, but 
there have to be *other* files modified (ie the refs) to tell other clients 
about the changes too!
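
To make that ordering concrete, here is a minimal sketch in Python. It is
not git's actual code: real git hashes a header plus the content and
stores the object zlib-compressed, and the function names (write_object,
update_ref, read_via_ref) and the ".lock" temp name are assumptions made
for illustration only. The point is just that the object is complete and
in its final place before any ref mentions it, so a reader that starts
from the ref has no reason to look up the object earlier.

```python
import hashlib
import os
import tempfile


def write_object(objdir, data):
    """Write data to a temp file, close it, then move it to its final name."""
    name = hashlib.sha1(data).hexdigest()  # simplified: raw content hash only
    subdir = os.path.join(objdir, name[:2])
    os.makedirs(subdir, exist_ok=True)
    final = os.path.join(subdir, name[2:])
    fd, tmp = tempfile.mkstemp(dir=subdir, prefix="tmp_obj_")
    with os.fdopen(fd, "wb") as f:  # closes the descriptor on exit
        f.write(data)
    os.rename(tmp, final)  # the object only now appears under its real name
    return name


def update_ref(refpath, name):
    """Point a ref at an object that has already been written."""
    tmp = refpath + ".lock"  # hypothetical temp name for this sketch
    with open(tmp, "w") as f:
        f.write(name + "\n")
    os.rename(tmp, refpath)


def read_via_ref(objdir, refpath):
    """A reader learns the object name from the ref, then opens the object."""
    with open(refpath) as f:
        name = f.read().strip()
    with open(os.path.join(objdir, name[:2], name[2:]), "rb") as f:
        return f.read()
```

A writer would call write_object() first and update_ref() second; another
client only calls something like read_via_ref() after it has seen the new
ref contents.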

And that matters because even though there can be things like local 
directory caches (and Linux does support negative caching - ie the caching 
of the fact that a file was *not* there), those caches should be empty 
simply because other clients that didn't create the file shouldn't even 
have tried to look up non-existent objects!

If it's a directory content caching issue, then adding an fsync() won't 
matter. In fact, the fsync() should matter only if the client who wrote 
the object didn't write it out to the server at close() time, and that 
really sounds very unlikely indeed. So my personal guess is that the 
fsync() won't make any difference at all.
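
For reference, the change being discussed amounts to something like the
following sketch (Python for brevity; the helper name write_and_sync and
its do_fsync parameter are hypothetical, not taken from git). The only
difference between the two modes is the os.fsync() call before close().
If the client already pushes its dirty data to the server when the file
is closed, the two modes should behave the same as far as other clients
are concerned.

```python
import os


def write_and_sync(path, data, do_fsync=True):
    """Create path, write all of data, optionally fsync, then close."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    try:
        view = memoryview(data)
        while len(view) > 0:
            written = os.write(fd, view)  # may write fewer bytes than asked
            view = view[written:]
        if do_fsync:
            os.fsync(fd)  # ask for the data to be flushed before close()
    finally:
        os.close(fd)
```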

Do you have people using special flags for your NFS mounts? And do you 
know if there is some pattern to the client kernel versions when the 
problem happens?

		Linus