Linus Torvalds wrote:
On Tue, 24 Mar 2009, Peter wrote:
What OS? What filesystem? Are you perhaps running out of space?
It's Debian Lenny 5.0.0, and disk space is ample. The filesystem is cifs (this is a
Windows 2000 share mounted with samba in a Debian guest under VMware Workstation
(yes, I know ...)). Memory usage, according to htop, is constant at 140/504 MB
during the whole process until git fails.
Ok, it sounds very much like a possible CIFS problem.
Getting the exact error code for the "close()" will be interesting,
because the only thing that can return an error under Linux in close() is
if the filesystem "->flush()" function fails with an error.
In cifs, the cifs_flush() thing does a filemap_fdatawrite(), forcing the
data out, and that in turn calls do_writepages() which in turn calls
either generic_writepages() or cifs_writepages() depending on random cifs
crap.
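Roughly, the path looks like this (heavily trimmed and from memory of the 2.6.2x
sources, so treat it as a sketch rather than a quote of the real code):

	/* fs/open.c, trimmed: the only per-filesystem hook that can make
	 * close() itself return an error is ->flush(). */
	int filp_close(struct file *filp, fl_owner_t id)
	{
		int retval = 0;

		if (filp->f_op && filp->f_op->flush)
			retval = filp->f_op->flush(filp, id);	/* cifs_flush() on cifs */
		fput(filp);
		return retval;
	}

	/* fs/cifs/file.c, trimmed: cifs_flush() pushes out the dirty pages,
	 * and any writeback error becomes the return value of close(). */
	int cifs_flush(struct file *file, fl_owner_t id)
	{
		struct inode *inode = file->f_path.dentry->d_inode;
		int rc = 0;

		if (file->f_mode & FMODE_WRITE)
			/* -> do_writepages() -> cifs_writepages()
			 *    or generic_writepages() */
			rc = filemap_fdatawrite(inode->i_mapping);
		return rc;
	}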
I'm not seeing any obvious errors there. But I would _not_ be surprised if
the fchmod(fd, 0444) that we did before the close could be causing this:
cifs gets confused and thinks that it must not write to the file because
the file has been turned read-only.
Here's an idea: if this is reproducible for you, does the behavior change
if you do
	[core]
		fsyncobjectfiles = true
in your .git/config file? That causes git to always fsync() the written
data _before_ it does that fchmod(), which in turn means that by the time
the close() rolls around, there should be no data to write, and thus no
possibility that anybody gets confused when there is still dirty data on a
file that has been marked read-only.
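(If editing .git/config by hand is a pain, the same thing can also be set from the
command line with "git config core.fsyncObjectFiles true" - assuming your git is
recent enough to know about the option.)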
Anyway, I'm cc'ing Steve French and Jeff Layton, as the major CIFS go-to
guys. It does seem like a CIFS bug is likely.
Steve, Jeff: git does basically
open(".git/objects/xy/tmp_obj_xyzzy", O_RDWR|O_CREAT|O_EXCL, 0600) = 5
write(5, ".."..., len)
fchmod(5, 0444)
close(5)
link(".git/objects/xy/tmp_obj_xyzzy", ".git/objects/xy/xyzzy");
unlink(".git/objects/xy/tmp_obj_xyzzy");
to write a new datafile. Under CIFS, that "close()" apparently sometimes
returns with an error, and fails.
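If it helps with debugging, here's a minimal stand-alone reproducer of that
sequence (my own sketch, not git code - the file names and sizes are made up)
that prints the errno when the close() fails; run it with the cwd on the cifs
mount:

	#include <stdio.h>
	#include <string.h>
	#include <errno.h>
	#include <fcntl.h>
	#include <unistd.h>
	#include <sys/stat.h>

	int main(void)
	{
		const char *tmp = "tmp_obj_test";	/* made-up names, not git's */
		const char *final = "obj_test";
		char buf[64 * 1024];
		int fd, i;

		memset(buf, 0xab, sizeof(buf));
		fd = open(tmp, O_RDWR | O_CREAT | O_EXCL, 0600);
		if (fd < 0) {
			perror("open");
			return 1;
		}
		/* ~16MB: large enough that dirty data is likely still
		 * pending in the page cache when we get to close() */
		for (i = 0; i < 256; i++) {
			if (write(fd, buf, sizeof(buf)) != (ssize_t)sizeof(buf)) {
				perror("write");
				return 1;
			}
		}
		if (fchmod(fd, 0444) < 0) {
			perror("fchmod");
			return 1;
		}
		if (close(fd) < 0) {
			fprintf(stderr, "close failed: %s (errno=%d)\n",
				strerror(errno), errno);
			return 1;
		}
		if (link(tmp, final) < 0 || unlink(tmp) < 0) {
			perror("link/unlink");
			return 1;
		}
		printf("sequence completed without error\n");
		return 0;
	}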
I guess we could change the "fchmod()" to happen after the close(), just
to make it easier for filesystems to get this right. And yes, as outlined
above, there's a config option to make git fsync() the data before it does
that fchmod() too. But it does seem like CIFS is just buggy.
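In other words the sequence would become something like this (sketch only, not a
patch - git would obviously keep doing this through its own helpers):

	open(".git/objects/xy/tmp_obj_xyzzy", O_RDWR|O_CREAT|O_EXCL, 0600) = 5
	write(5, ".."..., len)
	close(5)
	chmod(".git/objects/xy/tmp_obj_xyzzy", 0444)
	link(".git/objects/xy/tmp_obj_xyzzy", ".git/objects/xy/xyzzy");
	unlink(".git/objects/xy/tmp_obj_xyzzy");

so the mode change happens by path, and the filesystem never sees a mode change
on a file that still has an open, dirty file descriptor.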
If CIFS has problems with the above sequence (say, if some timeout
refreshes the inode data or causes a re-connect with the server or
whatever), then maybe cifs should always do an implicit fdatasync() when
doing fchmod(), just to make sure that the fchmod won't screw up any
cached dirty data?
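Something like this at the top of cifs_setattr(), perhaps (just a sketch of the
idea, not a tested patch, and I haven't checked against what the cifs setattr
path already does):

	/* Sketch only: before letting a mode change go out, write back any
	 * dirty cached data so that later writeback can't get refused
	 * because the file has already been made read-only. */
	int cifs_setattr(struct dentry *direntry, struct iattr *attrs)
	{
		struct inode *inode = direntry->d_inode;
		int rc = 0;

		if (attrs->ia_valid & ATTR_MODE) {
			rc = filemap_write_and_wait(inode->i_mapping);
			if (rc)
				return rc;
		}

		/* ... existing cifs_setattr() logic would follow here ... */
		return rc;
	}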
Linus
Hi,
Thanks a lot; I'll check that out tomorrow. In the meantime, this is the
result of your patch being applied:
$ git add <big stuff>
fatal: error when closing sha1 file (Bad file descriptor)
Peter