RE: Collective wisdom about repos on NFS accessed by concurrent clients (== corruption!?)

> -----Original Message-----
> From: Thomas Rast
> Sent: Saturday, April 06, 2013 4:12
> 
> Kenneth Ölwing <kenneth@xxxxxxxxx> writes:
> 
> > On 2013-04-05 15:42, Thomas Rast wrote:
> >> Can you run the same tests under strace or similar, and gather the 
> >> relevant outputs? Otherwise it's probably very hard to say what is 
> >> going wrong. In particular we've had some reports on lustre that 
> >> boiled down to "impossible" returns from libc functions, not git 
> >> issues. It's hard to say without some evidence.
> > Thomas, thanks for your reply.
> >
> > I'm assuming I should strace the git commands as they're issued? I'm
> > already collecting regular stdout/err output in a log as I go. Is
> > there any debugging things I can turn on to make the calls issue
> > internal tracing of some sort?
> 
> I don't think there's any internal debugging that helps at this point.
> Usually errors pointing to corruption are caused by a chain of
> syscalls failing in some way, and the final error shows only the
> last one, so strace() output is very interesting.
> 
> > The main issue I see is that I suspect it will generate so much data
> > that it'll overflow my disk ;-).
> 
> Well, assuming you have some automated way of detecting when it
> fails, you can just overwrite the same strace output file
> repeatedly; we're only interested in the last one (or all the last
> ones if several gits fail concurrently).
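
A minimal Python sketch of the overwrite-the-same-file idea, assuming a
non-zero exit status is the "automated way of detecting" a failure. The
helper names (build_strace_cmd, run_traced), the keep directory and the
choice of strace flags (-f, -tt, -o) are illustrative assumptions, not
something taken from the test setup discussed here. With several
concurrent clients, each worker would need its own trace path.

```python
import shutil
import subprocess
import time
from pathlib import Path


def build_strace_cmd(cmd, trace_path):
    # -f follows child processes, -tt adds timestamps,
    # -o writes the trace to a file (truncated on every run).
    return ["strace", "-f", "-tt", "-o", str(trace_path), *cmd]


def run_traced(cmd, trace_path, keep_dir, runner=subprocess.call):
    """Run cmd under strace, reusing one trace file; copy it aside on failure."""
    trace_path = Path(trace_path)
    status = runner(build_strace_cmd(cmd, trace_path))
    if status != 0 and trace_path.exists():
        # Assumption: a non-zero exit status marks the run worth keeping.
        keep_dir = Path(keep_dir)
        keep_dir.mkdir(parents=True, exist_ok=True)
        saved = keep_dir / f"strace-{time.time_ns()}.log"
        shutil.copy2(trace_path, saved)
        return status, saved
    return status, None
```

Something like run_traced(["git", "fetch"], "/tmp/git.trace", "/tmp/kept")
would then leave only the traces of failed runs in the keep directory.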

We use tmpwatch for this type of issue, especially with Oracle traces. Set up a
directory for the trace files and tell tmpwatch to delete files older than X.
This keeps the files from piling up, and when you detect a problem, you stop
the cleanup script so the relevant traces are preserved.
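
If tmpwatch is not available, the same pruning can be approximated with
a few lines of Python. This is only a sketch: the function name
prune_old_traces and the "STOP" marker file (a stand-in for "stop the
cleanup script") are made up for illustration, and it looks at file
modification times only.

```python
import time
from pathlib import Path


def prune_old_traces(trace_dir, max_age_seconds, now=None, stop_file="STOP"):
    """Delete regular files in trace_dir older than max_age_seconds (by mtime)."""
    trace_dir = Path(trace_dir)
    # Hypothetical convention: creating trace_dir/STOP pauses the cleanup
    # once a problem has been detected, so the traces survive.
    if (trace_dir / stop_file).exists():
        return []
    now = time.time() if now is None else now
    removed = []
    for entry in trace_dir.iterdir():
        if entry.is_file() and now - entry.stat().st_mtime > max_age_seconds:
            entry.unlink()
            removed.append(entry.name)
    return sorted(removed)
```

Run from cron or a loop, for example prune_old_traces("/var/tmp/traces",
3600) to keep roughly the last hour of traces.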

> 
> Fiddling with strace will unfortunately change the timings somewhat
> (causing a bunch of extra context switches per syscall), but I hope
> that you can still get it to reproduce.



--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-                                                               -
- Jason Pyeron                      PD Inc. http://www.pdinc.us -
- Principal Consultant              10 West 24th Street #100    -
- +1 (443) 269-1555 x333            Baltimore, Maryland 21218   -
-                                                               -
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
This message is copyright PD Inc, subject to license 20080407P00.

 




