Re: Interrupted system call

> [...]
> It would be interesting to know which syscall is
> actually failing. Running the failure case under "strace" would be
> interesting (likewise to see which signal is causing the interruption).
> [...]


First of all, thanks for your help.

GIT_TRACE alone does not tell me anything useful:

$ GIT_TRACE=true git fsck
07:58:47.229138 git.c:442               trace: built-in: git fsck
error: unable to mmap ./objects/cb/fec04963c1090535d2670b741912e17fd27b27: Interrupted system call
error: cbfec04963c1090535d2670b741912e17fd27b27: object corrupt or missing: ./objects/cb/fec04963c1090535d2670b741912e17fd27b27
Checking object directories: 100% (256/256), done.
Checking objects: 100% (70229/70229), done.
Checking connectivity: 75316, done.
missing commit cbfec04963c1090535d2670b741912e17fd27b27
dangling commit 6835e962b227e957520addbc5c28aedc97b253f3
dangling tree a9d1a1321066d8a8402f1c9e584675146d250952


GIT_TRACE_FSMONITOR does not either:

$ GIT_TRACE_FSMONITOR=true git fsck
error: unable to mmap ./objects/56/af267465e7cdb7ccebe8242e55c03d4b675684: Interrupted system call
error: 56af267465e7cdb7ccebe8242e55c03d4b675684: object corrupt or missing: ./objects/56/af267465e7cdb7ccebe8242e55c03d4b675684
Checking object directories: 100% (256/256), done.
Checking objects: 100% (70229/70229), done.
Checking connectivity: 75666, done.
missing tree 56af267465e7cdb7ccebe8242e55c03d4b675684

Both runs used the same Git repository, so it looks like a different, random object file fails on each run.


I managed to make it fail once with:

  strace -f -- git fsck --progress

The signal involved is SIGALRM. I am guessing that Git is setting it up in order to display its progress messages. This is one of the few calls to rt_sigaction(SIGALRM):

rt_sigaction(SIGALRM, {sa_handler=0x556c8ac0fe80, sa_mask=[], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7fbdca7da890}, NULL, 8) = 0


This is the first failure:

openat(AT_FDCWD, "./objects/11/a327f469cc40015d6d873f6eed328e977c4234", O_RDONLY|O_CLOEXEC) = -1 EINTR (Interrupted system call)
--- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} ---
rt_sigreturn({mask=[]})                 = -1 EINTR (Interrupted system call)
openat(AT_FDCWD, "/usr/share/locale/en_US/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/share/locale/en/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/share/locale-langpack/en_US/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/share/locale-langpack/en/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
write(2, "error: unable to mmap ./objects/"..., 99error: unable to mmap ./objects/11/a327f469cc40015d6d873f6eed328e977c4234: Interrupted system call
) = 99
write(2, "error: 11a327f469cc40015d6d873f6"..., 128error: 11a327f469cc40015d6d873f6eed328e977c4234: object corrupt or missing: ./objects/11/a327f469cc40015d6d873f6eed328e977c4234
) = 128


This is the second one:

openat(AT_FDCWD, "./objects/18/5b82729943708795b635899348ecca97aa7804", O_RDONLY|O_CLOEXEC) = -1 EINTR (Interrupted system call)
--- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} ---
rt_sigreturn({mask=[]})                 = -1 EINTR (Interrupted system call)
write(2, "error: unable to mmap ./objects/"..., 99error: unable to mmap ./objects/18/5b82729943708795b635899348ecca97aa7804: Interrupted system call
) = 99
write(2, "error: 185b82729943708795b635899"..., 128error: 185b82729943708795b635899348ecca97aa7804: object corrupt or missing: ./objects/18/5b82729943708795b635899348ecca97aa7804
) = 128

There are a few more failures.

This is the last one. Afterwards, Git exited:

openat(AT_FDCWD, "./objects/f4/56439700761946c57ef467a8a125a80f0304bd", O_RDONLY|O_CLOEXEC) = -1 EINTR (Interrupted system call)
--- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} ---
rt_sigreturn({mask=[]})                 = -1 EINTR (Interrupted system call)
openat(AT_FDCWD, "./objects/pack", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 3
fstat(3, {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
brk(0x556c934af000)                     = 0x556c934af000
getdents(3, /* 19 entries */, 1048576)  = 1272
getdents(3, /* 0 entries */, 1048576)   = 0
close(3)                                = 0
write(2, "fatal: failed to read object f45"..., 95fatal: failed to read object f456439700761946c57ef467a8a125a80f0304bd: Interrupted system call
) = 95
exit_group(128)                         = ?
+++ exited with 128 +++


I am not an expert in Unix signals, but I'll do my best here.

I do not understand why SIGALRM interrupts these calls, given that the handler is installed with SA_RESTART.

Interestingly, the man page signal(7) does list open() under that flag, but not openat().

The description for open() under SA_RESTART is also interesting:

* open(2), if it can block (e.g., when opening a FIFO; see fifo(7)).

I am not sure whether opening a regular disk file qualifies as "can block" under that definition, though.

Best regards,
  rdiez


