Re: Commands failing silently?

Dan Bongert wrote:
Filipe Brandenburger wrote:
Hi,

On Tue, Mar 25, 2008 at 2:21 PM, Dan Bongert <dbongert@xxxxxxxx> wrote:
 thoth(3) /tmp> ls

 thoth(4) /tmp> echo $?
 141

An exit status of 141 means SIGPIPE. If the process is killed by a signal,
the shell reports the return code as 128 + the signal number. 141-128=13,
and kill -l says: 13) SIGPIPE.
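
The arithmetic above can be sketched as a small shell function. This is only an illustration: explain_status is a made-up name, not something from this thread, and it assumes that any status above 128 came from a signal. That is the shell's convention, not a guarantee, since a program can also call exit with such a value itself.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): explain a shell exit status.
# Assumption: a status above 128 means "killed by signal (status - 128)".
explain_status() {
    status=$1
    if [ "$status" -gt 128 ]; then
        signum=$((status - 128))
        # kill -l <number> prints the signal name without the SIG prefix.
        echo "killed by signal $signum (SIG$(kill -l "$signum"))"
    else
        echo "exited normally with status $status"
    fi
}

# The status seen in the transcript above.
explain_status 141
```

Right after a failing command, it could be called as: explain_status $?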

SIGPIPE means that ls wrote to a pipe or socket whose other end had
already been closed. That's really strange, and I couldn't find out why.

I still think strace would be the best way to trace it. Please try:

# rm -f /tmp/ls-strace.txt; strace -o /tmp/ls-strace.txt -tt -s 1024 -f ls --color=tty

Repeat it until ls doesn't print anything. Then open your
/tmp/ls-strace.txt file in less; the last line will probably be
something like +++ killed by SIGPIPE +++. Then try to figure out
what happened just before it got the SIGPIPE. It was probably a "write"
to something, so try to work out which file descriptor. If you can't,
post the last few lines of the file here.
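
The "repeat it until it fails" step could be automated along these lines. This is a rough sketch, not a tested recipe: the function names and the attempt limit are made up for illustration, and it assumes strace writes a "+++ killed by SIGPIPE +++" line to the trace file when the traced command dies that way.

```shell
#!/bin/sh
# Hypothetical helpers (names are not from the thread).

# Succeeds if the given trace file records death by SIGPIPE.
trace_killed_by_sigpipe() {
    grep -q 'killed by SIGPIPE' "$1"
}

# trace_until_sigpipe MAX TRACEFILE CMD [ARGS...]
# Reruns CMD under strace, with the options suggested above, up to MAX
# times. Stops at the first run whose trace shows SIGPIPE and prints the
# last 20 lines of that trace.
trace_until_sigpipe() {
    max=$1
    trace=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        rm -f "$trace"
        strace -o "$trace" -tt -s 1024 -f "$@"
        if trace_killed_by_sigpipe "$trace"; then
            echo "attempt $attempt was killed by SIGPIPE; end of trace:" >&2
            tail -n 20 "$trace"
            return 0
        fi
        attempt=$((attempt + 1))
    done
    echo "no SIGPIPE seen in $max attempts" >&2
    return 1
}

# Example call (commented out so that sourcing this file runs nothing):
# trace_until_sigpipe 50 /tmp/ls-strace.txt ls --color=tty
```

The trace file is removed before every run so that a leftover trace from an earlier attempt cannot be mistaken for a fresh failure.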

I tried it, but as I said before, strace somehow interferes with what's going on. I wasn't able to get a program to fail while running under strace.

Also, can you post the output of this command?
# ls -la /proc/$$/fd/

thoth(265) /tmp> ls -la /proc/$$/fd/

thoth(266) /tmp> ls -la /proc/$$/fd/
total 5
dr-x------  2 dbongert dbongert  0 Mar 27 10:17 .
dr-xr-xr-x  3 dbongert dbongert  0 Mar 27 10:03 ..
lrwx------  1 dbongert dbongert 64 Mar 27 10:17 0 -> /dev/pts/0
lrwx------  1 dbongert dbongert 64 Mar 27 10:17 1 -> /dev/pts/0
lrwx------  1 dbongert dbongert 64 Mar 27 10:17 2 -> /dev/pts/0
lrwx------  1 dbongert dbongert 64 Mar 27 10:17 255 -> /dev/pts/0
lrwx------  1 dbongert dbongert 64 Mar 27 10:17 3 -> socket:[4425494]


Ok, here I am replying to myself. On a lark, I tried to strace a different program, since I couldn't get strace + ls to fail. Here's the end of the output from 'strace w':

connect(4, {sa_family=AF_FILE, path="/var/run/nscd/socket"}, 110) = 0
poll([{fd=4, events=POLLOUT|POLLERR|POLLHUP, revents=POLLOUT|POLLHUP}], 1, 5000) = 1
writev(4, [{"\2\0\0\0\1\0\0\0\2\0\0\0", 12}, {"0\0", 2}], 2) = -1 EPIPE (Broken pipe)
--- SIGPIPE (Broken pipe) @ 0 (0) ---
+++ killed by SIGPIPE +++

Looks like an nscd problem: the write to /var/run/nscd/socket fails with EPIPE. Disabling nscd seems to fix the problem.

--
Dan Bongert                     dbongert@xxxxxxxx

_______________________________________________
CentOS mailing list
CentOS@xxxxxxxxxx
http://lists.centos.org/mailman/listinfo/centos
