On Sun, 2007-11-11 at 23:57 +0100, Jos Vos wrote:
> Hi,
>
> I have a node that has an unkillable (kill -9 doesn't work) clurgmgrd
> running. I have fenced it now for the third time, with the same
> result after startup...
>
> Stracing clustat gives:
>
> [...]
> socket(PF_FILE, SOCK_STREAM, 0) = 5
> connect(5, {sa_family=AF_FILE, path="/var/run/cluster/rgmanager.sk"}, 110) = -1 ENOENT (No such file or directory)
> close(5) = 0
> dup(2) = 5
> fcntl(5, F_GETFL) = 0x8002 (flags O_RDWR|O_LARGEFILE)
> fstat(5, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2aaaaaaac000
> lseek(5, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek)
> write(5, "msg_open: No such file or direct"..., 36msg_open: No such file or directory
> ) = 36
> close(5) = 0
> munmap(0x2aaaaaaac000, 4096) = 0
> [...]
>
> How to get this node back up again???
>
> This is on a RHEL 5.0 clone.

If it's unkillable, it's stuck waiting on the kernel for something.

    echo 1 > /proc/sys/kernel/sysrq
    echo t > /proc/sysrq-trigger
    dmesg > foo.out

reply + attach foo.out ;)

-- Lon

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
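
A possible quick check before sending the full task dump: a process that ignores kill -9 is usually in uninterruptible sleep, which ps shows as state "D". The sketch below lists such tasks together with the kernel function they are sleeping in (the wchan column). The helper names filter_dstate and list_dstate are made up for illustration and are not part of rgmanager or any cluster tool. This assumes a procps-style ps, as found on RHEL.

```shell
#!/bin/sh
# Sketch only: the function names here are hypothetical helpers.

# Keep only rows whose STAT column (field 2) starts with "D",
# i.e. tasks in uninterruptible sleep. The ps header line is
# dropped automatically because its second field is "STAT".
filter_dstate() {
    awk '$2 ~ /^D/'
}

# List every task currently in D state, with its wait channel.
list_dstate() {
    ps -eo pid,stat,wchan:32,comm | filter_dstate
}

# Usage:
#   list_dstate
# If clurgmgrd shows up here, the wchan column hints at what it is
# waiting on; the sysrq-t dump in dmesg gives the full stack trace.
```

An empty result from list_dstate only means no task was in D state at that instant, so the sysrq-t output is still the more complete thing to attach.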