RE: [Xen-devel] Xen blktap driver for Ceph RBD : Anybody wants to test ? :p

> >> > > tapdisk[9180]: segfault at 7f7e3a5c8c10 ip 00007f7e387532d4 sp
> >> > 00007f7e3a5c8c10 error 4 in libpthread-2.13.so[7f7e38748000+17000]
> >> > > tapdisk:9180 blocked for more than 120 seconds.
> >> > > tapdisk         D ffff88043fc13540     0  9180      1 0x00000000
> 
> You can try generating a core file by changing the ulimit on the running
> process
> 
> A backtrace would be useful :)
> 
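
For the record, raising RLIMIT_CORE on the already-running tapdisk from
outside can be done with prlimit(2) - just a sketch, not what I actually
ran:

#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <sys/resource.h>
#include <sys/types.h>

/* Raise RLIMIT_CORE on an already-running process (e.g. tapdisk) so the
 * next segfault leaves a core file, without restarting it. */
int main(int argc, char **argv)
{
    struct rlimit unlimited = { RLIM_INFINITY, RLIM_INFINITY };
    struct rlimit old;
    pid_t pid;

    if (argc != 2) {
        fprintf(stderr, "usage: %s <pid>\n", argv[0]);
        return 1;
    }
    pid = (pid_t)atoi(argv[1]);

    if (prlimit(pid, RLIMIT_CORE, &unlimited, &old) != 0) {
        perror("prlimit");
        return 1;
    }
    printf("RLIMIT_CORE of pid %d raised (old soft limit was %llu)\n",
           (int)pid, (unsigned long long)old.rlim_cur);
    return 0;
}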

I found it was actually dumping core in /, but gdb doesn't seem to work nicely and all I get is this:

warning: Can't read pathname for load map: Input/output error.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Cannot find new threads: generic error
Core was generated by `tapdisk'.
Program terminated with signal 11, Segmentation fault.
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:163
163     ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S: No such file or directory.

Even when I attach to a running process.

One VM segfaults on startup pretty much every time, except never when I attach strace to it, which suggests it's probably a race condition and may not actually be in your code...

> 
> > Actually maybe not. What I was reading only applies for large number of
> > bytes written to the pipe, and even then I got confused by the double
> > negatives. Sorry for the noise.
> 
> Yes, as you discovered, for size < PIPE_BUF they should be atomic even
> in non-blocking mode. But I could still add an assert() there to make
> sure it is.

Nah, I got that completely backwards. I see now that you are only passing a pointer, so yes, the write should always be atomic.
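
To spell out the assumption we're both relying on, here's a minimal sketch (struct rbd_request and the pipe setup are invented for illustration, not the actual driver code): a single pointer is far smaller than PIPE_BUF, so even on a non-blocking pipe the write is all-or-nothing, and an assert() makes that explicit:

#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <unistd.h>

/* Stand-in for the tapdisk/rbd request struct; only its address travels
 * through the pipe, so the payload is sizeof(void *) bytes. */
struct rbd_request { int id; };

/* Write a request pointer to a non-blocking pipe.  Since the payload is
 * smaller than PIPE_BUF, POSIX guarantees the write is all-or-nothing:
 * either every byte is written or it fails with EAGAIN.  The asserts
 * document (and enforce) that assumption. */
static int send_request(int wr_fd, struct rbd_request *req)
{
    ssize_t n = write(wr_fd, &req, sizeof(req));

    if (n < 0) {
        /* Pipe full or interrupted: nothing was written. */
        assert(errno == EAGAIN || errno == EINTR);
        return -1;
    }
    /* A partial pointer would be fatal for the reader, so assert. */
    assert((size_t)n == sizeof(req));
    return 0;
}

int main(void)
{
    int fds[2];
    struct rbd_request req = { 42 };

    if (pipe2(fds, O_NONBLOCK) != 0) {
        perror("pipe2");
        return 1;
    }
    if (send_request(fds[1], &req) == 0)
        printf("queued request %d via an atomic %zu-byte write\n",
               req.id, sizeof(&req));
    return 0;
}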

> I did find a bug where it could "leak" requests, which may lead to a
> hang. But it shouldn't crash ...
> 
> Here's an (as-yet untested) patch in the rbd error path:
> 

I'll try that later this morning when I get a minute.

I've done the poor man's debugger thing and riddled the code with printf()s, but as far as I can determine every routine starts and ends. My thinking at the moment is that it's either a race (the VMs most likely to crash have multiple disks) or a buffer overflow that trips it up either immediately or later.

I have definitely observed multiple VMs crash when something in ceph hiccups (e.g. when I bring a mon up or down), if that helps.

I also followed through the rbd_aio_release idea over the weekend - I can see that if the read returns failure it means the callback was never called, so the release is then the responsibility of the caller.
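
In code terms, my reading of that ownership rule is roughly this - just a sketch against the plain librbd C API, with issue_read()/read_done() as invented names rather than the actual tapdisk code:

#include <rbd/librbd.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Completion callback: runs only if librbd accepted the aio request. */
static void read_done(rbd_completion_t c, void *arg)
{
    ssize_t ret = rbd_aio_get_return_value(c);

    fprintf(stderr, "aio read finished: %zd\n", ret);
    rbd_aio_release(c);   /* normal path: the callback owns the release */
    free(arg);
}

/* Issue an async read.  If rbd_aio_read() itself returns an error, the
 * callback will never fire, so the caller must release the completion
 * here or it leaks. */
static int issue_read(rbd_image_t image, uint64_t off, size_t len)
{
    char *buf = malloc(len);
    rbd_completion_t c;
    int ret;

    if (!buf)
        return -1;

    ret = rbd_aio_create_completion(buf, read_done, &c);
    if (ret < 0) {
        free(buf);
        return ret;
    }

    ret = rbd_aio_read(image, off, len, buf, c);
    if (ret < 0) {
        /* Error path: no callback is coming, release it ourselves. */
        rbd_aio_release(c);
        free(buf);
        return ret;
    }
    return 0;
}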

Thanks

James
