RE: debugging librbd async

> > > Of course, the old standby is to just crank up the logging detail and try
> > > to narrow down where the crash happens.  Have you tried that yet?
> >
> > I haven't touched the rbd code. Is increased logging a compile-time
> > option or a config option?
> 
> That is probably the first thing you should try then.  In the [client] section
> of ceph.conf on the node where tapdisk is running add something like
> 
>  [client]
>   debug rbd = 20
>   debug rados = 20
>   debug ms = 1
>   log file = /var/log/ceph/client.$name.$pid.log
> 
> and make sure the log directory is writeable.
> 

Excellent. How noisy are those levels likely to be?

Is it the consumer of librbd that reads those values? I mean, all I need to do is restart the tapdisk process and the logging should start, right?
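
Before restarting tapdisk, a quick check that the log path from the quoted [client] snippet is usable might look like the sketch below. It is only an illustration: the helper names expand_log_path and log_dir_writable are made up here, and it assumes $name expands to the full client name (for example client.admin) and $pid to the process ID. It should be run as the same user that tapdisk runs as.

```python
import os
import tempfile


def expand_log_path(template, name, pid):
    """Substitute the $name and $pid metavariables in a ceph.conf-style path.

    Hypothetical helper: it only mimics the two metavariables used in the
    quoted config and is not Ceph's own expansion code.
    """
    return template.replace("$name", name).replace("$pid", str(pid))


def log_dir_writable(log_path):
    """Return True if a file can actually be created next to log_path."""
    log_dir = os.path.dirname(log_path) or "."
    if not os.path.isdir(log_dir):
        return False
    try:
        # Creating (and automatically removing) a temp file is a more
        # reliable test than inspecting permission bits.
        with tempfile.NamedTemporaryFile(dir=log_dir):
            pass
    except OSError:
        return False
    return True


if __name__ == "__main__":
    # "client.admin" is an assumed client name; substitute the real one.
    path = expand_log_path(
        "/var/log/ceph/client.$name.$pid.log", "client.admin", os.getpid()
    )
    print(path, "writable" if log_dir_writable(path) else "NOT writable")
```

If it reports NOT writable, the directory either does not exist or the tapdisk user cannot create files in it.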

> > > There is a probable issue with aio_flush and caching enabled that Mike
> > > Dawson is trying to reproduce.  Are you running with caching on or off?
> >
> > I have not enabled caching, and I believe it's disabled by default.
> 
> There is a fix for an aio hang that just hit the cuttlefish branch today
> that could conceivably be the issue.  It causes a hang on qemu but maybe
> tapdisk is more sensitive?  I'd make sure you're running with that in any
> case to rule it out.
> 

I switched to dumpling in the last few days to see if the problem exists there. Is the fix you mention in dumpling? I'm not yet running mission-critical production workloads on Ceph, just a secondary Windows domain controller, a secondary spam filter, and a few other machines that don't affect production if they crash.

I'm also testing under valgrind at the moment, just basic memcheck, but suddenly everything is quite stable even though it's under reasonable load right now. Stupid heisenbugs.

Thanks

James


