> On 7 November 2017 at 10:14, Jan Pekař - Imatic <jan.pekar@xxxxxxxxx> wrote:
>
> Additional info - it is not librbd related. I mapped the disk through rbd map and it was the same - the virtuals were stuck/frozen.
> It happened exactly when this appeared in my log:

Why aren't you using librbd? Is there a specific reason for that? With Qemu/KVM/libvirt I always suggest using librbd.

In addition, what kernel version are you running?

Wido

> Nov 7 10:01:27 imatic-hydra01 kernel: [2266883.493688] libceph: osd6 down
>
> I can attach with strace to the qemu process and I get this running in a loop:
>
> root@imatic-hydra01:/usr/local/libvirt/bin# strace -p 31963
> strace: Process 31963 attached
> ppoll([{fd=3, events=POLLIN}, {fd=5, events=POLLIN}, {fd=7, events=POLLIN}, {fd=8, events=POLLIN}, {fd=45, events=POLLIN}, {fd=46, events=POLLIN}], 6, {tv_sec=0, tv_nsec=355313847}, NULL, 8) = 0 (Timeout)
> poll([{fd=10, events=POLLOUT}], 1, 0) = 1 ([{fd=10, revents=POLLOUT|POLLHUP}])
> ppoll([{fd=3, events=POLLIN}, {fd=5, events=POLLIN}, {fd=7, events=POLLIN}, {fd=8, events=POLLIN}, {fd=45, events=POLLIN}, {fd=46, events=POLLIN}], 6, {tv_sec=1, tv_nsec=0}, NULL, 8) = 0 (Timeout)
> poll([{fd=10, events=POLLOUT}], 1, 0) = 1 ([{fd=10, revents=POLLOUT|POLLHUP}])
> ppoll([{fd=3, events=POLLIN}, {fd=5, events=POLLIN}, {fd=7, events=POLLIN}, {fd=8, events=POLLIN}, {fd=45, events=POLLIN}, {fd=46, events=POLLIN}], 6, {tv_sec=0, tv_nsec=493273904}, NULL, 8) = 0 (Timeout)
> Process 31963 detached
> <detached ...>
>
> Can you please give me brief info on what I should debug and how to do it? I'm a newbie at gdb debugging.
> It is not a problem inside the virtual machine (like the disk not responding), because I can't even get to the VNC console and there is no kernel panic visible on it. Also, I would expect the kernel to keep answering pings even without the disk being available.
>
> Thank you
>
> With regards
> Jan Pekar
>
>
> On 7.11.2017 00:30, Jason Dillaman wrote:
> > If you could install the debug packages and get a gdb backtrace from all threads, it would be helpful. librbd doesn't utilize any QEMU threads, so even if librbd were deadlocked, the worst case I would expect would be your guest OS complaining about hung kernel tasks related to disk IO (since the disk wouldn't be responding).
> >
> > On Mon, Nov 6, 2017 at 6:02 PM, Jan Pekař - Imatic <jan.pekar@xxxxxxxxx> wrote:
> >
> > Hi,
> >
> > I'm using Debian stretch with ceph 12.2.1-1~bpo80+1 and qemu 1:2.8+dfsg-6+deb9u3.
> > I'm running 3 nodes with 3 monitors and 8 OSDs, all on IPv6.
> >
> > When I tested the cluster, I ran into a strange and severe problem.
> > On the first node I'm running qemu guests with a librados disk connection to the cluster, with all 3 monitors listed in the connection.
> > On the second node I stopped the mon and an osd with the command
> >
> > kill -STOP MONPID OSDPID
> >
> > Within one minute all my qemu guests on the first node freeze, so they don't even respond to ping. On the VNC screen there is no error (no disk error or kernel panic); they just hang forever with no console response. Even starting the MON and OSD on the stopped host again doesn't bring them back. Destroying the qemu domain and starting it again is the only solution.
> >
> > This happens even if the virtual machine has all its primary OSDs on OSDs other than the one I stopped - so it is not writing primarily to the stopped OSD.
> >
> > If I stop only the OSD and the MON keeps running, or I stop only the MON and the OSD keeps running, everything looks OK.
> >
> > When I stop the MON and the OSD, I can see "osd.0 1300 heartbeat_check: no reply from ..." in the log, as usual when an OSD fails. During this the virtuals are still running, but after that they all stop.
> >
> > What should I send you to debug this problem? Without fixing this, ceph is not reliable for me.
> >
> > Thank you
> > With regards
> > Jan Pekar
> > Imatic
> >
> > --
> > Jason
>
> --
> ============
> Ing. Jan Pekař
> jan.pekar@xxxxxxxxx | +420603811737
> ----
> Imatic | Jagellonská 14 | Praha 3 | 130 00
> http://www.imatic.cz
> ============
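
A minimal sketch of how the gdb backtrace Jason asked for could be captured, assuming the debug symbol packages for qemu and the ceph libraries (librbd/librados) are installed; the PID 31963 is the one from the strace session above and should be replaced with the PID of the actual frozen qemu process, and the output file name is arbitrary:

    # attach to the running qemu process, print a backtrace of every thread,
    # then detach again (the process is left running, or in this case still frozen)
    gdb -p 31963 -batch -ex "set pagination off" -ex "thread apply all bt" > qemu-backtrace.txt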
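
As a side note on the librbd question raised above, the two ways of attaching an RBD image differ in which client does the work. A rough sketch, where the pool/image name rbd/vm-disk-1 and the client id admin are placeholders and only the relevant qemu drive option is shown:

    # kernel rbd ("rbd map"): the kernel client maps the image and exposes it as /dev/rbdX
    rbd map rbd/vm-disk-1

    # librbd: qemu opens the image itself in userspace via librbd, no kernel client involved
    qemu-system-x86_64 ... -drive format=raw,if=virtio,file=rbd:rbd/vm-disk-1:id=admin:conf=/etc/ceph/ceph.conf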