> -----Original Message-----
> From: Ilya Dryomov [mailto:idryomov@xxxxxxxxx]
> Sent: 30 June 2017 14:06
> To: Nick Fisk <nick@xxxxxxxxxx>
> Cc: Ceph Users <ceph-users@xxxxxxxxxxxxxx>
> Subject: Re: Kernel mounted RBD's hanging
>
> On Fri, Jun 30, 2017 at 2:14 PM, Nick Fisk <nick@xxxxxxxxxx> wrote:
> >
> >> -----Original Message-----
> >> From: Ilya Dryomov [mailto:idryomov@xxxxxxxxx]
> >> Sent: 29 June 2017 18:54
> >> To: Nick Fisk <nick@xxxxxxxxxx>
> >> Cc: Ceph Users <ceph-users@xxxxxxxxxxxxxx>
> >> Subject: Re: Kernel mounted RBD's hanging
> >>
> >> On Thu, Jun 29, 2017 at 6:22 PM, Nick Fisk <nick@xxxxxxxxxx> wrote:
> >> >> -----Original Message-----
> >> >> From: Ilya Dryomov [mailto:idryomov@xxxxxxxxx]
> >> >> Sent: 29 June 2017 16:58
> >> >> To: Nick Fisk <nick@xxxxxxxxxx>
> >> >> Cc: Ceph Users <ceph-users@xxxxxxxxxxxxxx>
> >> >> Subject: Re: Kernel mounted RBD's hanging
> >> >>
> >> >> On Thu, Jun 29, 2017 at 4:30 PM, Nick Fisk <nick@xxxxxxxxxx> wrote:
> >> >> > Hi All,
> >> >> >
> >> >> > Putting out a call for help to see if anyone can shed some light on this.
> >> >> >
> >> >> > Configuration:
> >> >> > Ceph cluster presenting RBDs -> XFS -> NFS -> ESXi. Running 10.2.7 on
> >> >> > the OSDs and a 4.11 kernel on the NFS gateways, which form a pacemaker
> >> >> > cluster. Both OSDs and clients go into a pair of switches, single L2
> >> >> > domain (no sign from pacemaker that there are network connectivity
> >> >> > issues).
> >> >> >
> >> >> > Symptoms:
> >> >> > - All RBDs on a single client randomly hang for 30s to several
> >> >> >   minutes, confirmed by pacemaker and the ESXi hosts complaining
> >> >>
> >> >> Hi Nick,
> >> >>
> >> >> What is a "single client" here?
> >> >
> >> > I mean a node of the pacemaker cluster. So all RBDs on the same
> >> > pacemaker node hang.
> >> >
> >> >> > - Cluster load is minimal when this happens most times
> >> >>
> >> >> Can you post gateway syslog and point at when this happened?
> >> >> Corresponding pacemaker excerpts won't hurt either.
> >> >
> >> > Jun 28 16:35:38 MS-CEPH-Proxy1 lrmd[2026]: warning: p_export_ceph-ds1_monitor_60000 process (PID 17754) timed out
> >> > Jun 28 16:35:43 MS-CEPH-Proxy1 lrmd[2026]: crit: p_export_ceph-ds1_monitor_60000 process (PID 17754) will not die!
> >> > Jun 28 16:43:51 MS-CEPH-Proxy1 lrmd[2026]: warning: p_export_ceph-ds1_monitor_60000:17754 - timed out after 30000ms
> >> > Jun 28 16:43:52 MS-CEPH-Proxy1 IPaddr(p_vip_ceph-ds1)[28482]: INFO: ifconfig ens224:0 down
> >> > Jun 28 16:43:52 MS-CEPH-Proxy1 lrmd[2026]: notice: p_vip_ceph-ds1_stop_0:28482:stderr [ SIOCDELRT: No such process ]
> >> > Jun 28 16:43:52 MS-CEPH-Proxy1 crmd[2029]: notice: Operation p_vip_ceph-ds1_stop_0: ok (node=MS-CEPH-Proxy1, call=471, rc=0, cib-update=318, confirmed=true)
> >> > Jun 28 16:43:52 MS-CEPH-Proxy1 exportfs(p_export_ceph-ds1)[28499]: INFO: Un-exporting file system ...
> >> > Jun 28 16:43:52 MS-CEPH-Proxy1 exportfs(p_export_ceph-ds1)[28499]: INFO: unexporting 10.3.20.0/24:/mnt/Ceph-DS1
> >> > Jun 28 16:43:52 MS-CEPH-Proxy1 exportfs(p_export_ceph-ds1)[28499]: INFO: Unlocked NFS export /mnt/Ceph-DS1
> >> > Jun 28 16:43:52 MS-CEPH-Proxy1 exportfs(p_export_ceph-ds1)[28499]: INFO: Un-exported file system(s)
> >> > Jun 28 16:43:52 MS-CEPH-Proxy1 crmd[2029]: notice: Operation p_export_ceph-ds1_stop_0: ok (node=MS-CEPH-Proxy1, call=473, rc=0, cib-update=319, confirmed=true)
> >> > Jun 28 16:43:52 MS-CEPH-Proxy1 exportfs(p_export_ceph-ds1)[28549]: INFO: Exporting file system(s) ...
> >> > Jun 28 16:43:52 MS-CEPH-Proxy1 exportfs(p_export_ceph-ds1)[28549]: INFO: exporting 10.3.20.0/24:/mnt/Ceph-DS1
> >> > Jun 28 16:43:52 MS-CEPH-Proxy1 exportfs(p_export_ceph-ds1)[28549]: INFO: directory /mnt/Ceph-DS1 exported
> >> > Jun 28 16:43:52 MS-CEPH-Proxy1 crmd[2029]: notice: Operation p_export_ceph-ds1_start_0: ok (node=MS-CEPH-Proxy1, call=474, rc=0, cib-update=320, confirmed=true)
> >> >
> >> > If I enable the read/write checks for the FS resource, they also
> >> > time out at the same time.
> >>
> >> What about syslog that the above corresponds to?
> >
> > I get exactly the same "_monitor" timeout message.
>
> No "libceph: " or "rbd: " messages at all? No WARNs or hung tasks?
>
> > Is there anything logging-wise I can do with the kernel client to log when
> > an IO is taking a long time? Sort of like the slow requests in Ceph, but
> > client side?
>
> Nothing out of the box, as slow requests are usually not the client
> implementation's fault. Can you put together a script that would snapshot
> all files in /sys/kernel/debug/ceph/<cluster-fsid.client-id>/*
> on the gateways every second and rotate on an hourly basis? One of those
> files, osdc, lists in-flight requests. If that's empty when the timeouts
> occur then it's probably not krbd.

I've managed to manually dump osdc when one of the hangs occurred:

cat /sys/kernel/debug/ceph/d027d580-d69d-48f4-9d28-9b1650b57cce.client31526289/osdc
4747768 osd75 17.7366b517 rb.0.4d983.238e1f29.0000000b72da set-alloc-hint,write
4747770 osd75 17.c3a5d697 rbd_data.157b149238e1f29.0000000000000014 set-alloc-hint,write
4747782 osd75 17.7366b517 rb.0.4d983.238e1f29.0000000b72da set-alloc-hint,write
4747792 osd75 17.65154603 rb.0.4d983.238e1f29.000000022551 set-alloc-hint,write
4747793 osd75 17.65154603 rb.0.4d983.238e1f29.000000022551 set-alloc-hint,write
4747803 osd75 17.7366b517 rb.0.4d983.238e1f29.0000000b72da set-alloc-hint,write
4747812 osd75 17.7366b517 rb.0.4d983.238e1f29.0000000b72da set-alloc-hint,write
4747823 osd75 17.7366b517 rb.0.4d983.238e1f29.0000000b72da set-alloc-hint,write
4747830 osd75 17.7366b517 rb.0.4d983.238e1f29.0000000b72da set-alloc-hint,write
4747837 osd75 17.7366b517 rb.0.4d983.238e1f29.0000000b72da set-alloc-hint,write
4747844 osd75 17.7366b517 rb.0.4d983.238e1f29.0000000b72da set-alloc-hint,write

So from what you are saying, this is not a krbd problem, as there are pending
IOs in flight?

> What Maged said, and also can you clarify what those "read/write checks for
> the FS resource" do exactly? read/write to local xfs on /dev/rbd* or further
> up?

The FS checks use dd to write to the filesystem, and then a combination of
test and cat to read back.

>
> Thanks,
>
> Ilya

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
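For anyone following the thread, below is a rough sketch of the kind of
debugfs snapshot script Ilya describes above (copy everything under
/sys/kernel/debug/ceph/ once a second, keep per-hour directories). The
output location, retention window and file naming are assumptions, not
anything specified in the thread, so adjust to taste:

#!/bin/bash
# Sketch: snapshot the kernel client debugfs files (osdc, monc, osdmap, ...)
# once a second and prune on hourly boundaries.
# OUT_BASE and the ~3 hour retention are assumptions; change as needed.
DEBUGFS_DIR=/sys/kernel/debug/ceph
OUT_BASE=/var/log/ceph-debugfs-snapshots

while true; do
    # One directory per hour, one snapshot directory per second.
    snap_dir="$OUT_BASE/$(date +%Y%m%d-%H)/$(date +%H%M%S)"
    mkdir -p "$snap_dir"

    # Copy every file for every <cluster-fsid>.client<id> instance present.
    for f in "$DEBUGFS_DIR"/*/*; do
        [ -f "$f" ] || continue
        cat "$f" > "$snap_dir/$(basename "$(dirname "$f")").$(basename "$f")" 2>/dev/null
    done

    # Crude hourly rotation: drop per-hour directories untouched for ~3 hours.
    find "$OUT_BASE" -mindepth 1 -maxdepth 1 -type d -mmin +180 -exec rm -rf {} + 2>/dev/null

    sleep 1
done

Run it as root on each gateway (debugfs is root-only), e.g. under nohup or a
systemd unit, and when pacemaker next logs a monitor timeout you can compare
the saved osdc files either side of that timestamp to see whether requests
were stuck in flight.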