On Sat, Jul 1, 2017 at 9:29 AM, Nick Fisk <nick@xxxxxxxxxx> wrote:
>> -----Original Message-----
>> From: Ilya Dryomov [mailto:idryomov@xxxxxxxxx]
>> Sent: 30 June 2017 14:06
>> To: Nick Fisk <nick@xxxxxxxxxx>
>> Cc: Ceph Users <ceph-users@xxxxxxxxxxxxxx>
>> Subject: Re: Kernel mounted RBD's hanging
>>
>> On Fri, Jun 30, 2017 at 2:14 PM, Nick Fisk <nick@xxxxxxxxxx> wrote:
>> >> -----Original Message-----
>> >> From: Ilya Dryomov [mailto:idryomov@xxxxxxxxx]
>> >> Sent: 29 June 2017 18:54
>> >> To: Nick Fisk <nick@xxxxxxxxxx>
>> >> Cc: Ceph Users <ceph-users@xxxxxxxxxxxxxx>
>> >> Subject: Re: Kernel mounted RBD's hanging
>> >>
>> >> On Thu, Jun 29, 2017 at 6:22 PM, Nick Fisk <nick@xxxxxxxxxx> wrote:
>> >> >> -----Original Message-----
>> >> >> From: Ilya Dryomov [mailto:idryomov@xxxxxxxxx]
>> >> >> Sent: 29 June 2017 16:58
>> >> >> To: Nick Fisk <nick@xxxxxxxxxx>
>> >> >> Cc: Ceph Users <ceph-users@xxxxxxxxxxxxxx>
>> >> >> Subject: Re: Kernel mounted RBD's hanging
>> >> >>
>> >> >> On Thu, Jun 29, 2017 at 4:30 PM, Nick Fisk <nick@xxxxxxxxxx> wrote:
>> >> >> > Hi All,
>> >> >> >
>> >> >> > Putting out a call for help to see if anyone can shed some light on this.
>> >> >> >
>> >> >> > Configuration:
>> >> >> > Ceph cluster presenting RBD's -> XFS -> NFS -> ESXi.
>> >> >> > Running 10.2.7 on the OSD's and a 4.11 kernel on the NFS gateways, which sit in a pacemaker cluster.
>> >> >> > Both OSD's and clients go into a pair of switches, single L2 domain (no sign from pacemaker that there are network connectivity issues).
>> >> >> >
>> >> >> > Symptoms:
>> >> >> > - All RBD's on a single client randomly hang for 30s to several minutes, confirmed by pacemaker and ESXi hosts complaining
>> >> >>
>> >> >> Hi Nick,
>> >> >>
>> >> >> What is a "single client" here?
>> >> >
>> >> > I mean a node of the pacemaker cluster. So all RBD's on the same pacemaker node hang.
>> >> >
>> >> >> > - Cluster load is minimal when this happens most times
>> >> >>
>> >> >> Can you post gateway syslog and point at when this happened?
>> >> >> Corresponding pacemaker excerpts won't hurt either.
>> >> >
>> >> > Jun 28 16:35:38 MS-CEPH-Proxy1 lrmd[2026]: warning: p_export_ceph-ds1_monitor_60000 process (PID 17754) timed out
>> >> > Jun 28 16:35:43 MS-CEPH-Proxy1 lrmd[2026]: crit: p_export_ceph-ds1_monitor_60000 process (PID 17754) will not die!
>> >> > Jun 28 16:43:51 MS-CEPH-Proxy1 lrmd[2026]: warning: p_export_ceph-ds1_monitor_60000:17754 - timed out after 30000ms
>> >> > Jun 28 16:43:52 MS-CEPH-Proxy1 IPaddr(p_vip_ceph-ds1)[28482]: INFO: ifconfig ens224:0 down
>> >> > Jun 28 16:43:52 MS-CEPH-Proxy1 lrmd[2026]: notice: p_vip_ceph-ds1_stop_0:28482:stderr [ SIOCDELRT: No such process ]
>> >> > Jun 28 16:43:52 MS-CEPH-Proxy1 crmd[2029]: notice: Operation p_vip_ceph-ds1_stop_0: ok (node=MS-CEPH-Proxy1, call=471, rc=0, cib-update=318, confirmed=true)
>> >> > Jun 28 16:43:52 MS-CEPH-Proxy1 exportfs(p_export_ceph-ds1)[28499]: INFO: Un-exporting file system ...
>> >> > Jun 28 16:43:52 MS-CEPH-Proxy1 exportfs(p_export_ceph-ds1)[28499]: INFO: unexporting 10.3.20.0/24:/mnt/Ceph-DS1
>> >> > Jun 28 16:43:52 MS-CEPH-Proxy1 exportfs(p_export_ceph-ds1)[28499]: INFO: Unlocked NFS export /mnt/Ceph-DS1
>> >> > Jun 28 16:43:52 MS-CEPH-Proxy1 exportfs(p_export_ceph-ds1)[28499]: INFO: Un-exported file system(s)
>> >> > Jun 28 16:43:52 MS-CEPH-Proxy1 crmd[2029]: notice: Operation p_export_ceph-ds1_stop_0: ok (node=MS-CEPH-Proxy1, call=473, rc=0, cib-update=319, confirmed=true)
>> >> > Jun 28 16:43:52 MS-CEPH-Proxy1 exportfs(p_export_ceph-ds1)[28549]: INFO: Exporting file system(s) ...
>> >> > Jun 28 16:43:52 MS-CEPH-Proxy1 exportfs(p_export_ceph-ds1)[28549]: INFO: exporting 10.3.20.0/24:/mnt/Ceph-DS1
>> >> > Jun 28 16:43:52 MS-CEPH-Proxy1 exportfs(p_export_ceph-ds1)[28549]: INFO: directory /mnt/Ceph-DS1 exported
>> >> > Jun 28 16:43:52 MS-CEPH-Proxy1 crmd[2029]: notice: Operation p_export_ceph-ds1_start_0: ok (node=MS-CEPH-Proxy1, call=474, rc=0, cib-update=320, confirmed=true)
>> >> >
>> >> > If I enable the read/write checks for the FS resource, they also time out at the same time.
>> >>
>> >> What about syslog that the above corresponds to?
>> >
>> > I get exactly the same "_monitor" timeout message.
>>
>> No "libceph: " or "rbd: " messages at all? No WARNs or hung tasks?
>>
>> > Is there anything logging-wise I can do with the kernel client to log when an IO is taking a long time? Sort of like the slow requests in Ceph, but client side?
>>
>> Nothing out of the box, as slow requests are usually not the client
>> implementation's fault. Can you put together a script that would snapshot
>> all files in /sys/kernel/debug/ceph/<cluster-fsid.client-id>/* on the
>> gateways every second and rotate on an hourly basis? One of those files,
>> osdc, lists in-flight requests. If that's empty when the timeouts occur
>> then it's probably not krbd.
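A minimal sketch of the kind of snapshot loop being suggested here (the
output directory, one-second interval, and daily pruning are assumptions
for illustration, not a script from this thread):

    #!/bin/bash
    # Capture the krbd debugfs state (osdc lists in-flight OSD requests)
    # once per second, grouped into one directory per hour.
    DEBUGDIR=/sys/kernel/debug/ceph          # requires debugfs to be mounted
    OUTBASE=/var/log/ceph-osdc-snapshots     # assumed destination

    while true; do
        hour_dir="$OUTBASE/$(date +%Y%m%d-%H)"
        mkdir -p "$hour_dir"
        ts=$(date +%H%M%S)
        for client in "$DEBUGDIR"/*; do
            [ -d "$client" ] || continue
            for f in "$client"/*; do
                # file name becomes <time>.<fsid>.<client-id>.<file>
                cp "$f" "$hour_dir/${ts}.$(basename "$client").$(basename "$f")" 2>/dev/null
            done
        done
        # prune captures older than a day so they don't fill the disk
        find "$OUTBASE" -mindepth 1 -maxdepth 1 -type d -mmin +1440 -exec rm -rf {} + 2>/dev/null
        sleep 1
    done

Comparing the osdc captures taken across a hang (for example by sorting and
diffing the request IDs in the first column) shows whether the same requests
stay stuck for the whole period or whether new ones keep completing.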
> I've managed to manually dump osdc when one of the hangs occurred:
>
> cat /sys/kernel/debug/ceph/d027d580-d69d-48f4-9d28-9b1650b57cce.client31526289/osdc
> 4747768 osd75 17.7366b517 rb.0.4d983.238e1f29.0000000b72da set-alloc-hint,write
> 4747770 osd75 17.c3a5d697 rbd_data.157b149238e1f29.0000000000000014 set-alloc-hint,write
> 4747782 osd75 17.7366b517 rb.0.4d983.238e1f29.0000000b72da set-alloc-hint,write
> 4747792 osd75 17.65154603 rb.0.4d983.238e1f29.000000022551 set-alloc-hint,write
> 4747793 osd75 17.65154603 rb.0.4d983.238e1f29.000000022551 set-alloc-hint,write
> 4747803 osd75 17.7366b517 rb.0.4d983.238e1f29.0000000b72da set-alloc-hint,write
> 4747812 osd75 17.7366b517 rb.0.4d983.238e1f29.0000000b72da set-alloc-hint,write
> 4747823 osd75 17.7366b517 rb.0.4d983.238e1f29.0000000b72da set-alloc-hint,write
> 4747830 osd75 17.7366b517 rb.0.4d983.238e1f29.0000000b72da set-alloc-hint,write
> 4747837 osd75 17.7366b517 rb.0.4d983.238e1f29.0000000b72da set-alloc-hint,write
> 4747844 osd75 17.7366b517 rb.0.4d983.238e1f29.0000000b72da set-alloc-hint,write
>
> So from what you are saying, this is not a krbd problem, as there are pending IO's in flight?

No -- it's not empty. Do you happen to have more samples from that
particular hang? If these same requests just sit there for minutes, that's
definitely a ceph problem, whether krbd or cluster side.

Thanks,

                Ilya
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com