> -----Original Message-----
> From: Ilya Dryomov [mailto:idryomov@xxxxxxxxx]
> Sent: 01 July 2017 13:19
> To: Nick Fisk <nick@xxxxxxxxxx>
> Cc: Ceph Users <ceph-users@xxxxxxxxxxxxxx>
> Subject: Re: Kernel mounted RBD's hanging
>
> On Sat, Jul 1, 2017 at 9:29 AM, Nick Fisk <nick@xxxxxxxxxx> wrote:
> >> -----Original Message-----
> >> From: Ilya Dryomov [mailto:idryomov@xxxxxxxxx]
> >> Sent: 30 June 2017 14:06
> >> To: Nick Fisk <nick@xxxxxxxxxx>
> >> Cc: Ceph Users <ceph-users@xxxxxxxxxxxxxx>
> >> Subject: Re: Kernel mounted RBD's hanging
> >>
> >> On Fri, Jun 30, 2017 at 2:14 PM, Nick Fisk <nick@xxxxxxxxxx> wrote:
> >> >
> >> >
> >> >> -----Original Message-----
> >> >> From: Ilya Dryomov [mailto:idryomov@xxxxxxxxx]
> >> >> Sent: 29 June 2017 18:54
> >> >> To: Nick Fisk <nick@xxxxxxxxxx>
> >> >> Cc: Ceph Users <ceph-users@xxxxxxxxxxxxxx>
> >> >> Subject: Re: Kernel mounted RBD's hanging
> >> >>
> >> >> On Thu, Jun 29, 2017 at 6:22 PM, Nick Fisk <nick@xxxxxxxxxx> wrote:
> >> >> >> -----Original Message-----
> >> >> >> From: Ilya Dryomov [mailto:idryomov@xxxxxxxxx]
> >> >> >> Sent: 29 June 2017 16:58
> >> >> >> To: Nick Fisk <nick@xxxxxxxxxx>
> >> >> >> Cc: Ceph Users <ceph-users@xxxxxxxxxxxxxx>
> >> >> >> Subject: Re: Kernel mounted RBD's hanging
> >> >> >>
> >> >> >> On Thu, Jun 29, 2017 at 4:30 PM, Nick Fisk <nick@xxxxxxxxxx> wrote:
> >> >> >> > Hi All,
> >> >> >> >
> >> >> >> > Putting out a call for help to see if anyone can shed some light on this.
> >> >> >> >
> >> >> >> > Configuration:
> >> >> >> > Ceph cluster presenting RBD's->XFS->NFS->ESXi
> >> >> >> > Running 10.2.7 on the OSD's and 4.11 kernel on the NFS gateways in a pacemaker cluster
> >> >> >> > Both OSD's and clients go into a pair of switches, single L2 domain (no sign from pacemaker that there are network connectivity issues)
> >> >> >> >
> >> >> >> > Symptoms:
> >> >> >> > - All RBD's on a single client randomly hang for 30s to several minutes, confirmed by pacemaker and ESXi hosts complaining
> >> >> >>
> >> >> >> Hi Nick,
> >> >> >>
> >> >> >> What is a "single client" here?
> >> >> >
> >> >> > I mean a node of the pacemaker cluster. So all RBD's on the same pacemaker node hang.
> >> >> >
> >> >> >>
> >> >> >> > - Cluster load is minimal when this happens most times
> >> >> >>
> >> >> >> Can you post gateway syslog and point at when this happened?
> >> >> >> Corresponding pacemaker excerpts won't hurt either.
> >> >> >
> >> >> > Jun 28 16:35:38 MS-CEPH-Proxy1 lrmd[2026]: warning: p_export_ceph-ds1_monitor_60000 process (PID 17754) timed out
> >> >> > Jun 28 16:35:43 MS-CEPH-Proxy1 lrmd[2026]: crit: p_export_ceph-ds1_monitor_60000 process (PID 17754) will not die!
> >> >> > Jun 28 16:43:51 MS-CEPH-Proxy1 lrmd[2026]: warning: p_export_ceph-ds1_monitor_60000:17754 - timed out after 30000ms
> >> >> > Jun 28 16:43:52 MS-CEPH-Proxy1 IPaddr(p_vip_ceph-ds1)[28482]: INFO: ifconfig ens224:0 down
> >> >> > Jun 28 16:43:52 MS-CEPH-Proxy1 lrmd[2026]: notice: p_vip_ceph-ds1_stop_0:28482:stderr [ SIOCDELRT: No such process ]
> >> >> > Jun 28 16:43:52 MS-CEPH-Proxy1 crmd[2029]: notice: Operation p_vip_ceph-ds1_stop_0: ok (node=MS-CEPH-Proxy1, call=471, rc=0, cib-update=318, confirmed=true)
> >> >> > Jun 28 16:43:52 MS-CEPH-Proxy1 exportfs(p_export_ceph-ds1)[28499]: INFO: Un-exporting file system ...
> >> >> > Jun 28 16:43:52 MS-CEPH-Proxy1 exportfs(p_export_ceph-ds1)[28499]: INFO: unexporting 10.3.20.0/24:/mnt/Ceph-DS1
> >> >> > Jun 28 16:43:52 MS-CEPH-Proxy1 exportfs(p_export_ceph-ds1)[28499]: INFO: Unlocked NFS export /mnt/Ceph-DS1
> >> >> > Jun 28 16:43:52 MS-CEPH-Proxy1 exportfs(p_export_ceph-ds1)[28499]: INFO: Un-exported file system(s)
> >> >> > Jun 28 16:43:52 MS-CEPH-Proxy1 crmd[2029]: notice: Operation p_export_ceph-ds1_stop_0: ok (node=MS-CEPH-Proxy1, call=473, rc=0, cib-update=319, confirmed=true)
> >> >> > Jun 28 16:43:52 MS-CEPH-Proxy1 exportfs(p_export_ceph-ds1)[28549]: INFO: Exporting file system(s) ...
> >> >> > Jun 28 16:43:52 MS-CEPH-Proxy1 exportfs(p_export_ceph-ds1)[28549]: INFO: exporting 10.3.20.0/24:/mnt/Ceph-DS1
> >> >> > Jun 28 16:43:52 MS-CEPH-Proxy1 exportfs(p_export_ceph-ds1)[28549]: INFO: directory /mnt/Ceph-DS1 exported
> >> >> > Jun 28 16:43:52 MS-CEPH-Proxy1 crmd[2029]: notice: Operation p_export_ceph-ds1_start_0: ok (node=MS-CEPH-Proxy1, call=474, rc=0, cib-update=320, confirmed=true)
> >> >> >
> >> >> > If I enable the read/write checks for the FS resource, they also timeout at the same time.
> >> >>
> >> >> What about syslog that the above corresponds to?
> >> >
> >> > I get exactly the same "_monitor" timeout message.
> >>
> >> No "libceph: " or "rbd: " messages at all? No WARNs or hung tasks?
> >>
> >> >
> >> > Is there anything logging-wise I can do with the kernel client to log when an IO is taking a long time? Sort of like the slow requests in Ceph, but client side?
> >>
> >> Nothing out of the box, as slow requests are usually not the client implementation's fault. Can you put together a script that would snapshot all files in /sys/kernel/debug/ceph/<cluster-fsid.client-id>/* on the gateways every second and rotate on an hourly basis? One of those files, osdc, lists in-flight requests. If that's empty when the timeouts occur then it's probably not krbd.
> >
> > I've managed to manually dump osdc when one of the hangs occurred:
> >
> > cat /sys/kernel/debug/ceph/d027d580-d69d-48f4-9d28-9b1650b57cce.client31526289/osdc
> > 4747768  osd75  17.7366b517  rb.0.4d983.238e1f29.0000000b72da  set-alloc-hint,write
> > 4747770  osd75  17.c3a5d697  rbd_data.157b149238e1f29.0000000000000014  set-alloc-hint,write
> > 4747782  osd75  17.7366b517  rb.0.4d983.238e1f29.0000000b72da  set-alloc-hint,write
> > 4747792  osd75  17.65154603  rb.0.4d983.238e1f29.000000022551  set-alloc-hint,write
> > 4747793  osd75  17.65154603  rb.0.4d983.238e1f29.000000022551  set-alloc-hint,write
> > 4747803  osd75  17.7366b517  rb.0.4d983.238e1f29.0000000b72da  set-alloc-hint,write
> > 4747812  osd75  17.7366b517  rb.0.4d983.238e1f29.0000000b72da  set-alloc-hint,write
> > 4747823  osd75  17.7366b517  rb.0.4d983.238e1f29.0000000b72da  set-alloc-hint,write
> > 4747830  osd75  17.7366b517  rb.0.4d983.238e1f29.0000000b72da  set-alloc-hint,write
> > 4747837  osd75  17.7366b517  rb.0.4d983.238e1f29.0000000b72da  set-alloc-hint,write
> > 4747844  osd75  17.7366b517  rb.0.4d983.238e1f29.0000000b72da  set-alloc-hint,write
> >
> > So from what you are saying, this is not a krbd problem as there are pending IO's in flight?
>
> No -- it's not empty. Do you happen to have more samples from that particular hang? If these same requests just sit there for minutes, that's definitely a ceph problem, whether krbd or cluster side.
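For reference, a minimal sketch of the kind of per-second snapshot script Ilya suggests above, assuming bash on the gateways and a simple per-hour directory scheme for rotation (the output path, retention period and file naming below are illustrative assumptions, not anything from this thread):

#!/bin/bash
# Rough sketch of a per-second debugfs snapshotter with crude hourly rotation.
# Assumptions: run as root on each gateway, output under /var/log/ceph-debug,
# keep the last KEEP_HOURS hours of snapshots.

DEBUG_DIR=/sys/kernel/debug/ceph
OUT_BASE=/var/log/ceph-debug
KEEP_HOURS=6

mkdir -p "$OUT_BASE"

while true; do
    hour_dir="$OUT_BASE/$(date +%Y%m%d-%H)"
    mkdir -p "$hour_dir"
    ts=$(date +%H%M%S)

    # Snapshot every file (osdc, monc, osdmap, ...) of every
    # <cluster-fsid.client-id> instance present on this gateway.
    for f in "$DEBUG_DIR"/*/*; do
        [ -f "$f" ] || continue
        client=$(basename "$(dirname "$f")")
        cat "$f" > "$hour_dir/$client.$(basename "$f").$ts" 2>/dev/null
    done

    # Rotation: drop per-hour directories older than KEEP_HOURS.
    find "$OUT_BASE" -mindepth 1 -maxdepth 1 -type d -mmin +$((KEEP_HOURS * 60)) -exec rm -rf {} +

    sleep 1
done

With something like this in place, the osdc snapshots taken during a pacemaker monitor timeout can be lined up against the hang window to see whether the same request tids stay in flight for the whole period, which is the distinction Ilya draws above.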
Managed to catch another one, osd.75 again, not sure if that is an indication of anything or just a co-incidence. osd.75 is one of 8 OSD's in a cache tier, so all IO will be funnelled through them.

cat /sys/kernel/debug/ceph/d027d580-d69d-48f4-9d28-9b1650b57cce.client31443905/osdc
REQUESTS 13 homeless 0
130947221  osd75  17.dbb45597  [75,73,25]/75  [75,73,25]/75  rbd_data.158f204238e1f29.0000000000080171  0x400024  1  0'0  set-alloc-hint,write
130947226  osd75  17.4f47f0c3  [75,14,72]/75  [75,14,72]/75  rbd_data.1555406238e1f29.000000000007c8a9  0x400024  1  0'0  set-alloc-hint,write
130947231  osd75  17.a184a1cc  [75,72,3]/75   [75,72,3]/75   rbd_data.15d8670238e1f29.0000000000064054  0x400024  1  0'0  set-alloc-hint,write
130947274  osd75  17.4d83ed0c  [75,72,3]/75   [75,72,3]/75   rbd_data.1555406238e1f29.000000000007ccc1  0x400024  1  0'0  set-alloc-hint,write
130947349  osd75  17.dbb45597  [75,73,25]/75  [75,73,25]/75  rbd_data.158f204238e1f29.0000000000080171  0x400024  1  0'0  set-alloc-hint,write
130947421  osd75  17.32207383  [75,14,72]/75  [75,14,72]/75  rbd_data.15d8670238e1f29.0000000000000000  0x400024  1  0'0  set-alloc-hint,write
130947472  osd75  17.dbb45597  [75,73,25]/75  [75,73,25]/75  rbd_data.158f204238e1f29.0000000000080171  0x400024  1  0'0  set-alloc-hint,write
130947474  osd75  17.32207383  [75,14,72]/75  [75,14,72]/75  rbd_data.15d8670238e1f29.0000000000000000  0x400024  1  0'0  set-alloc-hint,write
130947689  osd75  17.dbb45597  [75,73,25]/75  [75,73,25]/75  rbd_data.158f204238e1f29.0000000000080171  0x400024  1  0'0  set-alloc-hint,write
130947740  osd75  17.dbb45597  [75,73,25]/75  [75,73,25]/75  rbd_data.158f204238e1f29.0000000000080171  0x400024  1  0'0  set-alloc-hint,write
130947783  osd75  17.dbb45597  [75,73,25]/75  [75,73,25]/75  rbd_data.158f204238e1f29.0000000000080171  0x400024  1  0'0  set-alloc-hint,write
130947826  osd75  17.dbb45597  [75,73,25]/75  [75,73,25]/75  rbd_data.158f204238e1f29.0000000000080171  0x400024  1  0'0  set-alloc-hint,write
130947868  osd75  17.dbb45597  [75,73,25]/75  [75,73,25]/75  rbd_data.158f204238e1f29.0000000000080171  0x400024  1  0'0  set-alloc-hint,write
LINGER REQUESTS
18446462598732840990  osd74  17.145baa0f  [74,72,14]/74  [74,72,14]/74  rbd_header.158f204238e1f29  0x20  0  WC/0
18446462598732840991  osd74  17.7b4e2a06  [74,72,25]/74  [74,72,25]/74  rbd_header.1555406238e1f29  0x20  0  WC/0
18446462598732840992  osd74  17.eea94d58  [74,73,25]/74  [74,73,25]/74  rbd_header.15d8670238e1f29  0x20  0  WC/0

Also found this in the log of osd.75 at the same time, but the client IP is not the same as the node which experienced the hang.

2017-07-05 21:11:18.653607 7f788493d700  0 bad crc in data 849850452 != exp 2621724690
2017-07-05 21:11:18.655004 7f7896fa6700  0 -- 10.3.31.15:6800/31387 submit_message osd_op_reply(125450617 rbd_data.15a5e20238e1f29.000000000023dede [set-alloc-hint object_size 4194304 write_size 4194304,write 1671168~4096] v99810'110406048 uv113132316 ondisk = 0) v7 remote, 10.3.31.43:0/914663476, failed lossy con, dropping message 0x55950d437a00
2017-07-05 21:12:21.450978 7f788493d700  0 bad crc in data 1050156366 != exp 653845871
2017-07-05 21:12:34.515905 7f788493d700  0 bad crc in data 3801395556 != exp 180104028

>
> Thanks,
>
>                 Ilya
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com