On Sat, Dec 12, 2015 at 6:37 PM, Tom Christensen <pavera@xxxxxxxxx> wrote:
> We had a kernel map get hung up again last night/this morning. The rbd is
> mapped but unresponsive; if I try to unmap it I get the following error:
>
>     rbd: sysfs write failed
>     rbd: unmap failed: (16) Device or resource busy
>
> Now that this has happened, attempting to map another RBD fails and lsblk
> fails as well; both of these tasks just hang forever.
>
> We have 1480 OSDs in the cluster, so posting the full osdmap seems
> excessive; here is the beginning (it didn't change over 5 runs):
>
> root@wrk-slc-01-02:~# cat /sys/kernel/debug/ceph/f3b7f409-e061-4e39-b4d0-ae380e29ae7e.client55440310/osdmap
> epoch 1284256
> flags
> pool 0 pg_num 2048 (2047) read_tier -1 write_tier -1
> pool 1 pg_num 512 (511) read_tier -1 write_tier -1
> pool 3 pg_num 2048 (2047) read_tier -1 write_tier -1
> pool 4 pg_num 512 (511) read_tier -1 write_tier -1
> pool 5 pg_num 32768 (32767) read_tier -1 write_tier -1
>
> Here is the osdc output; it did not change over 5 runs:
>
> root@wrk-slc-01-02:~# cat /sys/kernel/debug/ceph/f3b7f409-e061-4e39-b4d0-ae380e29ae7e.client55440310/osdc
> 93835    osd1206  5.6841959c  rbd_data.34df3ac703ced61.0000000000001dff  read
> 9065810  osd1382  5.a50fa0ea  rbd_header.34df3ac703ced61  474103'5506530325561344  watch
> root@wrk-slc-01-02:~# cat /sys/kernel/debug/ceph/f3b7f409-e061-4e39-b4d0-ae380e29ae7e.client55440310/osdc
> 93835    osd1206  5.6841959c  rbd_data.34df3ac703ced61.0000000000001dff  read
> 9067286  osd1382  5.a50fa0ea  rbd_header.34df3ac703ced61  474103'5506530325561344  watch
> root@wrk-slc-01-02:~# cat /sys/kernel/debug/ceph/f3b7f409-e061-4e39-b4d0-ae380e29ae7e.client55440310/osdc
> 93835    osd1206  5.6841959c  rbd_data.34df3ac703ced61.0000000000001dff  read
> 9067831  osd1382  5.a50fa0ea  rbd_header.34df3ac703ced61  474103'5506530325561344  watch
>
> root@wrk-slc-01-02:~# ls /dev/rbd/rbd
> none  volume-daac5f12-e39b-4d64-a4fa-86c810aeb72d
> volume-daac5f12-e39b-4d64-a4fa-86c810aeb72d-part1
>
> root@wrk-slc-01-02:~# rbd info volume-daac5f12-e39b-4d64-a4fa-86c810aeb72d
> rbd image 'volume-daac5f12-e39b-4d64-a4fa-86c810aeb72d':
>         size 61439 MB in 7680 objects
>         order 23 (8192 kB objects)
>         block_name_prefix: rbd_data.34df3ac703ced61
>         format: 2
>         features: layering
>         flags:
>         parent: rbd/volume-93d9a102-260e-4500-b87d-9696c7fc2b67@snapshot-9ba998b6-ca57-40dd-8895-265023132e99
>         overlap: 61439 MB
>
> ceph status indicates the current osdmap epoch:
>
>     osdmap e1284866: 1480 osds: 1480 up, 1480 in
>      pgmap v10231386: 37888 pgs, 5 pools, 745 TB data, 293 Mobjects
>
> root@wrk-slc-01-02:~# uname -r
> 3.19.0-25-generic
>
> So the kernel driver is some 600 epochs behind the current epoch. This does
> seem to be load related: we've been running 4 different kernels on our
> clients in our test environment and have not been able to recreate it there
> in a little over a week, while our production environment has had 2 of
> these hangs in the last 4 days. Unfortunately I wasn't able to grab data
> from the first one.

If you haven't already nuked it, what's the output of:

$ ceph osd map <pool name of pool with id 5> rbd_data.34df3ac703ced61.0000000000001dff
$ ceph osd map <pool name of pool with id 5> rbd_header.34df3ac703ced61

$ ceph daemon osd.1206 ops
$ ceph daemon osd.1206 objecter_requests
$ ceph daemon osd.1206 dump_ops_in_flight
$ ceph daemon osd.1206 dump_historic_ops

and repeat for osd.1382.
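If it helps, something along these lines should collect all of the above in
one pass. This is only a sketch: POOL5 stands in for the (still unknown) name
of the pool with id 5, and the "ceph daemon" calls have to be run on the host
that holds each OSD's admin socket.

    # sketch only: set POOL5 to the actual name of the pool with id 5
    POOL5="<pool name of pool with id 5>"

    ceph osd map "$POOL5" rbd_data.34df3ac703ced61.0000000000001dff
    ceph osd map "$POOL5" rbd_header.34df3ac703ced61

    # dump in-flight/historic ops and objecter state for both OSDs;
    # run each "ceph daemon" invocation on the node hosting that OSD
    for osd in 1206 1382; do
        for cmd in ops objecter_requests dump_ops_in_flight dump_historic_ops; do
            echo "=== osd.$osd: $cmd ==="
            ceph daemon "osd.$osd" "$cmd"
        done
    done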
Thanks,

                Ilya