Re: Kernel mounted RBD's hanging

On Fri, Jul 7, 2017 at 12:10 PM, Nick Fisk <nick@xxxxxxxxxx> wrote:
> Managed to catch another one, osd.75 again, not sure if that is an indication of anything or just a coincidence. osd.75 is one of 8 OSDs in a cache tier, so all IO will be funnelled through them.
>
>
> cat /sys/kernel/debug/ceph/d027d580-d69d-48f4-9d28-9b1650b57cce.client31443905/osdc
> REQUESTS 13 homeless 0
> 130947221       osd75   17.dbb45597     [75,73,25]/75   [75,73,25]/75   rbd_data.158f204238e1f29.0000000000080171       0x400024        1       0'0     set-alloc-hint,write
> 130947226       osd75   17.4f47f0c3     [75,14,72]/75   [75,14,72]/75   rbd_data.1555406238e1f29.000000000007c8a9       0x400024        1       0'0     set-alloc-hint,write
> 130947231       osd75   17.a184a1cc     [75,72,3]/75    [75,72,3]/75    rbd_data.15d8670238e1f29.0000000000064054       0x400024        1       0'0     set-alloc-hint,write
> 130947274       osd75   17.4d83ed0c     [75,72,3]/75    [75,72,3]/75    rbd_data.1555406238e1f29.000000000007ccc1       0x400024        1       0'0     set-alloc-hint,write
> 130947349       osd75   17.dbb45597     [75,73,25]/75   [75,73,25]/75   rbd_data.158f204238e1f29.0000000000080171       0x400024        1       0'0     set-alloc-hint,write
> 130947421       osd75   17.32207383     [75,14,72]/75   [75,14,72]/75   rbd_data.15d8670238e1f29.0000000000000000       0x400024        1       0'0     set-alloc-hint,write
> 130947472       osd75   17.dbb45597     [75,73,25]/75   [75,73,25]/75   rbd_data.158f204238e1f29.0000000000080171       0x400024        1       0'0     set-alloc-hint,write
> 130947474       osd75   17.32207383     [75,14,72]/75   [75,14,72]/75   rbd_data.15d8670238e1f29.0000000000000000       0x400024        1       0'0     set-alloc-hint,write
> 130947689       osd75   17.dbb45597     [75,73,25]/75   [75,73,25]/75   rbd_data.158f204238e1f29.0000000000080171       0x400024        1       0'0     set-alloc-hint,write
> 130947740       osd75   17.dbb45597     [75,73,25]/75   [75,73,25]/75   rbd_data.158f204238e1f29.0000000000080171       0x400024        1       0'0     set-alloc-hint,write
> 130947783       osd75   17.dbb45597     [75,73,25]/75   [75,73,25]/75   rbd_data.158f204238e1f29.0000000000080171       0x400024        1       0'0     set-alloc-hint,write
> 130947826       osd75   17.dbb45597     [75,73,25]/75   [75,73,25]/75   rbd_data.158f204238e1f29.0000000000080171       0x400024        1       0'0     set-alloc-hint,write
> 130947868       osd75   17.dbb45597     [75,73,25]/75   [75,73,25]/75   rbd_data.158f204238e1f29.0000000000080171       0x400024        1       0'0     set-alloc-hint,write
> LINGER REQUESTS
> 18446462598732840990    osd74   17.145baa0f     [74,72,14]/74   [74,72,14]/74   rbd_header.158f204238e1f29      0x20    0       WC/0
> 18446462598732840991    osd74   17.7b4e2a06     [74,72,25]/74   [74,72,25]/74   rbd_header.1555406238e1f29      0x20    0       WC/0
> 18446462598732840992    osd74   17.eea94d58     [74,73,25]/74   [74,73,25]/74   rbd_header.15d8670238e1f29      0x20    0       WC/0
>
> Also found this in the log of osd.75 at the same time, but the client IP is not the same as the node which experienced the hang.
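
To see where the stuck requests pile up in an osdc dump like the one quoted above, they can be tallied per OSD and object. The following is only a minimal sketch: the function name `tally_requests` is hypothetical, and the field layout (tid, osd, pgid, up set, acting set, object name, ...) is inferred from the quoted output rather than from any kernel format specification. The sample contains three request lines and one linger line copied from the dump.

```python
from collections import Counter

# Three request lines and one linger line copied from the dump above.
SAMPLE = """\
REQUESTS 13 homeless 0
130947221       osd75   17.dbb45597     [75,73,25]/75   [75,73,25]/75   rbd_data.158f204238e1f29.0000000000080171       0x400024        1       0'0     set-alloc-hint,write
130947226       osd75   17.4f47f0c3     [75,14,72]/75   [75,14,72]/75   rbd_data.1555406238e1f29.000000000007c8a9       0x400024        1       0'0     set-alloc-hint,write
130947349       osd75   17.dbb45597     [75,73,25]/75   [75,73,25]/75   rbd_data.158f204238e1f29.0000000000080171       0x400024        1       0'0     set-alloc-hint,write
LINGER REQUESTS
18446462598732840990    osd74   17.145baa0f     [74,72,14]/74   [74,72,14]/74   rbd_header.158f204238e1f29      0x20    0       WC/0
"""


def tally_requests(osdc_text):
    """Count in-flight requests per (osd, object) in the REQUESTS section.

    Assumes (from the quoted output only) that field 2 is the target OSD
    and field 6 is the object name. Linger requests are not counted.
    """
    counts = Counter()
    in_requests = False
    for line in osdc_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "REQUESTS":
            # Header line such as "REQUESTS 13 homeless 0".
            in_requests = True
            continue
        if not fields[0].isdigit():
            # Any other header (e.g. "LINGER REQUESTS") ends the section.
            in_requests = False
            continue
        if in_requests and len(fields) >= 6:
            counts[(fields[1], fields[5])] += 1
    return counts


if __name__ == "__main__":
    # Print the busiest (osd, object) pairs first.
    for (osd, obj), n in tally_requests(SAMPLE).most_common():
        print(n, osd, obj)
```

Run against the full dump quoted above, a tally of this kind shows that 8 of the 13 outstanding requests target the same object, rbd_data.158f204238e1f29.0000000000080171, all on osd75.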

Can you bump debug_ms and debug_osd to 30 on osd75?  I doubt it's an
issue with that particular OSD, but if it goes down the same way again,
I'd have something to look at.  Make sure logrotate is configured and
working before doing that though... ;)

Thanks,

                Ilya