On Mon, Dec 7, 2015 at 9:56 PM, Matt Conner <matt.conner@xxxxxxxxxxxxxx> wrote:
> Hi,
>
> We have a Ceph cluster in which we have been having issues with RBD
> clients hanging when an OSD failure occurs. We are using a NAS gateway
> server which maps RBD images to filesystems and serves the filesystems
> out via NFS. The gateway server has close to 180 NFS clients, and
> almost every time even one OSD goes down during heavy load, the NFS
> exports lock up and the clients are unable to access the NAS share via
> NFS. When the OSD fails, Ceph recovers without issue, but the kernel
> RBD module on the gateway appears to get stuck waiting on the now
> failed OSD. Note that everything works correctly under lighter loads.
>
> From what we have been able to determine, the NFS server daemon hangs
> waiting for I/O from the OSD that went down and never recovers.
> Similarly, attempting to access files from the exported filesystem
> locally on the gateway server results in a similar hang. We also
> noticed that "ceph health detail" continues to report blocked I/O on
> the now-down OSD until either the OSD recovers or the gateway server
> is rebooted. Based on a few kernel logs from NFS and PVS, we were able
> to trace the problem to the RBD kernel module.
>
> Unfortunately, the only way we have been able to recover our gateway
> is by hard rebooting the server.
>
> Has anyone else encountered this issue and/or have a possible solution?
> Are there suggestions for getting more detailed debugging information
> from the RBD kernel module?

Dumping osdc, osdmap and ceph status when it gets stuck would be a start:

# cat /sys/kernel/debug/ceph/*/osdmap
# cat /sys/kernel/debug/ceph/*/osdc

$ ceph status

Thanks,

                Ilya
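
For reference, a minimal sketch of a collection script built around those
commands might look like the one below. It assumes debugfs is mounted at
/sys/kernel/debug and that the ceph CLI is configured on the gateway; the
output directory name and file names are arbitrary, not anything the thread
prescribes.

#!/bin/sh
# Sketch: gather kernel client and cluster state for a suspected RBD hang.
# Assumes debugfs is mounted at /sys/kernel/debug and "ceph" works on this
# host; the output location below is only an example.
out=/tmp/rbd-hang-$(date +%Y%m%d-%H%M%S)
mkdir -p "$out"

# Per-client osdmap and in-flight OSD requests from the kernel client.
for d in /sys/kernel/debug/ceph/*/; do
    name=$(basename "$d")
    cat "${d}osdmap" > "$out/$name.osdmap" 2>&1
    cat "${d}osdc"   > "$out/$name.osdc"   2>&1
done

# Cluster-side view, including any blocked/slow request warnings.
ceph status        > "$out/ceph-status.txt"        2>&1
ceph health detail > "$out/ceph-health-detail.txt" 2>&1

echo "collected debug state in $out"

Capturing these at the moment the NFS exports lock up, rather than after a
reboot, is what makes the osdc output useful: it shows which requests the
kernel client still thinks are outstanding and against which OSD.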