RBD hard crash on kernel 3.10

We've been working on a storage repository for XenServer 6.5, which uses the 3.10 kernel (ugh). I got the XenServer developers to include the rbd and libceph kernel modules in the 6.5 release, so those are at least available.

Where things go bad is when we have many (more than 10 or so) VMs on one host, all using RBD clones for storage mapped via the rbd kernel module. The XenServer host crashes so badly that it never even gets a chance to kernel panic; the whole box just hangs.
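For reference, the per-VM pattern described above looks roughly like the following. This is a sketch, not the original tooling; the pool and image names are hypothetical placeholders:

```shell
# Hypothetical illustration of the setup that triggers the hang.
# Each VM disk is a copy-on-write clone of a protected snapshot,
# mapped on the host through the in-kernel rbd driver.
rbd snap create rbd/golden-image@base
rbd snap protect rbd/golden-image@base

# One clone per VM; with >10 of these mapped on a single
# XenServer host, the machine hard-hangs.
for i in $(seq 1 12); do
    rbd clone rbd/golden-image@base rbd/vm-disk-$i
    rbd map rbd/vm-disk-$i          # exposes the clone as /dev/rbdN
done
```

Running these against a 3.10 kernel requires a live Ceph cluster and the rbd/libceph modules loaded; the commands themselves are standard `rbd` CLI usage.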

Has anyone else seen this sort of behavior?  

We have a lot of ways to try to work around this, but none of them are very pretty:

* Move the code to user space and ditch the kernel driver: the build tools for XenServer are all CentOS 5-based, and it is painful to build all of the dependencies needed to get the Ceph user-space libraries built.

* Backport the ceph and rbd kernel modules from a newer kernel to 3.10: this has proven painful, as the block device code changed somewhere in the 3.14-3.16 timeframe.

* Forward-port the Xen kernel patches from 3.10 to a newer kernel (3.18 preferred) and run that kernel on XenServer: painful for the same reasons as above, but in the opposite direction.

Any and all suggestions are welcome.
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
