Re: RBD hard crash on kernel 3.10

On Wed, Apr 8, 2015 at 9:23 PM Christian Balzer <chibi@xxxxxxx> wrote:
On Wed, 08 Apr 2015 14:25:36 +0000 Shawn Edwards wrote:

> We've been working on a storage repository for XenServer 6.5, which uses
> the 3.10 kernel (ugh).  I got the XenServer guys to include the rbd and
> libceph kernel modules in the 6.5 release, so that's at least
> available.
>
Woah, hold on for a moment here.

AFAIK no version of XenServer supports Ceph/RBD, so how are you planning to
integrate that?

Another group in my company uses XenServer and I was hoping to supply them
with cheap(er) storage than what they have now, but the local Citrix
reseller (we have a huge, spendy license) wasn't helpful at all when it
came to getting a feature request for upcoming versions into the pipeline.

And anything with an NFS or iSCSI head on top of Ceph is another can of
worms in terms of complexity and reliability, never mind that cost and
performance will also be impacted by such a kludge.

So is this some work in progress then?

Yes, this is work being done at my employer's.  We've been looking at this as a possibility for a while, and finally had the resources to dedicate to it.
 

> Where things go bad is when we have many (>10 or so) VMs on one host, all
> using RBD clones for their storage, mapped with the rbd kernel module.  The
> XenServer host crashes so badly that it doesn't even get a chance to kernel
> panic.  The whole box just hangs.
>
> Has anyone else seen this sort of behavior?
>
I think that's very much expected with that kernel; you'll probably want
3.18 when using the kernel module, as that is the first release to support
TRIM as well.
Also, the host you're mapping things to isn't a Ceph OSD node as well, right?


Correct.  The XenServer hosts and the Ceph OSD nodes are completely separate machines.
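
In case it helps anyone reproduce this, the per-VM layout is roughly the
sketch below (pool, image, and snapshot names are made up, and it assumes
the golden image and its protected snapshot already exist): each guest disk
is an RBD clone of that snapshot, and every clone gets mapped in dom0
through the krbd module.  Somewhere past ten or so mapped clones with
active I/O, the host locks up.

# Rough sketch only -- illustrative names, not our actual tooling.
import subprocess

import rados
import rbd

POOL = 'xen-sr'          # made-up pool name
GOLDEN = 'golden-image'  # made-up parent image
SNAP = 'base'            # protected snapshot the clones derive from

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
ioctx = cluster.open_ioctx(POOL)
try:
    for i in range(12):  # >10 mapped clones is where the hang shows up
        clone = 'vm-disk-%d' % i
        # Create a copy-on-write clone of the golden image snapshot.
        rbd.RBD().clone(ioctx, GOLDEN, SNAP, ioctx, clone,
                        features=rbd.RBD_FEATURE_LAYERING)
        # Map it through the kernel module on the XenServer dom0; this is
        # the path that eventually hangs the whole box on 3.10.
        subprocess.check_call(['rbd', 'map', '%s/%s' % (POOL, clone)])
finally:
    ioctx.close()
    cluster.shutdown()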
 
> We have a lot of ways to try to work around this, but none of them are
> very pretty:
>
> * move the code to user space, ditch the kernel driver:  The build tools
> for XenServer are all CentOS 5 based, and it is painful to get all of the
> dependencies built so that the Ceph user-space libs can be compiled.
>
I feel your pain; however, experience shows that the user-space code is
better and, most importantly, more quickly maintained when it comes to
vital features.

This is the option I'd go for, everything else being equal.


Yeah, agreed.  This is the best option.  Grumble.  The recent Ceph source really doesn't play well with the older build tools, and the dependency list is staggering.
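
To be clear, the consumer side of that option is simple enough once
librados/librbd and the Python bindings are built; a minimal sketch along
these lines (pool and image names are made up) does I/O without ever
touching the krbd module.  The pain is entirely in getting those libs and
their dependency chain built on the CentOS 5-based toolchain.

# User-space I/O via librados/librbd -- no /dev/rbd* device, so the dom0
# 3.10 kernel stays out of the data path.  Illustrative names only.
import rados
import rbd

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
try:
    ioctx = cluster.open_ioctx('xen-sr')        # made-up pool name
    try:
        image = rbd.Image(ioctx, 'vm-disk-0')   # made-up image name
        try:
            print('size: %d bytes' % image.size())
            block = image.read(0, 4096)         # read 4 KiB at offset 0
            image.write(block, 0)               # write it straight back
        finally:
            image.close()
    finally:
        ioctx.close()
finally:
    cluster.shutdown()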
 
> * backport the ceph and rbd kernel modules to 3.10.  Has proven painful,
> as the block device code changed somewhere in the 3.14-3.16 timeframe.
>
Madness seems to lie down that path.


Agreed.
 
> * forward-port the Xen kernel patches from 3.10 to a newer kernel (3.18
> preferred) and run that on XenServer.  Painful for the same reasons as
> above, but in the opposite direction.
>
Probably the better of the two choices, as you gain many other improvements
as well, including support for newer hardware. ^o^


Agreed as well.  At this point it may be the 'best' option, as it would require the least rewrite of our most current bits, although it would make maintaining Xen painful, since we'd need to be extra careful with vendor patches.
 
Regards,

Christian
--
Christian Balzer        Network/Systems Engineer
chibi@xxxxxxx           Global OnLine Japan/Fusion Communications
http://www.gol.com/
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
