Re: osd become unusable, blocked by xfsaild (?) and load > 5000

It looks like the kernel bug affecting Ceph on XFS has been fixed. I haven't tested it yet, but I wanted to give an update.

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1527062

On Tue, Dec 8, 2015 at 8:05 AM Scottix <scottix@xxxxxxxxx> wrote:
I can confirm it seems to affect kernels newer than 3.16. We had this problem where servers would lock up, and we had to restart them on a weekly basis.
We downgraded to 3.16, and since then we have not had to do any restarts.
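
For anyone wanting to check which nodes fall in the affected range, here is a minimal sketch in Python. It assumes the only criterion is "kernel newer than 3.16", as reported above, and compares just the major and minor version. The function names are made up for illustration, and the cutoff is this thread's observation, not a confirmed boundary of the bug.

```python
import platform
import re


def kernel_major_minor(release):
    """Extract (major, minor) from a release string such as '3.16.0-4-amd64'."""
    match = re.match(r"^(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    return int(match.group(1)), int(match.group(2))


def is_newer_than_3_16(release):
    """True if the kernel is newer than the 3.16 series (assumed cutoff)."""
    return kernel_major_minor(release) > (3, 16)


if __name__ == "__main__":
    # platform.release() returns the same string as `uname -r` on Linux.
    current = platform.release()
    try:
        print(current, "newer than 3.16:", is_newer_than_3_16(current))
    except ValueError as err:
        print(err)
```

Running this on each OSD host gives a quick list of candidates for a downgrade or for the patched kernel.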

I did find this thread in the XFS forums, and I am not sure if it has been fixed or not:
http://oss.sgi.com/archives/xfs/2015-07/msg00034.html


On Tue, Dec 8, 2015 at 2:06 AM Tom Christensen <pavera@xxxxxxxxx> wrote:
We run deep scrubs via cron with a script, so we know when deep scrubs are happening. We've seen nodes fail both during deep scrubbing and while no deep scrubs are occurring, so I'm pretty sure it's not related.


On Tue, Dec 8, 2015 at 2:42 AM, Benedikt Fraunhofer <fraunhofer@xxxxxxxxxx> wrote:
Hi Tom,

2015-12-08 10:34 GMT+01:00 Tom Christensen <pavera@xxxxxxxxx>:

> We didn't go forward to 4.2 as it's a large production cluster, and we just
> needed the problem fixed.  We'll probably test out 4.2 in the next couple

Unfortunately we don't have the luxury of a test cluster.
And to add to that, we couldn't simulate the load, although it does not
seem to be load-related.
Did you try running with nodeep-scrub as a short-term workaround?
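
For reference, the workaround amounts to setting the cluster-wide nodeep-scrub flag with `ceph osd set nodeep-scrub` and clearing it later with `ceph osd unset nodeep-scrub`. Below is a rough sketch of a wrapper in Python. The helper names and the dry-run default are illustrative assumptions, not part of Ceph, and the sketch assumes the `ceph` CLI is on the PATH with admin credentials.

```python
import subprocess
from typing import List


def nodeep_scrub_command(enable: bool) -> List[str]:
    """Build the ceph CLI call that sets or unsets the nodeep-scrub flag."""
    action = "set" if enable else "unset"
    return ["ceph", "osd", action, "nodeep-scrub"]


def apply_nodeep_scrub(enable: bool, dry_run: bool = True) -> str:
    """Return the command line; execute it only when dry_run is False."""
    cmd = nodeep_scrub_command(enable)
    if not dry_run:
        # Raises CalledProcessError if the ceph command fails.
        subprocess.run(cmd, check=True)
    return " ".join(cmd)


if __name__ == "__main__":
    # Dry run: only prints what would be executed.
    print(apply_nodeep_scrub(True))
    print(apply_nodeep_scrub(False))
```

Keeping the dry run as the default avoids changing a production cluster by accident. Remember to unset the flag again, since deep scrubs stay disabled cluster-wide while it is set.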

I'll give ~30% of the nodes 4.2 and see how it goes.

> In our experience it takes about 2 weeks to start happening

We're well below that: somewhere between 1 and 4 days.
And yes, once one goes south, it affects the rest of the cluster.

Thx!

 Benedikt

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com