Re: osd become unusable, blocked by xfsaild (?) and load > 5000

The same thing happens to my setup with CentOS7.x + non-stock kernel (kernel-ml from elrepo).

I was not happy with the IOPS I got out of the stock CentOS 7.x kernel, so I upgraded the kernel. Crashes then started to happen, until some of the OSDs could not be started at all. The funny thing is that I was not able to downgrade back to the stock kernel, since the OSDs were crashing with 'cannot decode' errors. I am doing a backup at the moment, and OSDs still crash from time to time due to the ceph watchdog, despite the x20 timeouts.

I believe the version of kernel-ml I started with was 3.19.


On Tue, Dec 8, 2015 at 10:34 AM, Tom Christensen <pavera@xxxxxxxxx> wrote:
We didn't go forward to 4.2 as it's a large production cluster, and we just needed the problem fixed. We'll probably test out 4.2 in the next couple of months, but this one slipped past us as it didn't occur in our test cluster until after we had upgraded production. In our experience it takes about 2 weeks to start happening, but once it does, it's all hands on deck because nodes are going to go down regularly.

All that being said, if/when we try 4.2, it's going to need to run rock solid for 1-2 months in our test cluster before it gets to production.

On Tue, Dec 8, 2015 at 2:30 AM, Benedikt Fraunhofer <fraunhofer@xxxxxxxxxx> wrote:
Hi Tom,

> We have been seeing this same behavior on a cluster that has been perfectly
> happy until we upgraded to the ubuntu vivid 3.19 kernel.  We are in the

I can't recall when we gave 3.19 a shot, but now that you say it... The
cluster was happy for >9 months with 3.16.
Did you try 4.2, or do you think the regression introduced somewhere
between 3.16 and 3.19 is still in 4.2?

Thx!
   Benedikt

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
