Re: osd become unusable, blocked by xfsaild (?) and load > 5000


 



> On 08 Dec 2015, at 08:57, Benedikt Fraunhofer <fraunhofer@xxxxxxxxxx> wrote:
> 
> Hi Jan,
> 
>> Doesn't look near the limit currently (but I suppose you rebooted it in the meantime?).
> 
> the box these numbers came from has an uptime of 13 days,
> so it's one of the boxes that did survive yesterday's half-cluster-wide reboot.
> 

So this box had no issues? Keep an eye on the number of threads, but maybe others will have a better idea; this is just where I'd start. I have seen close to a million threads from OSDs on my boxes, though I'm not sure what the numbers are now.
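
A rough sketch of how one could watch this, assuming Linux with /proc mounted. The function names are made up for illustration. It counts every thread on the box and compares the total with the tighter of kernel.pid_max and kernel.threads-max, since each thread takes an ID from the PID space:

```python
import os


def read_sysctl_int(name, root="/proc/sys"):
    """Read an integer sysctl such as 'kernel.pid_max' via /proc/sys."""
    path = os.path.join(root, *name.split("."))
    with open(path) as f:
        return int(f.read().split()[0])


def count_threads(proc="/proc"):
    """Count all threads by listing /proc/<pid>/task for every process."""
    total = 0
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue  # skip non-process entries such as 'self' or 'sys'
        try:
            total += len(os.listdir(os.path.join(proc, entry, "task")))
        except OSError:
            continue  # the process exited while we were scanning
    return total


def thread_usage(threads, pid_max, threads_max):
    """Fraction of the tighter of the two limits that is currently in use."""
    return threads / min(pid_max, threads_max)


if __name__ == "__main__" and os.path.isdir("/proc/sys/kernel"):
    n = count_threads()
    pid_max = read_sysctl_int("kernel.pid_max")
    threads_max = read_sysctl_int("kernel.threads-max")
    usage = thread_usage(n, pid_max, threads_max)
    print(f"{n} threads, pid_max={pid_max}, threads-max={threads_max}, "
          f"{usage:.1%} of the tighter limit")
```

With the figures quoted further down (about 17548 threads against pid_max = 65535), this comes out at roughly 27% of the limit.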

>> Did iostat say anything about the drives? (btw, what are dm-1 and dm-6? Are those your data drives?) Were they really overloaded?
> 
> no, they didn't have any load or IOPS.
> Basically the whole box had nothing to do.
> 
> If I understand the load correctly, this just reports threads
> that are ready and willing to work but - in this case -
> don't get any data to work with.

Different unixes calculate this differently :-) By itself "load" is meaningless.
It should be something like an average number of processes that want to run at any given time but can't (because they are waiting for whatever they need - disks, CPU, blocking sockets...).
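
On Linux specifically, the load average counts tasks that are runnable plus tasks in uninterruptible sleep (state D, typically waiting on I/O), so a huge load on an otherwise idle box usually means many threads stuck in D. A minimal sketch for checking that, assuming /proc is available (the function names are only illustrative):

```python
import os


def parse_loadavg(text):
    """Parse the contents of /proc/loadavg into a dict."""
    fields = text.split()
    runnable, total = fields[3].split("/")
    return {
        "load1": float(fields[0]),
        "load5": float(fields[1]),
        "load15": float(fields[2]),
        "runnable": int(runnable),  # scheduling entities currently runnable
        "total": int(total),        # scheduling entities that exist
    }


def task_state(stat_line):
    """Extract the one-letter state from a /proc/<pid>/task/<tid>/stat line.

    The command name sits in parentheses and may itself contain spaces or
    ')', so split on the last ')' rather than on whitespace.
    """
    return stat_line.rsplit(")", 1)[1].split()[0]


def count_states(proc="/proc"):
    """Count threads per state (R = runnable, D = uninterruptible, ...)."""
    counts = {}
    for pid in os.listdir(proc):
        if not pid.isdigit():
            continue
        task_dir = os.path.join(proc, pid, "task")
        try:
            tids = os.listdir(task_dir)
        except OSError:
            continue  # process went away
        for tid in tids:
            try:
                with open(os.path.join(task_dir, tid, "stat")) as f:
                    state = task_state(f.read())
            except (OSError, IndexError):
                continue  # thread went away or line was unreadable
            counts[state] = counts.get(state, 0) + 1
    return counts


if __name__ == "__main__" and os.path.exists("/proc/loadavg"):
    with open("/proc/loadavg") as f:
        print(parse_loadavg(f.read()))
    states = count_states()
    print("R:", states.get("R", 0), "D:", states.get("D", 0))
```

If the D count is in the same ballpark as the load, the threads are blocked in the kernel rather than competing for CPU.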

Jan


> 
> Thx
> 
> Benedikt
> 
> 
> 2015-12-08 8:44 GMT+01:00 Jan Schermer <jan@xxxxxxxxxxx>:
>> 
>> Jan
>> 
>> 
>>> On 08 Dec 2015, at 08:41, Benedikt Fraunhofer <fraunhofer@xxxxxxxxxx> wrote:
>>> 
>>> Hi Jan,
>>> 
>>> we had 65k for pid_max, which made
>>> kernel.threads-max = 1030520.
>>> or
>>> kernel.threads-max = 256832
>>> (looks like it depends on the number of cpus?)
>>> 
>>> currently we have
>>> 
>>> root@ceph1-store209:~# sysctl -a | grep -e thread -e pid
>>> kernel.cad_pid = 1
>>> kernel.core_uses_pid = 0
>>> kernel.ns_last_pid = 60298
>>> kernel.pid_max = 65535
>>> kernel.threads-max = 256832
>>> vm.nr_pdflush_threads = 0
>>> root@ceph1-store209:~# ps axH |wc -l
>>> 17548
>>> 
>>> we'll see how it behaves once puppet has come by and adjusted it.
>>> 
>>> Thx!
>>> 
>>> Benedikt
>> 



