IRQ balancing, distribution

>> but another issue is the OSD processes: do you pin those as well? and
>> how much data do they actually handle. to checksum, the OSD process
>> needs all data, so that can also cause a lot of NUMA traffic, esp if
>> they are not pinned.
>>
> That's why all my (production) storage nodes have only a single 6 or 8
> core CPU. Unfortunately that also limits the amount of RAM in there, 16GB
> modules have just recently become an economically viable alternative to
> 8GB ones.
>
> Thus I don't pin OSD processes, given that on my 8 core nodes with 8 OSDs
> and 4 journal SSDs I can make Ceph eat babies and nearly all CPU (not
> IOwait!) resources with the right (or is that wrong) tests, namely 4K
> FIOs.
>
> The Linux scheduler is usually quite decent at keeping processes where the
> action is, thus you see for example a clear preference of DRBD or KVM vnet
> processes to be "near" or on the CPU(s) where the IRQs are.
the scheduler has improved recently, but i don't know as of which version
(the improvements were certainly not backported to the RHEL6 kernel).

pinning the OSDs might actually be a bad idea, unless the page cache is
flushed before each OSD restart. the kernel VM has this nice "feature"
where allocating memory in a NUMA domain does not trigger freeing of
cache memory in that domain; instead it will first try to allocate
memory on another NUMA domain. although the VM cache will typically be
maxed out on OSD boxes, i'm not sure the cache clearing itself is
NUMA-aware, so who knows where the memory ends up when it's allocated.
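
to get a rough idea of how the page cache is spread over the NUMA
domains before deciding on pinning, something like the sketch below
could help. it is only an illustration: it assumes a linux box that
exposes /sys/devices/system/node/node*/meminfo with the usual
"Node N Field: value kB" lines, and the helper names
(parse_node_meminfo, summarize) are made up for this example.

```python
import glob
import re

# Matches lines such as "Node 0 MemFree:   123456 kB".
# Fields with parentheses in the name, e.g. "Active(anon)", are skipped.
LINE_RE = re.compile(r"^Node\s+(\d+)\s+(\w+):\s+(\d+)(?:\s+kB)?\s*$")


def parse_node_meminfo(text):
    """Parse one node's meminfo text into {field_name: integer_value}."""
    fields = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            fields[match.group(2)] = int(match.group(3))
    return fields


def summarize(fields):
    """Return total, free and page-cache (FilePages) sizes for one node."""
    total = fields.get("MemTotal", 0)
    cache = fields.get("FilePages", 0)
    return {
        "total_kb": total,
        "free_kb": fields.get("MemFree", 0),
        "cache_kb": cache,
        # Share of the node's memory currently holding page cache.
        "cache_pct": round(100.0 * cache / total, 1) if total else 0.0,
    }


if __name__ == "__main__":
    # On a machine without NUMA sysfs entries this loop simply prints nothing.
    for path in sorted(glob.glob("/sys/devices/system/node/node*/meminfo")):
        with open(path) as handle:
            print(path, summarize(parse_node_meminfo(handle.read())))
```

if one node shows almost no free memory and a high cache percentage, a
freshly started process pinned to that node is a likely candidate for
getting its memory from another node. the vm.zone_reclaim_mode sysctl
influences whether the kernel reclaims locally before going off-node,
so it is worth checking that setting alongside these numbers.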


stijn

