On Mon, Mar 23, 2015 at 4:31 AM, fred@xxxxxxxxxx <fred@xxxxxxxxxx> wrote:
> Hi Somnath,
>
> Thank you, please find my answers below.
>
> Somnath Roy <Somnath.Roy@xxxxxxxxxxx> wrote on 22/03/15 18:16:
> > Hi Frederick,
> > Need some information here.
> >
> > 1. Just to clarify, you are saying it is happening in 0.87.1 and not in
> > Firefly?
>
> That's a possibility; others running similar hardware (and possibly the
> same OS, I can ask) confirm they don't see such visible behavior on
> Firefly. I'd need to install Firefly on our hosts to be sure.
> We run on RHEL.
>
> > 2. Is it happening after some hours of run or right away?
>
> It's happening on freshly installed hosts and goes on from there.
>
> > 3. Please provide ‘perf top’ output of all the OSD nodes.
>
> Here they are:
> http://www.4shared.com/photo/S9tvbNKEce/UnevenLoad3-perf.html
> http://www.4shared.com/photo/OHfiAtXKba/UnevenLoad3-top.html
>
> The left-hand 'high-cpu' nodes show tcmalloc calls that could explain the
> CPU difference. We don't see them on the 'low-cpu' nodes:
>
> 12,15%  libtcmalloc.so.4.1.2  [.] tcmalloc::CentralFreeList::FetchFromSpans

Huh. The tcmalloc (memory allocator) workload should be roughly the same
across all nodes, especially if they have equivalent distributions of PGs
and primariness as you describe. Are you sure this is a persistent CPU
imbalance, or are the nodes oscillating? Are there other processes on some
of the nodes which could be requesting memory from the system? Either
you've found a new bug in our memory allocator, or something else is going
on in the system to make it behave differently across your nodes.
-Greg
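
For the persistent-vs-oscillating question above, here is a minimal sketch
of one way to check; it is not from the thread. It samples each OSD's
cumulative CPU time from /proc at fixed intervals and prints the
per-interval CPU share. The "ceph-osd" process name and the 10-second
interval are assumptions; run it side by side on a high-CPU and a low-CPU
node and watch whether the gap holds steady or moves around.

import os
import time

def osd_pids():
    """Return PIDs whose command name is ceph-osd (an assumed name)."""
    pids = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open("/proc/%s/comm" % entry) as f:
                if f.read().strip() == "ceph-osd":
                    pids.append(int(entry))
        except IOError:
            # process exited between listdir() and open(); skip it
            pass
    return pids

def cpu_ticks(pid):
    """Return utime + stime (in clock ticks) from /proc/<pid>/stat."""
    with open("/proc/%d/stat" % pid) as f:
        # split after the ')' so a comm containing spaces cannot shift fields
        fields = f.read().rsplit(")", 1)[1].split()
    return int(fields[11]) + int(fields[12])  # utime, stime

hz = os.sysconf("SC_CLK_TCK")
interval = 10.0  # seconds between samples (assumption)
prev = dict((pid, cpu_ticks(pid)) for pid in osd_pids())
while True:
    time.sleep(interval)
    for pid in sorted(prev):
        try:
            now = cpu_ticks(pid)
        except IOError:
            continue  # OSD restarted or stopped; skip this round
        print("osd pid %d: %5.1f%% CPU over last %ds"
              % (pid, 100.0 * (now - prev[pid]) / hz / interval, interval))
        prev[pid] = now

If the per-OSD percentages stay high on the same nodes across many
intervals, the imbalance is persistent; if the hot nodes trade places, the
load is oscillating and the perf snapshots may just have caught different
phases.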