RE: Performance variation across RBD clients on different pools in all SSD setup - tcmalloc issue

Sage,

We did test with latest version of tcmalloc as well. It exhibited the same behavior.

Viju


-----Original Message-----
From: ceph-devel-owner@xxxxxxxxxxxxxxx [mailto:ceph-devel-owner@xxxxxxxxxxxxxxx] On Behalf Of Sage Weil
Sent: Wednesday, December 03, 2014 9:30 PM
To: Chaitanya Huilgol
Cc: ceph-devel@xxxxxxxxxxxxxxx
Subject: Re: Performance variation across RBD clients on different pools in all SSD setup - tcmalloc issue

On Wed, 3 Dec 2014, Chaitanya Huilgol wrote:
> (2)  tcmalloc - Client 2 (note the significant increase in tcmalloc internal free-to-central-list code paths)
>
>  14.75%  libtcmalloc.so.4.1.2     [.] tcmalloc::CentralFreeList::FetchFromSpans()
>   7.46%  libtcmalloc.so.4.1.2     [.] tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*, unsigned long, int)
>   6.71%  libtcmalloc.so.4.1.2     [.] tcmalloc::CentralFreeList::ReleaseToSpans(void*)
>   1.68%  libtcmalloc.so.4.1.2     [.] operator new(unsigned long)
>   1.57%  ceph-osd                 [.] crush_hash32_3

Yikes!
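
Those FetchFromSpans/ReleaseToCentralCache hot spots are the worker thread caches spilling to and refilling from tcmalloc's central free list.  One knob worth experimenting with is the aggregate thread-cache budget; the sketch below uses the gperftools MallocExtension interface, and the 128 MB figure is purely illustrative rather than anything validated in this thread.

// Sketch: raise tcmalloc's total thread-cache budget so worker threads
// spill to the central free list less often. Assumes gperftools headers
// are installed and the binary links against libtcmalloc.
#include <cstddef>
#include <cstdio>
#include <gperftools/malloc_extension.h>

int main() {
  const char* kProp = "tcmalloc.max_total_thread_cache_bytes";
  size_t cur = 0;
  if (MallocExtension::instance()->GetNumericProperty(kProp, &cur))
    std::printf("current thread-cache budget: %zu bytes\n", cur);

  // 128 MB is an illustrative value, not a recommendation from this thread.
  const size_t target = 128u << 20;
  if (!MallocExtension::instance()->SetNumericProperty(kProp, target))
    std::fprintf(stderr, "failed to set %s\n", kProp);
  return 0;
}

The same budget can also be set from the environment with TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES before starting the OSD, which avoids touching code at all.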

> IMHO, we should probably look at the following in general for better
> performance with less variation
>
> - Add jemalloc option for ceph builds

Definitely.

Several years ago we saw serious heap fragmentation issues with glibc.  I suspect newer versions are less problematic.  It may also be that newer versions of tcmalloc behave better (not sure if we're linking against the latest version?).  In any case, we should have build support for all options.  We'll need to be careful when making a change, though.  The best choice may also vary on a per-distro basis.
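
Once builds can link any of the allocators, it is easy to lose track of which one a given binary actually ended up with.  Below is a small sketch for confirming jemalloc at runtime, assuming the binary is linked with -ljemalloc and <jemalloc/jemalloc.h> is on the include path; it is not part of any Ceph build target.

// Sketch: query jemalloc's "version" control to prove it is the active
// allocator. If jemalloc is not linked, this translation unit fails to
// link, which is itself a useful signal.
#include <cstddef>
#include <cstdio>
#include <jemalloc/jemalloc.h>

int main() {
  const char* version = nullptr;
  size_t len = sizeof(version);
  if (mallctl("version", &version, &len, nullptr, 0) == 0)
    std::printf("jemalloc version: %s\n", version);
  else
    std::printf("mallctl failed; jemalloc may be misconfigured\n");
  return 0;
}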

> - Look at ways to evenly distribute PGs across the shards - with a
> larger number of shards, some shards do not get exercised at all while
> others are overloaded

Ok
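
To make the shard-imbalance point concrete, here is a toy simulation of a static PG-to-shard mapping; the modulo assignment and the shard/PG/op counts are illustrative assumptions, not the OSD's actual sharded work-queue code.

// Toy simulation: with a fixed pg -> shard mapping, whatever load the hot
// PGs generate lands on their shards while other shards sit mostly idle.
#include <cstdio>
#include <vector>

int main() {
  const unsigned num_shards = 20;   // e.g. a large osd_op_num_shards setting
  const unsigned num_pgs    = 40;   // PGs this OSD happens to host
  std::vector<unsigned long> ops_per_shard(num_shards, 0);

  for (unsigned pgid = 0; pgid < num_pgs; ++pgid) {
    // Pretend a handful of PGs are hot; client workloads rarely spread
    // evenly, and everything a hot PG does is pinned to one shard.
    unsigned long ops = (pgid % 10 == 0) ? 10000 : 100;
    ops_per_shard[pgid % num_shards] += ops;
  }

  for (unsigned s = 0; s < num_shards; ++s)
    std::printf("shard %2u: %6lu queued ops\n", s, ops_per_shard[s]);
  return 0;
}

With those numbers, two shards end up with 20000 queued ops each while the rest see 200, which matches the "some shards are never exercised, some are overloaded" observation; any rebalancing scheme has to account for PG activity, not just PG count.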

> - Look at decreasing heap activity in the I/O path (Index Manager,
> Hash Index, LFN Index, etc.)

Yes.  Unfortunately I think this is a long tail... lots of small changes needed before we'll see much impact.
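
As one illustration of the kind of small change involved, the sketch below reuses scratch storage across lookups instead of allocating fresh containers per request; the IndexLookup type and function names are invented for this example and are not the actual FileStore index code.

// Sketch: recycle a caller-owned scratch object so the vector's buffer is
// allocated once and reused, instead of hitting the heap on every op.
#include <string>
#include <vector>

struct IndexLookup {
  std::vector<std::string> path_components;  // refilled on every lookup
};

void lookup(IndexLookup& out, const std::string& name) {
  out.path_components.clear();          // keeps capacity; no free/alloc churn
  out.path_components.reserve(8);       // typical depth; grows only if deeper
  out.path_components.push_back(name);  // real code would split the name here
}

int main() {
  IndexLookup scratch;                  // allocated once, reused below
  for (int i = 0; i < 1000; ++i)
    lookup(scratch, "object_" + std::to_string(i));
  return 0;
}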

sage


--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



