Re: Switching from tcmalloc

Hi Mark,
>>It would be really interesting if you could give jemalloc a try. 

I have run a lot of benchmarks with the OSD built against jemalloc (--with-jemalloc), using libjemalloc1 (= 3.5.1-2) on Ubuntu Trusty.

Mainly 4k randread and randwrite,
and I haven't seen any problems, hangs, or bugs.

The speed was about the same as with tcmalloc, maybe a little bit slower, but the difference is marginal.

For reads, I was at around 250k iops per OSD with jemalloc vs 260k iops with tcmalloc.
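
For scale, the gap between those two read figures works out to just under 4%. A minimal sketch of the arithmetic in Python, using only the two IOPS numbers quoted above (the variable names are illustrative, not from any Ceph tool):

```python
# Read IOPS per OSD, taken from the two figures quoted above.
jemalloc_iops = 250_000
tcmalloc_iops = 260_000

# Relative slowdown of jemalloc compared with tcmalloc, in percent.
slowdown_pct = (tcmalloc_iops - jemalloc_iops) / tcmalloc_iops * 100

print(f"jemalloc is about {slowdown_pct:.1f}% slower than tcmalloc on 4k randread")
```

This only restates the single pair of measurements given here; it says nothing about variance between runs.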

----- Original Message -----
From: "Mark Nelson" <mnelson@xxxxxxxxxx>
To: "ceph-users" <ceph-users@xxxxxxxxxxxxxx>
Sent: Thursday, 25 June 2015 18:25:26
Subject: Re: Switching from tcmalloc

It would be really interesting if you could give jemalloc a try. 
Originally tcmalloc was used to get around some serious memory 
fragmentation issues in the OSD. You can read the original bug tracker 
entry from 5 years ago here: 

http://tracker.ceph.com/issues/138 

It's definitely possible that glibc malloc has improved since back then. 
I think jemalloc is definitely worth considering. It appears to be a 
little slower than tcmalloc when tcmalloc is working well, but far more 
consistent and likely faster than glibc. 

Mark 

On 06/24/2015 12:59 PM, Jan Schermer wrote: 
> We already had the migratepages in place before we disabled tcmalloc. It 
> didn’t do much. 
> 
> Disabling tcmalloc made an immediate difference but there were still spikes 
> and the latency wasn’t that great. (CPU usage was) 
> Migrating memory helped a lot after that - it didn’t help (at least not 
> visibly on the graphs) when tcmalloc was used - its overhead was so 
> large that NUMA didn’t matter at all. 
> But we are running Dumpling, so it is possible other bottlenecks that 
> were resolved in later (Giant) releases would once again overshadow the 
> gain we got from disabling tcmalloc or there would be a regression from 
> disabling it. 
> … or our setup/workload is somehow completely different from what 
> somebody else has? 
> 
> Jan 
> 
>> On 24 Jun 2015, at 19:41, Robert LeBlanc <robert@xxxxxxxxxxxxx 
>> <mailto:robert@xxxxxxxxxxxxx>> wrote: 
>> 
>> -----BEGIN PGP SIGNED MESSAGE----- 
>> Hash: SHA256 
>> 
>> From what I understand, you probably got most of your reduction from co-locating your memory on the right NUMA nodes. tcmalloc/jemalloc should be much higher in performance because of how they hold memory in thread pools (less locking to allocate memory), and they try much harder to reuse dirty free pages so memory stays within the thread, again reducing locking for memory allocations. 
>> 
>> I would do some more testing along with what Ben Hines mentioned about overall client performance. 
>> 
>> - ---------------- 
>> Robert LeBlanc 
>> GPG Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 
>> 
>> On Wed, Jun 24, 2015 at 11:25 AM, Jan Schermer wrote: 
>> There were essentially three things we had to do for such a drastic drop 
>> 
>> 1) recompile CEPH --without-tcmalloc 
>> 2) pin the OSDs to a specific NUMA zone - we had this for a long time and it really helped 
>> 3) migrate the OSD memory to the correct CPU with migratepages 
>> - we will use cgroups in the future for this, should make life easier and is the only correct solution 
>> 
>> It is similar to the effect of just restarting the OSD, but much better. Since we immediately see hundreds of connections on a freshly restarted OSD (and in the benchmark the tcmalloc issue manifested with just two clients in parallel), I’d say we never saw the raw (undegraded) performance with tcmalloc, but it was never this good: consistently low latencies, much smaller spikes when something happens and much lower CPU usage (about 50% savings, but we’re also backfilling a lot in the background). Workloads are faster as well: reweighting OSDs on that same node, for example, was much (hundreds of percent) faster. 
>> 
>> So far the effect has been drastic. I wonder why tcmalloc was even used when people are having problems with it? The glibc malloc seems to work just fine for us. 
>> 
>> The only concerning thing is the virtual memory usage - we are over 400GB VSS with a few OSDs. That doesn’t hurt anything, though. 
>> 
>> Jan 
>> 
>> 
>> On 24 Jun 2015, at 18:46, Robert LeBlanc wrote: 
>> 
>> - -----BEGIN PGP SIGNED MESSAGE----- 
>> Hash: SHA256 
>> 
>> Did you see what the effect was of just restarting the OSDs before using tcmalloc? I've noticed that there is usually a good drop for us just by restarting them. I don't think it is usually this drastic. 
>> 
>> - - ---------------- 
>> Robert LeBlanc 
>> GPG Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 
>> 
>> On Wed, Jun 24, 2015 at 2:08 AM, Jan Schermer wrote: 
>> Can you guess when we did that? 
>> Still on dumpling, btw... 
>> 
>> http://www.zviratko.net/link/notcmalloc.png 
>> 
>> Jan 
>> 
>> _______________________________________________ 
>> ceph-users mailing list 
>> ceph-users@xxxxxxxxxxxxxx <mailto:ceph-users@xxxxxxxxxxxxxx> 
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com 
>> 
>> 
>> - -----BEGIN PGP SIGNATURE----- 
>> Version: Mailvelope v0.13.1 
>> Comment:https://www.mailvelope.com <https://www.mailvelope.com/> 
>> 
>> wsFcBAEBCAAQBQJVit75CRDmVDuy+mK58QAAmjcP/jU+wyohdwKDP+FHDAgJ 
>> DcqdB5aPG2AM79iLcYUub5bQjdNJpcWN/hyZcNdF3aSzEV3aY6jIqu9OpOIB 
>> c2fIzfGOoczzW/FEf7qKRVGpxaQL21Sw1LpwMEscNe0ETz9HMHoaAnBO9IFn 
>> nUEOCdEpRBO5W1rWwNAx9EVnOUPklb7vVEpY23sgtHhQSprb9oeO8D99AMRz 
>> /RhdHKlRDgHBjun/stCiR6lFuvBUx0GBmyaMuO5rfsLGRIkySLv++3CLQI6X 
>> NCt/MjYwTTNNfO/y/MjkiV/j+Cm1G1lcjlgbDjilf7bgf8/7W2vJa1sMtaA4 
>> xJL+PpZxiKcGSdC96B+EBYxLhLcwsNpbfq7uxQOkIspa66mkIMAVzJgt4DFL 
>> Ca+UY3ODA26VtWF5U/hkdupgld+YSxXTyJakeShrBSFAX0a4cygV9Ll7SIhO 
>> IDS+0Mbur0IGzIWRgtCQhRXsc7wn3IoIovqe8Nfk4xupeoK2P5UHO1rW9pWy 
>> Jwj5PXieDqxgx8RKlulN1bCbSgTaEdveTiqqVxlnM9L0MhgesuB8vkpHbsqn 
>> mYJHNzU7ghU89xLnRuia9rBlpjw4OzagfowAJTH3UnaO67kxES+IWO8onQbN 
>> RhY0QR5cB5rVSjYkzzlsuLM17fQPcT8++yMarKdsrr6WIGppXUFFdATAqIaY 
>> DHD1 
>> =goL4 
>> - -----END PGP SIGNATURE----- 
>> 
>> 
>> -----BEGIN PGP SIGNATURE----- 
>> Version: Mailvelope v0.13.1 
>> Comment:https://www.mailvelope.com <https://www.mailvelope.com/> 
>> 
>> wsFcBAEBCAAQBQJViuu0CRDmVDuy+mK58QAASzoQAIf4Lj/jA2yl2XMS7RAW 
>> FmgK8rsf2iyzg6UQMmobFw0oWTb/0T4AscXlZIE7dhGUi6m6UHWBPB7P6YBZ 
>> UQ2eJqzcaK1Jf/flfTZajWB2z2CSYpuwbPYaQ8SqKoyauEjKgD092/LUfKL3 
>> TP5z7SdhZ8/HmzT2qFUdYuAQ+WvD3rgdJtkblFgItM+bqKmhibiZr3KHzXoU 
>> j5Ob61AsR6/s3hgWJ09uAghqB8SNsxJ0u7R5RnaiS2VWkHSHTrdiTwd/ONlL 
>> anBnKljTgkCSqS3RoPVB74qlqhDxlDnwRYvKrxurcikaI3tZ17xt4UvCc9yP 
>> RRH6M8aU1+7itOxu8DyOeZ+9Ev5/H6i5LwtrnN2pHaN9s0tWRKwzt5HQEYhE 
>> ceoyui+EtpN8zzqs9ryIGvHL3KB1bmL+0WWO4RlT8NwodsSge3Yga8KUMa07 
>> 8+dh0VGUywGEmxMg2VWPyvKf/keOiWHHi4UDJRgXJdnBjH/+4Yebva7TJ2b9 
>> Ch0r8JL00nbJCBb78dvw59XiFUJBFT5WfgItmbfjX2SI+srFaDXFKtGSjnFi 
>> MK4gE7DA70tKgP+xwpw3Eou7rDzxogqxnV54BlNzvbokbfiDAZ/ARL7CtC/1 
>> SnBxzEaliaJnBHKSgwOyP9sxz+QKMxty2ZTSmCnBUxKRK9O2hNSzFf6+1heT 
>> yQ3L 
>> =DreJ 
>> -----END PGP SIGNATURE----- 
> 
> 
> 



