RE: Bluestore performance bottleneck

The two data points you mention (4K / 16K min_alloc) yield interesting numbers. For 4K you're seeing 22.5K IOPS at 1300% CPU, or about 1.7K IOPS/core; for 16K you're seeing 25K IOPS at 1000% CPU, or about 2.5K IOPS/core. Yet we know that in the main I/O path the 16K case is doing more work (since it double-writes the data), while still yielding better CPU efficiency overall. We do know there will be less compaction in the 16K case, which will save some CPU, but I wouldn't have expected that to be substantial, since compaction processes the data sequentially in rather large blocks. In other words, the CPU cost of compaction seems to be larger than expected.
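
To make the per-core arithmetic explicit, here's a back-of-envelope
sketch in C++ using only the numbers from your mail (nothing beyond
those two data points is assumed):

    #include <cstdio>

    int main() {
      // 1300% CPU means 13 cores busy; 1000% means 10 cores.
      const double iops_4k  = 22500.0, cores_4k  = 13.0;
      const double iops_16k = 25000.0, cores_16k = 10.0;
      std::printf("4K  min_alloc: %.0f IOPS/core\n", iops_4k / cores_4k);   // ~1731
      std::printf("16K min_alloc: %.0f IOPS/core\n", iops_16k / cores_16k); // 2500
      return 0;
    }

So the 16K configuration does roughly 45% more work per core despite the
extra write amplification in the main I/O path.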

Do we know that you're actually capturing a few compaction cycles with the 16K test? If not, that might explain some of the difference.
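
One cheap way to check after the fact: RocksDB's event logger writes a
line containing "compaction_finished" to its LOG file each time a
compaction completes, so counting those lines over the test window shows
whether compactions ran at all. A minimal sketch (the LOG path below is
a placeholder for wherever the BlueStore DB directory lives on your
setup):

    #include <fstream>
    #include <iostream>
    #include <string>

    int main(int argc, char** argv) {
      // RocksDB's event logger emits a JSON line containing
      // "event": "compaction_finished" when a compaction completes.
      const char* path = argc > 1 ? argv[1] : "/path/to/osd/db/LOG";
      std::ifstream log(path);
      if (!log) {
        std::cerr << "cannot open " << path << std::endl;
        return 1;
      }
      std::string line;
      long compactions = 0;
      while (std::getline(log, line)) {
        if (line.find("compaction_finished") != std::string::npos)
          ++compactions;
      }
      std::cout << compactions << " compactions logged" << std::endl;
      return 0;
    }

Comparing that count between the 4K and 16K runs would tell us whether
the 16K test really exercised compaction.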


Allen Samuels
SanDisk | a Western Digital brand
2880 Junction Avenue, San Jose, CA 95134
T: +1 408 801 7030 | M: +1 408 780 6416
allen.samuels@xxxxxxxxxxx


> -----Original Message-----
> From: ceph-devel-owner@xxxxxxxxxxxxxxx [mailto:ceph-devel-
> owner@xxxxxxxxxxxxxxx] On Behalf Of Mark Nelson
> Sent: Friday, December 23, 2016 9:09 AM
> To: Sage Weil <sweil@xxxxxxxxxx>
> Cc: Somnath Roy <Somnath.Roy@xxxxxxxxxxx>; ceph-devel <ceph-
> devel@xxxxxxxxxxxxxxx>
> Subject: Re: Bluestore performance bottleneck
> 
> >> Try this?
> >>     https://github.com/ceph/ceph/pull/12634
> >
> > Looks like this is most likely reducing the memory usage and
> > increasing performance quite a bit with smaller shard target/max
> > values.  With
> > 25/50 I'm seeing more like 2.6GB RSS memory usage and around 13K IOPS
> > typically, with some (likely rocksdb) stalls.  I'll run through the
> > tests again.
> >
> > Mark
> >
> 
> OK, ran through tests with both 4K and 16K min_alloc/max_alloc/blob sizes
> using master+12629+12634:
> 
> https://drive.google.com/uc?export=download&id=0B2gTBZrkrnpZQzdRU3B1SGZUbDQ
> 
> Performance is up in all tests and memory consumption is down (especially in
> the smaller target/max tests).  It looks like 100/200 is probably the current
> optimal configuration on my test setup.  4K min_alloc tests hover around
> 22.5K IOPS with ~1300% CPU usage, and 16K min_alloc tests hover around
> 25K IOPS with ~1000% CPU usage.  I think it will be worth spending some time
> looking at locking in the bitmap allocator given the perf traces.  Beyond that,
> I'm seeing rocksdb show up quite a bit in the top CPU-consuming functions
> now, especially CRC32.
> 
> Mark
> 
> 