Re: Degradation of write-performance after upgrading to Octopus

Hi Stephan,


We recently ran a set of 3-sample tests comparing 2 OSD/NVMe vs 1 OSD/NVMe RBD performance on Nautilus, Octopus, and master on some of our newer performance nodes with Intel P4510 NVMe drives.  Those tests use the librbd fio backend.  We saw similar randread and sequential write performance increases, but did not see a performance regression with 4KB random writes like you did.  In fact Octopus was significantly faster than Nautilus (though master regressed a little vs Octopus).  We expect it to be significantly faster too, since we improved the way the bluestore caches work and that change has consistently shown gains for us.  Here are the most recent test results:


https://docs.google.com/spreadsheets/d/1e5eTeHdZnSizoY6AUjH0knb4jTCW7KMU4RoryLX9EHQ/edit?usp=sharing
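
For anyone who wants to run something comparable, the jobs drive fio's librbd engine directly against an RBD image.  A roughly equivalent standalone invocation is sketched below; the pool/image names, queue depth, and runtime are placeholders rather than our exact job parameters.

    # 4KB random write test against an existing RBD image via librbd
    fio --name=randwrite4k --ioengine=rbd --clientname=admin --pool=rbd \
        --rbdname=fiotest --rw=randwrite --bs=4k --iodepth=32 \
        --time_based --runtime=300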


Having said that, this is the second report I've gotten of a performance regression in Octopus, so there could be something going on that we are missing.  If possible, could you run gdbpmp against one of your OSDs during the test (an example invocation is sketched after point 4 below)?  That might help us figure out why it's slow.  Otherwise, some other things to look at:


1) If this is a large dataset, see if increasing osd_memory_target helps.  onode cache misses really hurt us: they increase latency and lower IOPS.  Now that Adam's column family sharding PR has merged into master we have two complementary PRs that both help reduce OSD memory consumption for caching onodes.  For now you might see higher performance if you can afford to give the OSDs more memory.
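
For example, raising the target to 8 GiB per OSD looks like the following (the 8 GiB figure is just an illustration; size it to what the node can actually spare):

    # give each OSD an 8 GiB memory target (value in bytes)
    ceph config set osd osd_memory_target 8589934592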

2) Check whether the CPUs are being kept in a high power state.  Power-state transitions add latency, and perversely, the less CPU you use the more likely the CPU is to drop into a low power state, resulting in higher latency and worse performance, especially if it ends up thrashing between power states.
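
A quick way to check this, and to pin the clocks while testing (assuming the cpupower utility is installed):

    # show the current driver/governor and whether the cores are clocking down
    cpupower frequency-info
    # pin the scaling governor to performance for the duration of the test
    cpupower frequency-set -g performance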

3) Lately I haven't seen the kv sync thread acting as a hard bottleneck during 4KB random writes, but it still could be one if you have a low-clocked processor (especially one stuck in a power-saving state).  This is still an area to look at carefully if performance is low.
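
One easy way to eyeball this is per-thread CPU usage on an OSD while the test is running; if the bstore_kv_sync thread sits near 100% of a core, it's likely the bottleneck.  Something like:

    # per-thread CPU for one ceph-osd process on the node (picks the oldest one);
    # watch whether bstore_kv_sync is pegged near 100%
    top -H -p $(pgrep -o ceph-osd)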

4) The bluefs_buffered_io change was the other thing I suspected, but it sounds like you've already tested that.  Nevertheless it would be good to see whether IOs are backing up.  If you can get a wall-clock profile with gdbpmp you might be able to tell if io_submit is blocking.  iostat or collectl can also probably tell you if the device queue is backing up.
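
For the gdbpmp capture mentioned above, something along these lines should work (the sample count and file names are just examples, and the exact flags are worth double-checking against the script's --help); iostat will show whether the device queue is backing up:

    # wall-clock profile of one OSD while the test is running
    ./gdbpmp.py -p $(pgrep -o ceph-osd) -n 1000 -o osd.gdbpmp
    # print the collected call graph afterwards
    ./gdbpmp.py -i osd.gdbpmp
    # watch queue depth and write latency (aqu-sz/avgqu-sz and w_await columns)
    iostat -x 1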


Hope this gives you some ideas to start with!


Thanks,

Mark


On 6/4/20 10:07 AM, Stephan wrote:
Thanks for your fast reply! We just tried all four possible combinations of bluefs_preextend_wal_files and bluefs_buffered_io, but the write IOPS in test "usecase1" remain the same. By the way, bluefs_preextend_wal_files was already false in 14.2.9 (as it is in 15.2.3). Any other ideas?

David Orman wrote:
* bluestore: common/options.cc: disable bluefs_preextend_wal_files  <--
from the 15.2.3 changelog. There was a bug which led to issues on OSD
restart, and I believe this was the attempt at mitigation until a proper
bugfix could be put into place. I suspect this might be the cause of the
symptoms you're seeing.

https://tracker.ceph.com/issues/45613
https://github.com/ceph/ceph/pull/35293
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx




