Hi,

Increasing the number of kv_sync threads does not give much of a performance improvement. In the current threading model, the shard_worker submits the IO to the block device; the completions are handled by the aio_callback thread (of which there is only one), which then hands the requests to the kv_sync thread. Because kv_sync batches the requests before submitting them to rocksdb, we may simply observe more time spent in the kv_sync_thread routine, and I haven't seen much improvement from adding more threads there. However, increasing the number of aio callback threads (the polling for request completions still needs some refinement) and finishing the write completion in the same thread context did improve performance. I don't have numbers for how much, but it looks better than having multiple kv_sync threads, which adds one more queue and lock. You can refer to https://github.com/varadakari/ceph/commits/wip-parallel-aiocb (ignore the first commit; it was an attempt to do the sync transaction in the same thread context as the sharded worker to measure the latency).

I was also exploring a way to have the aio callback thread matched/reserved at the time of IO submission, so that we don't need to call io_getevents() -- a kind of async callback to a specified thread -- which would let us avoid some of the waiting logic in io_getevents() and process the request in the same thread context. You can refer to http://manpages.ubuntu.com/manpages/wily/man3/io_set_callback.3.html. I don't have working code for this yet, but FWIW it is worth experimenting with to see whether it reduces any latency.

Varada

On Thursday 25 August 2016 01:25 PM, Haomai Wang wrote:
> Looks like very little improvement. The rocksdb result meets my expectation,
> because rocksdb internally takes a lock for multiple sync writes. But the
> memdb improvement is a little confusing.
>
> On Thu, Aug 25, 2016 at 3:45 PM, Tang, Haodong <haodong.tang@xxxxxxxxx> wrote:
>> Hi Sage, Varada,
>>
>> We noticed you are working on parallel transaction submits; we also worked out a prototype that looks similar. Here is the link to the implementation: https://github.com/ceph/ceph/pull/10856
>>
>> Background:
>> From the perf counters we added, we found that a lot of time is spent in kv_queue; that is, a single transaction-submitting thread cannot keep up with the transactions coming from the OSD.
>>
>> Implementation:
>> The key idea is to use multiple threads and assign each TransContext to one of the processing threads. To parallelize transaction submission, each thread gets its own kv_lock and kv_cond.
>>
>> Performance evaluation:
>> Test ENV:
>> 4 x servers, 4 x clients, 16 x Intel S3700 as block devices, and 4 x Intel P3600 as rocksdb/WAL devices.
>> Performance:
>> We ran several quick tests to verify the performance benefit. The results show that parallel transaction submission brings about a 10% performance improvement with memdb, but little improvement with rocksdb.
>>
>> What's more, without parallel transaction submits, we also see a performance boost just from switching to MemDB, though a small one.
>>
>> Test summary:
>> QD Scaling Test - 4k Random Write (IOPS):
>>                                       QD = 1   QD = 16   QD = 32   QD = 64   QD = 128
>> With rocksdb                             682   173000    190000    203000    204000
>> With memdb                               704   180000    194000    206000    218000
>> With rocksdb + multiple kv threads         /   164243    167037    180961    201752
>> With memdb + multiple kv threads           /   176000    200000    221000    227000
>>
>> It seems a single transaction-submission thread becomes a bottleneck when using MemDB.
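
As a rough illustration of the per-thread kv_lock/kv_cond scheme described in the quoted mail, here is an untested sketch of how TransContexts could be sharded across several submission threads. The names are made up and this is not the actual code from the pull request:

#include <atomic>
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Minimal stand-in for a TransContext: just the work to run,
// e.g. a wrapper around db->submit_transaction_sync(t).
struct TransContext {
  std::function<void()> submit;
};

// Each kv thread gets its own queue, kv_lock and kv_cond, so submissions
// that land on different shards never contend on a single global lock.
class ShardedKVSync {
  struct Shard {
    std::mutex kv_lock;
    std::condition_variable kv_cond;
    std::deque<TransContext*> kv_queue;
  };

public:
  explicit ShardedKVSync(unsigned nthreads) : shards(nthreads) {
    for (unsigned i = 0; i < nthreads; ++i)
      threads.emplace_back([this, i] { kv_sync_thread(i); });
  }

  ~ShardedKVSync() {
    stopping = true;
    for (auto &s : shards) {
      std::lock_guard<std::mutex> l(s.kv_lock);
      s.kv_cond.notify_all();
    }
    for (auto &t : threads) t.join();
  }

  // The key should be derived from the sequencer/PG so that ordering
  // requirements within a sequencer are not broken by the sharding.
  void queue_transaction(TransContext *txc, size_t key) {
    Shard &s = shards[key % shards.size()];
    {
      std::lock_guard<std::mutex> l(s.kv_lock);
      s.kv_queue.push_back(txc);
    }
    s.kv_cond.notify_one();
  }

private:
  void kv_sync_thread(unsigned idx) {
    Shard &s = shards[idx];
    std::unique_lock<std::mutex> l(s.kv_lock);
    while (!stopping) {
      if (s.kv_queue.empty()) {
        s.kv_cond.wait(l);
        continue;
      }
      // Batch whatever has queued up, then submit outside the lock.
      std::deque<TransContext*> batch;
      batch.swap(s.kv_queue);
      l.unlock();
      for (TransContext *txc : batch) {
        txc->submit();
        delete txc;
      }
      l.lock();
    }
  }

  std::vector<Shard> shards;
  std::vector<std::thread> threads;
  std::atomic<bool> stopping{false};
};

The extra queue and lock per shard is exactly the cost I mentioned above, and whether it pays off presumably depends on how much rocksdb itself serializes the sync writes, which would explain why the gain only shows up with memdb.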
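
And to make the io_set_callback() idea from the top of this mail a bit more concrete, here is an untested sketch of the dispatch pattern with plain libaio (the file name and sizes are made up). Note that with stock libaio the attached callback still has to be invoked by whatever thread harvests completions with io_getevents() or io_queue_run(); getting rid of that harvesting step entirely is the part I don't have working code for.

#include <libaio.h>

#include <fcntl.h>
#include <unistd.h>

#include <cstdio>
#include <cstdlib>
#include <cstring>

// Per-iocb completion handler; res is bytes transferred or -errno.
static void write_done(io_context_t /*ctx*/, struct iocb *iocb, long res, long res2)
{
  printf("aio write completed: res=%ld res2=%ld\n", res, res2);
  free(iocb);
}

int main()
{
  io_context_t ctx = 0;
  if (io_setup(128, &ctx) < 0)
    return 1;

  int fd = open("/tmp/aio-cb-test", O_WRONLY | O_CREAT, 0644);
  char buf[4096];
  memset(buf, 'x', sizeof(buf));

  struct iocb *io = (struct iocb *)calloc(1, sizeof(*io));
  io_prep_pwrite(io, fd, buf, sizeof(buf), 0);  // 4K write at offset 0
  io_set_callback(io, write_done);              // stashes write_done in io->data
  io_submit(ctx, 1, &io);

  // Harvesting: this would live in one of the N aio callback threads.
  // The kernel copies io->data into event.data, so the callback attached
  // above can be recovered and run in this thread's context (this is
  // essentially what io_queue_run() does internally).
  struct io_event ev;
  if (io_getevents(ctx, 1, 1, &ev, NULL) == 1) {
    io_callback_t cb = (io_callback_t)ev.data;
    cb(ctx, ev.obj, ev.res, ev.res2);
  }

  close(fd);
  io_destroy(ctx);
  return 0;
}

If each shard_worker could reserve one of those harvesting threads at submission time, the completion could be finished in that thread context without the shared waiting logic, which is what I'd like to measure.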