bluestore latency modeling, throughput plots

"J. Eric Ivancich" <ivancich@xxxxxxxxxx> · Fri, 10 Feb 2017 10:05:10 -0500

So I looked at the throughput of each kv-sync batch. The bytes in
throttle are the number of bytes that were in the throttle when the
earliest TransContext of the batch was created; in other words, the
bytes in the batch itself are not added to this. (No throttling is
actually taking place, but the data that would be controlling the
throttle is recorded.) The time is the difference between when the
kv-sync batch was synced and when the earliest TransContext in the
batch was created. The bytes is the total number of bytes in the
kv-sync batch. Throughput is then bytes divided by time.
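
The per-batch calculation described above can be sketched roughly as
follows. This is only an illustration, not the instrumentation code in
bluestore: the TxcRecord and BatchSample types and all of their field
names are hypothetical, and timestamps are assumed to be plain numbers
in whatever unit the trace uses.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TxcRecord:
    """Hypothetical per-TransContext trace record (names are assumptions)."""
    created_at: float               # when the TransContext was created
    num_bytes: int                  # bytes carried by this TransContext
    throttle_bytes_at_create: int   # bytes in the throttle at creation time


@dataclass
class BatchSample:
    """One plotted point: x = throttle_bytes, y = throughput."""
    throttle_bytes: int
    elapsed: float
    batch_bytes: int

    @property
    def throughput(self) -> float:
        # Bytes in the batch per unit of elapsed time.
        return self.batch_bytes / self.elapsed


def summarize_batch(txcs: List[TxcRecord], synced_at: float) -> BatchSample:
    """Reduce one kv-sync batch to a single sample."""
    # The earliest TransContext fixes both the start of the interval and
    # the throttle reading, so the batch's own bytes are not counted there.
    earliest = min(txcs, key=lambda t: t.created_at)
    return BatchSample(
        throttle_bytes=earliest.throttle_bytes_at_create,
        elapsed=synced_at - earliest.created_at,
        batch_bytes=sum(t.num_bytes for t in txcs),
    )
```

Note that the throttle reading comes from the earliest TransContext
only, while the byte total sums over every TransContext in the batch.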

The data is generated using bluestore on a hard disk.

Here are the plots with work generated by rados bench with a varying
number of concurrent ops. Please note that the scales on the axes can vary.

throughput @ concurrency=16
http://tracker.ceph.com/attachments/download/2707/regbench-m-t16-kv-sync-throughput.pdf

throughput @ concurrency=32
http://tracker.ceph.com/attachments/download/2708/regbench-m-t32-kv-sync-throughput.pdf

throughput @ concurrency=64
http://tracker.ceph.com/attachments/download/2709/regbench-m-t64-kv-sync-throughput.pdf

throughput @ concurrency=128
http://tracker.ceph.com/attachments/download/2710/regbench-m-t128-kv-sync-throughput.pdf

throughput @ concurrency=all-overlaid
http://tracker.ceph.com/attachments/download/2711/regbench-m-tmulti-kv-sync-throughput.pdf

I don't think the data is too surprising. With fewer bytes in the
throttle we have some high throughput numbers, but throughput quickly
settles into the 400000-500000 bytes/nanosecond range at the peak. The
peak then starts to descend as the number of bytes in the throttle
increases.

Since, except for a few stragglers, the densest area of points is
below 600000 bytes/ns, here are the same graphs with the vertical axis
limited to 600000. The scales on the horizontal axes can still vary.

throughput @ concurrency=16
http://tracker.ceph.com/attachments/download/2712/regbench-m-t16-kv-sync-throughput-pin.pdf

throughput @ concurrency=32
http://tracker.ceph.com/attachments/download/2713/regbench-m-t32-kv-sync-throughput-pin.pdf

throughput @ concurrency=64
http://tracker.ceph.com/attachments/download/2714/regbench-m-t64-kv-sync-throughput-pin.pdf

throughput @ concurrency=128
http://tracker.ceph.com/attachments/download/2715/regbench-m-t128-kv-sync-throughput-pin.pdf

throughput @ concurrency=all-overlaid
http://tracker.ceph.com/attachments/download/2716/regbench-m-tmulti-kv-sync-throughput-pin.pdf

In these plots the downward trend in throughput as the bytes that would
be in the throttle grow is even more pronounced. I'm going to fit
polynomial curves to this data next.
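
As a rough idea of what such a fit involves, here is a minimal
least-squares polynomial fit in plain Python. It is a sketch under
assumptions, not the method used for the analysis: it solves the normal
equations directly, which is only reasonable for low degrees, and the
function names are made up for illustration. In practice a library
routine such as numpy.polyfit would be the usual choice, and the x
values (bytes in throttle) should probably be rescaled first, since
large raw values make the system ill-conditioned.

```python
from typing import List, Sequence


def fit_polynomial(xs: Sequence[float], ys: Sequence[float],
                   degree: int) -> List[float]:
    """Least-squares fit; returns c where y ~ c[0] + c[1]*x + c[2]*x^2 ..."""
    n = degree + 1
    # Normal equations A c = b with A[i][j] = sum(x^(i+j)), b[i] = sum(y*x^i).
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]

    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]

    # Back substitution.
    coeffs = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs


def evaluate(coeffs: Sequence[float], x: float) -> float:
    """Evaluate the fitted polynomial at x using Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result
```

Here xs would be the bytes in throttle for each batch and ys the
corresponding throughput; the degree to use is an open choice.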

Eric