Hi all,

This is a follow-up to a previous discussion in the performance weekly meeting on proxy write performance.

Several months ago, I tested the performance of proxy write using fio with a uniform random distribution. The performance gain was about 3.5x compared with the non-proxy-write case. However, a uniform random workload is not an ideal workload for cache tiering. I learned from Mark/Sage that fio can generate non-uniform random (zipf/pareto) workloads using some config options. I did the evaluation, and I'd like to report the results here.

Configurations:
- 2 Ceph nodes, each with 1x Intel Xeon E3-1275 V2 @ 3.5 GHz CPU, 32 GB memory and a 10 Gb NIC. There are 8 HDDs and 6 Intel DC S3700 SSDs on each node.
- 1 Ceph client, with 2x Intel Xeon X5570 @ 2.93 GHz CPUs, 128 GB memory and a 10 Gb NIC, running 20 VMs; each VM runs fio on an RBD.
- Base pool: composed of 16 HDD OSDs, with 4 SSDs acting as the journals.
- Cache pool: composed of 8 SSD OSDs; journals are on the same SSDs.
- Data set size: 20x20 GB.
- Cache tier configuration: target_max_bytes 100 GB, cache_target_dirty_ratio 0.4, cache_target_full_ratio 0.8, write recency 1 (a sketch of the corresponding commands is at the end of this mail).
- Code version: the proxy write code is at https://github.com/ceph/ceph/pull/3354. The without-proxy-write version is the same branch, but with the proxy write commits removed.
- Fio configuration: 4k random write, random_distribution zipf:1.1, ioengine libaio (a sketch of a matching job file is at the end of this mail).

With this distribution, the top 10% of the data is hit over 80% of the time, as reported by fio-genzipf. I don't quite understand what the '-b' option means, and fio-genzipf core dumps if I use 4096 for it, so I used 1000000, which seems to be the default.

# ./fio-genzipf -t zipf -i 1.1 -b 1000000 -g 400 -o 10
Generating Zipf distribution with 1.100000 input and 400 GB size and 1000000 block_size.

        Rows        Hits %      Sum %       # Hits      Size
--------------------------------------------------------------
Top     10.00%      82.86%      82.86%      355883      331.44G
|->     20.00%       4.13%      86.99%       17725       16.51G
|->     30.00%       2.86%      89.85%       12278       11.43G
|->     40.00%       1.58%      91.43%        6781        6.32G
|->     50.00%       1.43%      92.85%        6139        5.72G
|->     60.00%       1.43%      94.28%        6139        5.72G
|->     70.00%       1.43%      95.71%        6139        5.72G
|->     80.00%       1.43%      97.14%        6139        5.72G
|->     90.00%       1.43%      98.57%        6139        5.72G
|->    100.00%       1.43%     100.00%        6134        5.71G
--------------------------------------------------------------

Performance results:

- Without proxy write:
  QD              1         2         4         8         16
  IOPS            190       200       203       201       198
  Latency (ms)    100.43    191.9     380.57    775.15    1600.66

- With proxy write:
  QD              1         2         4         8         16
  IOPS            902       896       1067      1207      1486
  Latency (ms)    22.08     44.43     74.83     133.12    217.42

As you can see from the above, proxy write improves IOPS from ~200 up to ~1400, and reduces latency by about 80%.

Any comments/feedback are welcome.
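
For reference, the cache tier settings above map roughly onto pool commands like the following. This is only a sketch: the pool names 'rbd' (base) and 'cache' are placeholders, and the write recency setting is shown here as the min_write_recency_for_promote knob (an assumption on my part about which option "write recency 1" corresponds to).

# create the tier and put it in writeback mode
# ('rbd' is the base pool, 'cache' is the cache pool -- placeholder names)
ceph osd tier add rbd cache
ceph osd tier cache-mode cache writeback
ceph osd tier set-overlay rbd cache

# sizing and flush/evict thresholds used in the tests
ceph osd pool set cache target_max_bytes 107374182400     # 100 GB
ceph osd pool set cache cache_target_dirty_ratio 0.4
ceph osd pool set cache cache_target_full_ratio 0.8
ceph osd pool set cache min_write_recency_for_promote 1   # "write recency 1"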
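
And here is a minimal sketch of a per-VM fio job file matching the parameters above. The device path, run length and iodepth value are illustrative assumptions, not the exact job file; iodepth was varied from 1 to 16 across the runs.

; sketch of the per-VM fio job; /dev/vdb, runtime and iodepth are assumptions
[global]
ioengine=libaio
direct=1
rw=randwrite
bs=4k
random_distribution=zipf:1.1
time_based=1
; assumed run length
runtime=300
; queue depth, varied from 1 to 16 across the runs
iodepth=8

[rbd-volume]
; the 20 GB RBD volume attached to the VM (assumed device path)
filename=/dev/vdb
size=20g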