Hi all,

This is a follow-up to a previous discussion in the performance weekly meeting on proxy write performance.

Several months ago, I tested the performance of proxy write using fio with a uniform random distribution. The performance gain was about 3.5x compared with the non-proxy-write case. However, a uniform random workload is not an ideal workload for cache tiering. I learned from Mark/Sage that fio can generate non-uniform random (zipf/pareto) workloads using some config options. I did the evaluation, and I'd like to report the results here.

Configurations:
- 2 Ceph nodes, each with 1x Intel Xeon E3-1275 V2 @ 3.5 GHz CPU, 32 GB memory and a 10 Gb NIC. There are 8 HDDs and 6 Intel DC S3700 SSDs on each node.
- 1 Ceph client, with 2x Intel Xeon X5570 @ 2.93 GHz CPUs, 128 GB memory and a 10 Gb NIC, running 20 VMs; each VM runs fio on an RBD.
- Base pool: composed of 16 HDD OSDs, with 4 SSDs acting as the journals.
- Cache pool: composed of 8 SSD OSDs; journals are on the same SSDs.
- Data set size: 20x20 GB.
- Cache tier configuration: target_max_bytes 100 GB, cache_target_dirty_ratio 0.4, cache_target_full_ratio 0.8, write recency 1 (a sketch of the corresponding commands is at the end of this mail).
- Code version: the proxy write code is at https://github.com/ceph/ceph/pull/3354. The without-proxy-write version is the same branch, but with the proxy write commits removed.
- Fio configuration: 4k random write, random_distribution zipf:1.1, ioengine libaio (a sketch of a matching job file is at the end of this mail).

With this distribution, the top 10% of the data is hit over 80% of the time, as reported by fio-genzipf. I don't quite understand what the '-b' option means, and fio-genzipf core dumps if I use 4096 for it, so I used 1000000, which seems to be the default.

# ./fio-genzipf -t zipf -i 1.1 -b 1000000 -g 400 -o 10
Generating Zipf distribution with 1.100000 input and 400 GB size and 1000000 block_size.

        Rows        Hits %      Sum %       # Hits      Size
--------------------------------------------------------------
Top     10.00%      82.86%      82.86%      355883      331.44G
|->     20.00%       4.13%      86.99%       17725       16.51G
|->     30.00%       2.86%      89.85%       12278       11.43G
|->     40.00%       1.58%      91.43%        6781        6.32G
|->     50.00%       1.43%      92.85%        6139        5.72G
|->     60.00%       1.43%      94.28%        6139        5.72G
|->     70.00%       1.43%      95.71%        6139        5.72G
|->     80.00%       1.43%      97.14%        6139        5.72G
|->     90.00%       1.43%      98.57%        6139        5.72G
|->    100.00%       1.43%     100.00%        6134        5.71G
--------------------------------------------------------------

Performance results:

- Without proxy write:
  QD              1         2         4         8         16
  IOPS            190       200       203       201       198
  Latency (ms)    100.43    191.9     380.57    775.15    1600.66

- With proxy write:
  QD              1         2         4         8         16
  IOPS            902       896       1067      1207      1486
  Latency (ms)    22.08     44.43     74.83     133.12    217.42

As you can see from the above, proxy write improves IOPS from ~200 up to ~1400, and reduces latency by about 80%.

Any comments/feedback are welcome.
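
For reference, the cache tier settings above map roughly onto pool commands like the following. This is only a sketch: the pool names 'rbd' (base) and 'cache' are placeholders, and the write recency setting is shown here as the min_write_recency_for_promote knob (an assumption on my part about which option "write recency 1" corresponds to).

# create the tier and put it in writeback mode
# ('rbd' is the base pool, 'cache' is the cache pool -- placeholder names)
ceph osd tier add rbd cache
ceph osd tier cache-mode cache writeback
ceph osd tier set-overlay rbd cache

# sizing and flush/evict thresholds used in the tests
ceph osd pool set cache target_max_bytes 107374182400     # 100 GB
ceph osd pool set cache cache_target_dirty_ratio 0.4
ceph osd pool set cache cache_target_full_ratio 0.8
ceph osd pool set cache min_write_recency_for_promote 1   # "write recency 1"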
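
And here is a minimal sketch of a per-VM fio job file matching the parameters above. The device path, run length and iodepth value are illustrative assumptions, not the exact job file; iodepth was varied from 1 to 16 across the runs.

; sketch of the per-VM fio job; /dev/vdb, runtime and iodepth are assumptions
[global]
ioengine=libaio
direct=1
rw=randwrite
bs=4k
random_distribution=zipf:1.1
time_based=1
; assumed run length
runtime=300
; queue depth, varied from 1 to 16 across the runs
iodepth=8

[rbd-volume]
; the 20 GB RBD volume attached to the VM (assumed device path)
filename=/dev/vdb
size=20g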