On 2024-11-01 16:23:42, liqiang wrote:
>We create a lock-less link list for the currently
>idle reusable smc_buf_desc.
>
>When the 'used' filed mark to 0, it is added to
>the lock-less linked list.
>
>When a new connection is established, a suitable
>element is obtained directly, which eliminates the
>need for traversal and search, and does not require
>locking resource.
>
>A lock-less linked list is a linked list that uses
>atomic operations to optimize the producer-consumer model.
>
>I didn't find a suitable public benchmark, so I tested the
>time-consuming comparison of this function under multiple
>connections based on redis-benchmark (test in smc loopback-ism mode):

I think you can run a wrk/nginx test with short-lived connections.
For example:

```
# client
wrk -H "Connection: close" http://$serverIp

# server
nginx
```

>
>    1. On the current version:
>        [x.832733] smc_buf_get_slot cost:602 ns, walk 10 buf_descs
>        [x.832860] smc_buf_get_slot cost:329 ns, walk 12 buf_descs
>        [x.832999] smc_buf_get_slot cost:479 ns, walk 17 buf_descs
>        [x.833157] smc_buf_get_slot cost:679 ns, walk 13 buf_descs
>        ...
>        [x.045240] smc_buf_get_slot cost:5528 ns, walk 196 buf_descs
>        [x.045389] smc_buf_get_slot cost:4721 ns, walk 197 buf_descs
>        [x.045537] smc_buf_get_slot cost:4075 ns, walk 198 buf_descs
>        [x.046010] smc_buf_get_slot cost:6476 ns, walk 199 buf_descs
>
>    2. Apply this patch:
>        [x.180857] smc_buf_get_slot_free cost:75 ns
>        [x.181001] smc_buf_get_slot_free cost:147 ns
>        [x.181128] smc_buf_get_slot_free cost:97 ns
>        [x.181282] smc_buf_get_slot_free cost:132 ns
>        [x.181451] smc_buf_get_slot_free cost:74 ns
>
>It can be seen from the data that it takes about 5~6us to traverse 200
>times, and the time complexity of the lock-less linked algorithm is O(1).
>
>And my test process is only single-threaded. If multiple threads
>establish SMC connections in parallel, locks will also become a
>bottleneck, and lock-less linked can solve this problem well.
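
For reference, the lock-less free list described above can be sketched in
user space as a LIFO stack built on compare-and-swap. This is only an
illustration under assumptions, not the patch's code: `buf_desc`,
`free_list`, `free_list_push` and `free_list_pop` are hypothetical names,
and C11 atomics stand in for the kernel primitives.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for smc_buf_desc; only what the sketch needs. */
struct buf_desc {
    struct buf_desc *next;  /* link in the free list */
    int id;
};

/* Head of the free list; NULL means the list is empty. */
struct free_list {
    _Atomic(struct buf_desc *) head;
};

/* Producer side: push a released buffer. Safe with concurrent producers. */
static void free_list_push(struct free_list *fl, struct buf_desc *b)
{
    struct buf_desc *old = atomic_load(&fl->head);

    do {
        b->next = old;  /* a failed CAS refreshes 'old' */
    } while (!atomic_compare_exchange_weak(&fl->head, &old, b));
}

/*
 * Consumer side: take the first free buffer in O(1), no traversal.
 * A plain CAS pop like this is only ABA-safe with one consumer at a time.
 */
static struct buf_desc *free_list_pop(struct free_list *fl)
{
    struct buf_desc *old = atomic_load(&fl->head);

    while (old && !atomic_compare_exchange_weak(&fl->head, &old, old->next))
        ;
    return old;
}
```

The pop never walks the list, which matches the flat cost in the second set
of numbers. The kernel's own llist API (llist_add(), llist_del_first()) has
a similar single-consumer restriction on the delete side.
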
>
>SO I guess this patch should be beneficial in scenarios where a
>large number of short connections are parallel?

Based on your data, I'm afraid the short-lived connection test won't show
much benefit, since the time to complete an SMC-R connection should be
several orders of magnitude larger than 100 ns.

Best regards,
Dust