On 7/16/23 19:57, Hao Xu wrote:
On 7/13/23 23:14, Christian Brauner wrote:
Could someone with perf experience try and remove that f_count == 1
optimization from __fdget_pos() completely and make it always acquire
the mutex? I wonder what the performance impact of that is.
Hi Christian,
For your reference, I did a simple test: I wrote a C program that opens a
directory containing 1000 empty files, then calls synchronous getdents64 on
it repeatedly until all the entries have been read. I ran this program 10
times each for the "with f_count==1 optimization" and "always do the lock"
versions.
I got the data below:
with f_count==1:
time cost: 0.000379
time cost: 0.000116
time cost: 0.000090
time cost: 0.000101
time cost: 0.000095
time cost: 0.000092
time cost: 0.000092
time cost: 0.000095
time cost: 0.000092
time cost: 0.000121
time cost: 0.000092
time cost avg: 0.0000986
always do the lock:
time cost: 0.000095
time cost: 0.000099
time cost: 0.000123
time cost: 0.000124
time cost: 0.000092
time cost: 0.000099
time cost: 0.000092
time cost: 0.000092
time cost: 0.000093
time cost: 0.000094
time cost avg: 0.0001003
So about a 1.724% increase.
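For anyone who wants to re-check the arithmetic, here is a minimal sketch that recomputes both averages and the relative increase from the samples listed above. The helper names mean() and pct_increase() are made up for illustration. The posted f_count==1 average matches the ten samples that follow the 0.000379 entry, so the sketch assumes that entry is excluded.

```c
#include <assert.h>
#include <stddef.h>

/* Arithmetic mean of n samples (hypothetical helper). */
static double mean(const double *v, size_t n)
{
    assert(n > 0);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += v[i];
    return sum / (double)n;
}

/* Relative change from base to changed, in percent (hypothetical helper). */
static double pct_increase(double base, double changed)
{
    return (changed - base) / base * 100.0;
}

/* Samples copied from the mail; 0.000379 is assumed to be excluded. */
static const double with_fast_path[] = {
    0.000116, 0.000090, 0.000101, 0.000095, 0.000092,
    0.000092, 0.000095, 0.000092, 0.000121, 0.000092,
};

static const double always_lock[] = {
    0.000095, 0.000099, 0.000123, 0.000124, 0.000092,
    0.000099, 0.000092, 0.000092, 0.000093, 0.000094,
};
```

With these inputs, mean() gives about 0.0000986 and 0.0001003, and pct_increase() of the two means gives about 1.724.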
[1] The first run is not shown here since that run does real I/O and
differs a lot from the others.
[2] The time cost is measured with gettimeofday().
[3] The test was run in a VM with 2 CPUs and 1 GB of memory.
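A test like the one described could look roughly like the sketch below. It is not the program that was actually run. The function names, the buffer size, and the choice to time only the getdents64 loop (not open/close) are all assumptions. It is Linux-only and calls getdents64 through syscall(), and it keeps a single reference to the file so that the f_count == 1 path is the one exercised.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Read every entry of the directory at `path` with raw getdents64 calls.
 * Returns the elapsed time in seconds, or -1.0 on error.
 * Assumption: only the getdents64 loop is timed.
 */
static double time_getdents64(const char *path)
{
    char buf[32768];               /* buffer size is an arbitrary choice */
    struct timeval start, end;
    long n;

    int fd = open(path, O_RDONLY | O_DIRECTORY);
    if (fd < 0)
        return -1.0;

    gettimeofday(&start, NULL);
    /* getdents64 returns 0 once all entries have been consumed. */
    while ((n = syscall(SYS_getdents64, fd, buf, sizeof(buf))) > 0)
        ;
    gettimeofday(&end, NULL);

    close(fd);
    if (n < 0)
        return -1.0;

    return (double)(end.tv_sec - start.tv_sec) +
           (double)(end.tv_usec - start.tv_usec) / 1e6;
}

/* Run `rounds` timed passes, print each one, and return the average. */
static double bench_getdents64(const char *path, int rounds)
{
    assert(rounds > 0);
    double total = 0.0;

    for (int i = 0; i < rounds; i++) {
        double t = time_getdents64(path);
        if (t < 0.0)
            return -1.0;
        printf("time cost: %f\n", t);
        total += t;
    }
    return total / rounds;
}
```

Calling bench_getdents64() from main() with the test directory and 10 rounds would print lines in the same "time cost:" format as above; the numbers themselves depend on the machine and kernel.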
Regards,
Hao
I ran another similar test with more rounds (100) and saw about a 1.4%
increase. How about:
if CONFIG_IO_URING: remove the f_count==1 logic
else: keep the old logic.
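To make the shape of that proposal concrete, here is a small user-space model of it. This is not kernel code and not a patch: struct file_model, fdget_pos_model() and fdput_pos_model() are hypothetical stand-ins, a pthread mutex stands in for f_pos_lock, and CONFIG_IO_URING is treated as a plain compile-time macro.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct file; not the kernel structure. */
struct file_model {
    long f_count;                  /* number of references to the file */
    pthread_mutex_t f_pos_lock;    /* protects the file position */
};

/*
 * Model of the proposed behaviour. Returns true if the position lock was
 * taken, in which case the caller must release it with fdput_pos_model().
 */
static bool fdget_pos_model(struct file_model *f)
{
    assert(f != NULL);
#ifdef CONFIG_IO_URING
    /* Proposed: no f_count == 1 shortcut, always take the lock. */
    pthread_mutex_lock(&f->f_pos_lock);
    return true;
#else
    /* Old logic: skip the lock when there is only one reference. */
    if (f->f_count > 1) {
        pthread_mutex_lock(&f->f_pos_lock);
        return true;
    }
    return false;
#endif
}

/* Release the lock only if fdget_pos_model() actually took it. */
static void fdput_pos_model(struct file_model *f, bool locked)
{
    assert(f != NULL);
    if (locked)
        pthread_mutex_unlock(&f->f_pos_lock);
}
```

Building the model with -DCONFIG_IO_URING selects the always-lock branch; without it, a file with a single reference skips the mutex as before.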