Re: [PATCH 3/3] io_uring: add support for getdents

Hao Xu <hao.xu@xxxxxxxxx> · Sun, 16 Jul 2023 19:57:26 +0800

On 7/13/23 23:14, Christian Brauner wrote:

> Could someone with perf experience try and remove that f_count == 1
> optimization from __fdget_pos() completely and make it always acquire
> the mutex? I wonder what the performance impact of that is.

Hi Christian,
For your reference, I did a simple test: I wrote a C program that opens a
directory containing 1000 empty files, then calls sync getdents64 on it
repeatedly until all the entries have been read. I ran this program 10
times each for the "with f_count==1 optimization" and "always do the
lock" versions.
I got the data below:
with f_count==1:
time cost: 0.000379
time cost: 0.000116
time cost: 0.000090
time cost: 0.000101
time cost: 0.000095
time cost: 0.000092
time cost: 0.000092
time cost: 0.000095
time cost: 0.000092
time cost: 0.000121
time cost: 0.000092
time cost avg: 0.0000986

always do the lock:
time cost: 0.000095
time cost: 0.000099
time cost: 0.000123
time cost: 0.000124
time cost: 0.000092
time cost: 0.000099
time cost: 0.000092
time cost: 0.000092
time cost: 0.000093
time cost: 0.000094
time cost avg: 0.0001003

So that is about a 1.724% increase in average time cost.
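
The arithmetic behind the averages and the percentage can be rechecked
with a small helper like the one below. This is only a sketch: the
function and array names are made up for illustration, and the arrays
hold the ten samples per case that reproduce the averages reported
above (so the 0.000379 sample is left out).

```c
#include <stddef.h>

/* Ten samples per case, in seconds, copied from the results in this mail. */
static const double with_opt[10] = {
    0.000116, 0.000090, 0.000101, 0.000095, 0.000092,
    0.000092, 0.000095, 0.000092, 0.000121, 0.000092,
};
static const double always_lock[10] = {
    0.000095, 0.000099, 0.000123, 0.000124, 0.000092,
    0.000099, 0.000092, 0.000092, 0.000093, 0.000094,
};

/* Arithmetic mean of n samples. */
double mean(const double *v, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += v[i];
    return sum / (double)n;
}

/* Relative change from base to other, in percent. */
double pct_increase(double base, double other)
{
    return (other - base) / base * 100.0;
}
```

Calling pct_increase(mean(with_opt, 10), mean(always_lock, 10)) gives
the roughly 1.724% figure quoted above.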

[1] The first run is not shown here, since that run does real IO
    and differs a lot.
[2] The time cost is measured with gettimeofday().
[3] The test was run in a VM with 2 CPUs and 1GB of memory.
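
In case someone wants to repeat the measurement, the test is roughly of
the following shape. This is a minimal sketch and not the exact program
used: the function name, the 4096-byte buffer size and the error
handling are assumptions, and the directory path is supplied by the
caller.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <sys/time.h>
#include <unistd.h>

/* Record layout returned by getdents64(2). */
struct linux_dirent64 {
    uint64_t       d_ino;
    int64_t        d_off;
    unsigned short d_reclen;
    unsigned char  d_type;
    char           d_name[];
};

/*
 * Read all entries of the directory at path with sync getdents64 calls.
 * Returns the number of entries seen (including "." and ".."), or -1 on
 * error. On success the elapsed time in seconds is stored in *elapsed.
 */
long time_getdents(const char *path, double *elapsed)
{
    uint64_t storage[512];          /* 4096 bytes, 8-byte aligned */
    char *buf = (char *)storage;
    struct timeval start, end;
    long count = 0;
    long n;

    int fd = open(path, O_RDONLY | O_DIRECTORY);
    if (fd < 0)
        return -1;

    gettimeofday(&start, NULL);
    /* Keep calling until the kernel reports end of directory (0). */
    while ((n = syscall(SYS_getdents64, fd, buf, sizeof(storage))) > 0) {
        for (long off = 0; off < n; ) {
            struct linux_dirent64 *d = (struct linux_dirent64 *)(buf + off);
            off += d->d_reclen;     /* step to the next record */
            count++;
        }
    }
    gettimeofday(&end, NULL);
    close(fd);

    if (n < 0)
        return -1;
    *elapsed = (double)(end.tv_sec - start.tv_sec) +
               (double)(end.tv_usec - start.tv_usec) / 1e6;
    return count;
}
```

A driver would call time_getdents() on the test directory once to warm
the cache, then 10 more times, printing *elapsed after each call.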

Regards,
Hao