> On 7/13/21 9:49 AM, Bart Van Assche wrote:
> > On 7/11/21 5:29 AM, Avri Altman wrote:
> >>> This patch is a performance improvement because it reduces the
> >>> number of atomic operations in the hot path (test_and_clear_bit()).
> >> Both Can & Stanley reported a performance improvement of RR with
> >> "Optimize host lock..".
> >> Can those short numerical studies can be repeated with this patch?
> >
> > I will measure the performance impact of this patch for rq_affinity=2
> > as soon as I have the time. As you may know we are close to an
> > internal deadline.
>
> (replying to my own email)
>
> Hi Avri,
>
> The performance I measure with the current upstream UFS driver is 61.0 K IOPS.
> With a variant of this patch (outstanding_reqs protected with a new spinlock
> instead of the host lock), I see 62.0 K IOPS. In other words, this patch realizes a
> small performance improvement. This is what I had expected since this patch
> reduces the number of atomic operations involved in updating
> outstanding_reqs.

Thank you for taking the time and running this.
But does your platform make use of REG_UTP_TRANSFER_REQ_LIST_COMPL?
With 60k IOPS I suspect it doesn't, and the comparison is irrelevant.

Thanks,
Avri