On 7/16/2021 5:39 AM, Avri Altman wrote:
On 7/13/21 9:49 AM, Bart Van Assche wrote:
On 7/11/21 5:29 AM, Avri Altman wrote:
This patch is a performance improvement because it reduces the
number of atomic operations in the hot path (test_and_clear_bit()).
Both Can & Stanley reported a performance improvement of RR with
"Optimize host lock..".
Can those short numerical studies be repeated with this patch?
I will measure the performance impact of this patch for rq_affinity=2
as soon as I have the time. As you may know we are close to an
internal deadline.
(replying to my own email)
Hi Avri,
The performance I measure with the current upstream UFS driver is 61.0 K IOPS.
With a variant of this patch (outstanding_reqs protected with a new spinlock
instead of the host lock), I see 62.0 K IOPS. In other words, this patch realizes a
small performance improvement. This is what I had expected since this patch
reduces the number of atomic operations involved in updating
outstanding_reqs.
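
As a rough illustration of the trade-off described above, the following userspace C sketch contrasts clearing completed slots in a bitmap with one atomic read-modify-write per slot (modelling test_and_clear_bit()) against clearing them all at once under a dedicated lock. This is not the UFS driver's code: struct model_hba, model_init(), complete_per_bit() and complete_under_lock() are hypothetical names, and a C11 atomic_flag spin loop merely stands in for a kernel spinlock.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace model; NOT the actual UFS driver structures. */
struct model_hba {
    _Atomic unsigned long outstanding_atomic; /* variant A: updated with atomic RMW ops */
    unsigned long outstanding_locked;         /* variant B: plain word guarded by 'lock' */
    atomic_flag lock;                         /* stands in for a dedicated spinlock */
};

static void model_init(struct model_hba *hba, unsigned long outstanding)
{
    assert(hba != NULL);
    atomic_init(&hba->outstanding_atomic, outstanding);
    hba->outstanding_locked = outstanding;
    atomic_flag_clear(&hba->lock);
}

/*
 * Variant A: one atomic read-modify-write per completed slot, similar in
 * spirit to calling test_and_clear_bit() for every tag.
 * Returns the number of atomic RMW operations issued.
 */
static unsigned complete_per_bit(struct model_hba *hba, unsigned long completed)
{
    unsigned atomic_ops = 0;

    for (unsigned tag = 0; tag < sizeof(unsigned long) * CHAR_BIT; tag++) {
        unsigned long mask = 1UL << tag;

        if (!(completed & mask))
            continue;
        atomic_fetch_and(&hba->outstanding_atomic, ~mask);
        atomic_ops++;
    }
    return atomic_ops;
}

/*
 * Variant B: take the lock once (one atomic acquire, one release), then
 * clear all completed slots with ordinary, non-atomic operations.
 * Returns the set of slots that were actually outstanding.
 */
static unsigned long complete_under_lock(struct model_hba *hba,
                                         unsigned long completed)
{
    unsigned long done;

    while (atomic_flag_test_and_set_explicit(&hba->lock, memory_order_acquire))
        ; /* spin until the lock is free */

    done = completed & hba->outstanding_locked;
    hba->outstanding_locked &= ~done;

    atomic_flag_clear_explicit(&hba->lock, memory_order_release);
    return done;
}
```

In this model, completing N requests costs N atomic operations in variant A, but a single lock acquisition plus plain bit operations in variant B, regardless of N. Whether that translates into a measurable IOPS gain on real hardware depends on the platform and workload, as the numbers in this thread suggest.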
Thank you for taking the time and running this.
But does your platform make use of REG_UTP_TRANSFER_REQ_LIST_COMPL?
With 60k IOPS I suspect it doesn't, and the comparison is irrelevant.
Thanks,
Avri
I agree. We saw a substantial improvement with RR and RW too with the
'Optimize host lock' change.
Hi Bart,
Is it possible to check the performance data with these changes on
Android, say using Androbench?
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
Linux Foundation Collaborative Project