On 11/4/2021 10:10 AM, Bart Van Assche wrote:
On 11/4/21 7:23 AM, Asutosh Das (asd) wrote:
In the current clock scaling code, the expectation is to scale up as
soon as possible.
For e.g. say, the current gear is G1 and there're pending requests in
the queue but the DBR is empty and there's a decision to scale up.
During scale-up, if the queues are frozen, wouldn't those requests be
issued to the driver and executed in G1 instead of G4?
I think this would lead to higher run to run variance in performance.
Hi Asutosh,
My understanding of the current clock scaling implementation is as
follows (please correct me if I got anything wrong):
* ufshcd_clock_scaling_prepare() is called before clock scaling happens
and ufshcd_clock_scaling_unprepare() is called after clock scaling has
finished.
* ufshcd_clock_scaling_prepare() calls ufshcd_scsi_block_requests() and
ufshcd_wait_for_doorbell_clr().
* ufshcd_wait_for_doorbell_clr() waits until both the UTP Transfer
Request List Doorbell Register and UTP Task Management Request List
DoorBell Register are zero. Hence, it waits until all pending SCSI
commands, task management commands and device commands have finished.
As far as I can see from a conceptual viewpoint there is no difference
between calling ufshcd_wait_for_doorbell_clr() or freezing the request
queues. There is an implementation difference however:
blk_mq_freeze_queue() waits for an RCU grace period. This can introduce
an additional delay of several milliseconds compared to
ufshcd_wait_for_doorbell_clr(). If this is a concern I can look into
expediting the RCU grace period during clock scaling.
Thanks,
Bart.
Hi Bart,
The current scaling code invokes scsi_block_request() which would block
incoming requests right away in prepare. So any un-issued requests
wouldn't be issued.
I'm not sure if this exact behavior would manifest if
blk_freeze_queue_start() is used and scsi_block_request() is removed.
I see that it invokes - blk_mq_run_hw_queues().
void blk_freeze_queue_start(struct request_queue *q)
{
mutex_lock(&q->mq_freeze_lock);
if (++q->mq_freeze_depth == 1) {
percpu_ref_kill(&q->q_usage_counter);
mutex_unlock(&q->mq_freeze_lock);
if (queue_is_mq(q))
blk_mq_run_hw_queues(q, false);
} else {
mutex_unlock(&q->mq_freeze_lock);
}
}
Would blk_mq_run_hw_queues() issue any pending requests in the queue? If
so, then these requests would be completed in the unchanged power-mode
which is not what we want.
-asd
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
Linux Foundation Collaborative Project