Re: [PATCH RFC] scsi: ufs-core: Do not use clk_scaling_lock in ufshcd_queuecommand()

On 11/4/2021 10:10 AM, Bart Van Assche wrote:
> On 11/4/21 7:23 AM, Asutosh Das (asd) wrote:
>> In the current clock scaling code, the expectation is to scale up as soon as possible.
>>
>> For example, say the current gear is G1, there are pending requests in the queue, the DBR is empty, and a decision is made to scale up. During scale-up, if the queues are frozen, wouldn't those requests be issued to the driver and executed in G1 instead of G4?
>> I think this would lead to higher run-to-run variance in performance.

> Hi Asutosh,
>
> My understanding of the current clock scaling implementation is as follows (please correct me if I got anything wrong):
> * ufshcd_clock_scaling_prepare() is called before clock scaling happens
>    and ufshcd_clock_scaling_unprepare() is called after clock scaling has
>    finished.
> * ufshcd_clock_scaling_prepare() calls ufshcd_scsi_block_requests() and
>    ufshcd_wait_for_doorbell_clr().
> * ufshcd_wait_for_doorbell_clr() waits until both the UTP Transfer
>    Request List Doorbell Register and UTP Task Management Request List
>    DoorBell Register are zero. Hence, it waits until all pending SCSI
>    commands, task management commands and device commands have finished.

> As far as I can see, from a conceptual viewpoint, there is no difference
> between calling ufshcd_wait_for_doorbell_clr() and freezing the request
> queues. There is an implementation difference, however:
> blk_mq_freeze_queue() waits for an RCU grace period. This can introduce
> an additional delay of several milliseconds compared to
> ufshcd_wait_for_doorbell_clr(). If this is a concern I can look into
> expediting the RCU grace period during clock scaling.
>
> Thanks,
>
> Bart.

Hi Bart,
The current scaling code invokes scsi_block_requests() in prepare, which blocks incoming requests right away, so any un-issued requests wouldn't be issued.

I'm not sure that exact behavior would be preserved if blk_freeze_queue_start() were used and scsi_block_requests() removed.
I see that it invokes blk_mq_run_hw_queues():
void blk_freeze_queue_start(struct request_queue *q)
{
        mutex_lock(&q->mq_freeze_lock);
        if (++q->mq_freeze_depth == 1) {
                percpu_ref_kill(&q->q_usage_counter);
                mutex_unlock(&q->mq_freeze_lock);
                if (queue_is_mq(q))
                        blk_mq_run_hw_queues(q, false);
        } else {
                mutex_unlock(&q->mq_freeze_lock);
        }
}

Would blk_mq_run_hw_queues() issue any pending requests in the queue? If so, those requests would complete in the unchanged power mode, which is not what we want.

-asd

--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
Linux Foundation Collaborative Project


