Re: [PATCH v3 01/23] ata: libata-core: Fix ata_port_request_pm() locking

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2023/09/20 0:21, Niklas Cassel wrote:
> On Tue, Sep 19, 2023 at 09:31:04AM -0700, Damien Le Moal wrote:
>> On 2023/09/19 6:21, Niklas Cassel wrote:
>>> On Fri, Sep 15, 2023 at 05:14:45PM +0900, Damien Le Moal wrote:
>>>> The function ata_port_request_pm() checks the port flag
>>>> ATA_PFLAG_PM_PENDING and calls ata_port_wait_eh() if this flag is set to
>>>> ensure that power management operations for a port are not secheduled
>>>
>>> s/secheduled/scheduled/
>>>
>>>> simultaneously. However, this flag check is done without holding the
>>>> port lock.
>>>>
>>>> Fix this by taking the port lock on entry to the function and checking
>>>> the flag under this lock. The lock is released and re-taken if
>>>> ata_port_wait_eh() needs to be called.
>>>>
>>>> Fixes: 5ef41082912b ("ata: add ata port system PM callbacks")
>>>> Cc: stable@xxxxxxxxxxxxxxx
>>>> Signed-off-by: Damien Le Moal <dlemoal@xxxxxxxxxx>
>>>> Reviewed-by: Hannes Reinecke <hare@xxxxxxx>
>>>> Tested-by: Chia-Lin Kao (AceLan) <acelan.kao@xxxxxxxxxxxxx>
>>>> ---
>>>>  drivers/ata/libata-core.c | 17 +++++++++--------
>>>>  1 file changed, 9 insertions(+), 8 deletions(-)
>>>>
>>>> diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
>>>> index 74314311295f..c4898483d716 100644
>>>> --- a/drivers/ata/libata-core.c
>>>> +++ b/drivers/ata/libata-core.c
>>>> @@ -5040,17 +5040,20 @@ static void ata_port_request_pm(struct ata_port *ap, pm_message_t mesg,
>>>>  	struct ata_link *link;
>>>>  	unsigned long flags;
>>>>  
>>>> -	/* Previous resume operation might still be in
>>>> -	 * progress.  Wait for PM_PENDING to clear.
>>>> +	spin_lock_irqsave(ap->lock, flags);
>>>> +
>>>> +	/*
>>>> +	 * A previous PM operation might still be in progress. Wait for
>>>> +	 * ATA_PFLAG_PM_PENDING to clear.
>>>>  	 */
>>>>  	if (ap->pflags & ATA_PFLAG_PM_PENDING) {
>>>> +		spin_unlock_irqrestore(ap->lock, flags);
>>>>  		ata_port_wait_eh(ap);
>>>> +		spin_lock_irqsave(ap->lock, flags);
>>>>  		WARN_ON(ap->pflags & ATA_PFLAG_PM_PENDING);
>>>>  	}
>>>>  
>>>> -	/* request PM ops to EH */
>>>> -	spin_lock_irqsave(ap->lock, flags);
>>>> -
>>>> +	/* Request PM operation to EH */
>>>>  	ap->pm_mesg = mesg;
>>>>  	ap->pflags |= ATA_PFLAG_PM_PENDING;
>>>>  	ata_for_each_link(link, ap, HOST_FIRST) {
>>>> @@ -5062,10 +5065,8 @@ static void ata_port_request_pm(struct ata_port *ap, pm_message_t mesg,
>>>>  
>>>>  	spin_unlock_irqrestore(ap->lock, flags);
>>>>  
>>>> -	if (!async) {
>>>> +	if (!async)
>>>>  		ata_port_wait_eh(ap);
>>>> -		WARN_ON(ap->pflags & ATA_PFLAG_PM_PENDING);
>>>
>>> Perhaps you should mention why this WARN_ON() is removed in the commit
>>> message.
>>>
>>> I don't understand why you keep the WARN_ON() higher up in this function,
>>> but remove this WARN_ON(). They seem to have equal worth to me.
>>> Perhaps just take and release the lock around the WARN_ON() here as well?
>>
>> Yes, they have the same worth == not super useful... I kept the one higher up as
>> it is OK because we hold the lock, but removed the second one as checking pflags
>> without the lock is just plain wrong. Thinking of it, the first WRN_ON() is also
>> wrong I think because EH could be rescheduled right after wait_eh and before we
>> take the lock. In that case, the warn on would be a flase positive. I will
>> remove it as well.
> 
> We are checking if ATA_PFLAG_PM_PENDING is set, if it is, we do
> ata_port_wait_eh(), which will wait until both ATA_PFLAG_EH_PENDING and
> ATA_PFLAG_EH_IN_PROGRESS is cleared.
> 
> Note that ATA_PFLAG_PM_PENDING and ATA_PFLAG_EH_PENDING have very similar
> names... I really think we should rename ATA_PFLAG_PM_PENDING to something
> like ATA_PFLAG_EH_PM_PENDING (the PM is performed by EH), in order to make
> it harder to mix them up.

Let's do the renaming in a followup patch, not in this fix patch.

> 
> Since the only place that sets ATA_PFLAG_PM_PENDING is ata_port_request_pm()
> and since PM core holds the device lock (device_lock()), I don't think that
> ATA_PFLAG_PM_PENDING can get set while inside ata_port_request_pm().
> And since we wait for EH to complete, and since both
> ata_eh_handle_port_suspend() and ata_eh_handle_port_resume() are called
> unconditionally by EH, they will only return if ATA_PFLAG_PM_PENDING is not
> set, and since these functions both clear ATA_PFLAG_PM_PENDING unconditionally,
> I would agree with you, that these two WARN_ON() seem superfluous.
> 
> (Yes, EH could trigger again if we got an error IRQ before ata_port_request_pm()
> takes the lock the second time, but that can only set ATA_PFLAG_EH_PENDING,
> it can not set ATA_PFLAG_PM_PENDING.)
> 
> 
> Kind regards,
> Niklas

-- 
Damien Le Moal
Western Digital Research




[Index of Archives]     [Linux Filesystems]     [Linux SCSI]     [Linux RAID]     [Git]     [Kernel Newbies]     [Linux Newbie]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Samba]     [Device Mapper]

  Powered by Linux