Re: SCSI layer RPM deadlock debug suggestion

On 7/5/21 3:17 PM, Alan Stern wrote:
On Mon, Jul 05, 2021 at 01:00:39PM +0100, John Garry wrote:
On 05/07/2021 00:45, Bart Van Assche wrote:

Hi Alan and Bart,

Thanks for the suggestions.

Removing commit e27829dc92e5 ("scsi: serialize ->rescan against ->remove")
solves this issue for me, but that is there for a reason.

Any suggestion on how to fix this deadlock?
This is indeed a tricky question.  It seems like we should allow a
runtime resume to succeed if the only reason it failed was that the
device has been removed.

More generally, perhaps we should always consider that a runtime
resume succeeds.  Any remaining problems will be dealt with by the
device's driver and subsystem once the device is marked as
runtime-active again.

Suppose you try changing blk_post_runtime_resume() so that it always
calls blk_set_runtime_active() regardless of the value of err.  Does
that fix the problem?

And more importantly, will it cause any other problems...?
That would cause trouble for the UFS driver and other drivers for which
runtime resume can fail due to e.g. the link between host and device
being in a bad state.

I don't understand how that could work.  If a device fails to resume
from runtime suspend, no matter whether the reason is temporary or
permanent, how can the system use it again?

And if the system can't use it again, what harm is there in pretending
that the runtime resume succeeded?

'xactly.
Especially as we _do_ have error recovery on SCSI, so we should be
treating a failure to resume just like any other SCSI error; in the
end, we need to equip SCSI EH to deal with these kinds of states
anyway. And we already do, as we're sending 'START STOP UNIT' to spin
up drives which are found to be spun down.

So I'm all for always returning 'success' from the 'resume' callback
and letting SCSI EH deal with any eventual fallout.
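
To make the suggested change to blk_post_runtime_resume() concrete,
here is a rough userspace model of the two behaviours. It is only a
sketch and not kernel code: every name in it (model_queue,
model_post_resume_current(), model_post_resume_proposed(),
model_can_dispatch()) is made up for illustration, and the "current"
variant assumes that a failed resume leaves the queue marked as
suspended.

```c
/* Userspace model only; not kernel code. All names are hypothetical. */
#include <stdbool.h>

enum model_rpm_status { MODEL_RPM_ACTIVE, MODEL_RPM_SUSPENDED };

struct model_queue {
	enum model_rpm_status rpm_status;
};

/* Assumed current behaviour: a failed resume leaves the queue suspended. */
static void model_post_resume_current(struct model_queue *q, int err)
{
	if (!err)
		q->rpm_status = MODEL_RPM_ACTIVE;
	else
		q->rpm_status = MODEL_RPM_SUSPENDED;
}

/* Suggested behaviour: mark the queue active regardless of err. */
static void model_post_resume_proposed(struct model_queue *q, int err)
{
	(void)err;	/* the error is left for the driver / SCSI EH to handle */
	q->rpm_status = MODEL_RPM_ACTIVE;
}

/* In this model, requests can only be dispatched on an active queue. */
static bool model_can_dispatch(const struct model_queue *q)
{
	return q->rpm_status == MODEL_RPM_ACTIVE;
}
```

In this model, a failing resume under the first variant leaves the
queue unable to dispatch anything, so waiters stay blocked; under the
second variant requests flow again and any failure shows up as an
ordinary I/O error for SCSI EH to handle.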

Cheers,

Hannes
--
Dr. Hannes Reinecke                Kernel Storage Architect
hare@xxxxxxx                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer


