On Tue, Mar 09, 2021 at 08:04:53PM -0800, Asutosh Das (asd) wrote: > On 3/9/2021 7:14 PM, Alan Stern wrote: > > On Tue, Mar 09, 2021 at 07:04:34PM -0800, Asutosh Das (asd) wrote: > > > Hello > > > I & Can (thanks CanG) debugged this further: > > > > > > Looks like this issue can occur if the sd probe is asynchronous. > > > > > > Essentially, the sd_probe() is done asynchronously and driver_probe_device() > > > invokes pm_runtime_get_suppliers() before invoking sd_probe(). > > > > > > But scsi_probe_and_add_lun() runs in a separate context. > > > So the scsi_autopm_put_device() invoked from scsi_scan_host() context > > > reduces the link->rpm_active to 1. And sd_probe() invokes > > > scsi_autopm_put_device() and starts a timer. And then driver_probe_device() > > > invoked from __device_attach_async_helper context reduces the > > > link->rpm_active to 1 thus enabling the supplier to suspend before the > > > consumer suspends. > > > > > I don't see a way around this. Please let me know if you > > > (@Alan/@Bart/@Adrian) have any thoughts on this. > > > > How about changing the SCSI core so that it does a runtime_get before > > starting an async probe, and the async probe routine does a > > runtime_put when it is finished? In other words, don't allow a device > > to go into runtime suspend while it is waiting to be probed. > > > > I don't think that would be too intrusive. > > > > Alan Stern > > > > Hi Alan > Thanks for the suggestion. > > Am trying to understand: > > Do you mean something like this: > > int scsi_sysfs_add_sdev(struct scsi_device *sdev) > { > > scsi_autopm_get_device(sdev); > pm_runtime_get_noresume(&sdev->sdev_gendev); > [...] > scsi_autopm_put_device(sdev); > [...] > } > > static int sd_probe(struct device *dev) > { > [...] > pm_runtime_put_noidle(dev); > scsi_autopm_put_device(sdp); > [...] > } > > This may work (I'm limited by my imagination in scsi layer :) ). I'm not sure about this. To be honest, I did not read the entirety of your last message; it had way too much detail. THere's a time and place for that, but when you're brainstorming to figure out the underlying cause of a problem and come up with a strategy to fix it, you want to concentrate on the overall picture, not the details. As I understand the situation, you've get a SCSI target with multiple logical units, let's say A and B, and you need to make sure that A never goes into runtime suspend unless B is already suspended. In other words, B always has to suspend before A and resume after A. To do this, you register a device link with A as the supplier and B as the consumer. Then the PM core takes care of the ordering for you. But I don't understand when you set up the device link. If the timing is wrong then, thanks to async SCSI probing, you may have a situation where A is registered before B and before the link is set up. Then there's temporarily nothing to stop A from suspending before B. You also need to prevent each device from suspending before it is probed. That's the easy part I was trying to address before (although it may not be so easy if the drivers are in loadable modules and not present in the kernel). You need to think through these issues before proposing actual changes. > But the pm_runtime_put_noidle() would have to be added to all registered > scsi_driver{}, perhaps? Or may be I can check for sdp->type? Like this; it's too early to worry about this sort of thing. Alan Stern