> From: linux-hyperv-owner@xxxxxxxxxxxxxxx > <linux-hyperv-owner@xxxxxxxxxxxxxxx> On Behalf Of Dexuan Cui > Sent: Saturday, April 11, 2020 10:20 AM > To: Bart Van Assche <bvanassche@xxxxxxx>; Ming Lei > <tom.leiming@xxxxxxxxx>; Martin K. Petersen <martin.petersen@xxxxxxxxxx> > Cc: linux-scsi@xxxxxxxxxxxxxxx; Christoph Hellwig <hch@xxxxxx>; > linux-hyperv@xxxxxxxxxxxxxxx; Long Li <longli@xxxxxxxxxxxxx>; vkuznets > <vkuznets@xxxxxxxxxx>; Michael Kelley <mikelley@xxxxxxxxxxxxx>; KY > Srinivasan <kys@xxxxxxxxxxxxx>; Olaf Hering <olaf@xxxxxxxxx>; Stephen > Hemminger <sthemmin@xxxxxxxxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx > Subject: RE: SCSI low level driver: how to prevent I/O upon hibernation? > > > From: Bart Van Assche <bvanassche@xxxxxxx> > > Sent: Saturday, April 11, 2020 8:03 AM > > > > On 2020-04-10 23:01, Dexuan Cui wrote: > > > Please let me know if the change to scsi_device_set_state() is OK. > > > > Hadn't Ming Lei already root-caused this issue for you? From his e-mail: > > "So you can't free related vmbus ringbuffer cause BLK_MQ_REQ_PREEMPT > > request is still to be handled." > > > > Please follow that advice. > > > > Bart. > > Hi Bart, Ming, > I agree Ming has root-caused the issue, but it looks the advice can not > apply to the hibernation scenario. :-) > > Sorry for my lack of knowledge of the complex SCSI subsystems -- could > you please elaborate on what a low level SCSI device driver (like hv_storvsc) > should do to safely save/restore the device state upon hibernation? > > The nature of "free related vmbus ringbuffer" in hv_storvsc is that: the > driver can not handle any I/O after the device is quiesced in > software_resume() -> load_image_and_restore() -> hibernation_restore() > -> dpm_suspend_start() -> ... -> storvsc_suspend(). BTW, after the SCSI > device is quiesced, the hibernation's resume path also quiesces other > devices, disables non-boot CPUs, and finally jumps to the old kernel's > entry point where the old kernel was suspended, and the old kernel will > resume back. > > My intuition is that the upper level SCSI layer should provide an API to > flush any pending I/O and block any new I/O after a SCSI device is > "quiesced"? -- it looks scsi_host_block()/scsi_host_unblock() are such > APIs, which are already used by > drivers/scsi/aacraid/linit.c: aac_suspend()/aac_resume(). > > That's why I proposed the patch of the same thing for hv_storvsc, and > it looks the patch works for me: without the patch I can easily hit the > panic I reported in the first email; with the patch, I have successfully > done more than 30 rounds of hibernation without the panic. > > However, it looks you implied my intuition is wrong and it's *expected* > that the upper level SCSI layer can still submit I/O requests with the > BLK_MQ_REQ_PREEMPT flag after the SCSI device is "quiesced"? > > If this is the case, then how is hv_storvsc supposed to handle the I/O > after the SCSI device is quiesced? I can keep the related vmbus ringbuffer, > but the real issue is: the driver is unable to handle any I/O at all since the > vmbus connection to the Hyper-V host is disconnected soon, after > the SCSI device is quiesced. Should hv_storvsc return an error for such I/O, > or block such I/O until the SCSI device is resumed? -- These don't look > good to me, and I really think the upper level SCSI layer should provide > an API to block any new I/O after a SCSI device is "quiesced" -- again, can > you please clarify if scsi_host_block()/scsi_host_unblock() are such APIs? > > Looking forward to your replies! > > Thanks, > -- Dexuan It looks scsi_host_block() and scsi_host_unblock() are just the APIs I need. I have run my hibernation test with the APIs more than 1000 rounds and the VM is still running fine without the panic I reported previously when I didn't use the APIs. FYI: the aacraid driver is the first user of the APIs: commit 3d3ca53b163914c1397289d0c2ee6d2f52362dcc Author: Hannes Reinecke <hare@xxxxxxx> Date: Fri Feb 28 08:53:14 2020 +0100 scsi: aacraid: use scsi_host_(block,unblock) to block I/O Use scsi_host_block() and scsi_host_unblock() instead of scsi_block_requests()/scsi_unblock_requests() to block and unblock I/O. This has the advantage that the block layer will stop sending I/O to the adapter instead of having the SCSI midlayer requeueing I/O internally. Link: https://lore.kernel.org/r/20200228075318.91255-10-hare@xxxxxxx Reviewed-by: Christoph Hellwig <hch@xxxxxx> Acked-by: Balsundar P < Balsundar.P@xxxxxxxxxxxxx> Signed-off-by: Hannes Reinecke <hare@xxxxxxx> Signed-off-by: Martin K. Petersen <martin.petersen@xxxxxxxxxx> BTW, I suspect aac_suspend() -> scsi_host_block() is not relly working because scsi_host_block() doesn't allow a state transition from SDEV_QUIESCE to SDEV_BLOCK and returns -EINVAL for that. aac_suspend() should check the return value of scsi_host_block(), and my change to scsi_device_set_state() should be needed to avoid the -EINVAL error. If there is no objection, I plan to send some patches later. Thanks, -- Dexuan