> From: Bart Van Assche <bvanassche@xxxxxxx> > Sent: Tuesday, April 21, 2020 10:02 PM > On 4/21/20 5:17 PM, Dexuan Cui wrote: > > During hibernation, the sdevs are suspended automatically in > > drivers/scsi/scsi_pm.c before storvsc_suspend(), so after > > storvsc_suspend(), there is no disk I/O from the file systems, but there > > can still be disk I/O from the kernel space, e.g. disk_check_events() -> > > sr_block_check_events() -> cdrom_check_events() can still submit I/O > > to the storvsc driver, which causes a panic of NULL pointer dereference, > > since storvsc has closed the vmbus channel in storvsc_suspend(): refer > > to the below links for more info: > > ... > > Fix the panic by blocking/unblocking all the I/O queues properly. > > > > Note: this patch depends on another patch "scsi: core: Allow the state > > change from SDEV_QUIESCE to SDEV_BLOCK" (refer to the second link > above). > > I don't like this patch. It makes the behavior of the storsvc driver > different from every other SCSI LLD. Other SCSI LLDs don't need this > change because these don't destroy I/O queues upon suspend. > > Bart. Upon suspend, I suppose the other LLDs can not accept I/O either, then what do they do when some kernel thread still tries to submit I/O? Do they "block" the I/O until resume, or just return an error immediately? I had a look at drivers/scsi/xen-scsifront.c. It looks this LLD implements a mechanism of marking the device busy upon suspend, so any I/O will immediately get an error SCSI_MLQUEUE_HOST_BUSY. IMO the disadvantage is: the mechanism of marking the device busy looks complex, and the hot path .queuecommand() has to take the spinlock shost->host_lock, which should affect the performance. It looks drivers/scsi/nsp32.c: nsp32_suspend() and drivers/scsi/3w-9xxx.c: twa_suspend() do nothing to handle new I/O after suspend. I doubt this is correct. It looks drivers/scsi/hisi_sas/hisi_sas_v3_hw.c: hisi_sas_v3_suspend() tries to use scsi_block_requests() as a means of blocking new I/O, but I doubt it can work: scsi_block_requests() only sets a variable -- how could this prevent another concurrent kernel thread to submit new I/O without a race condition issue? So it looks to me there is no simple mechanism to handle the scenario here, and I guess that's why the scsi_host_block/unblock APIs are introduced, and actually there is already an user of the APIs: 3d3ca53b1639 ("scsi: aacraid: use scsi_host_(block,unblock) to block I/O"). The aacraid patch says: "This has the advantage that the block layer will stop sending I/O to the adapter instead of having the SCSI midlayer requeueing I/O internally". It looks this may imply that using the new APIs is encouraged? Sorry for my lack of knowledge in Linux SCSI system, but IMHO the new APIs are simple and effective, and they really look like a good fit for the hibernation scenario. BTW, storvsc_suspend() is not a hot path here, so IMO I don't have any performance concern. PS, here storvsc has to destroy and re-construct the I/O queues: the I/O queues are shared memory ringbufers between the guest and the host; in the resume path of the hibernation procedure, the memory pages allocated by the 'new' kernel is different from that allocated by the 'old' kernel, so before jumping to the 'old' kernel, the 'new' kernel must destroy the mapping of the pages, and later after we jump to the 'old' kernel, we'll re-create the mapping using the pages allocated by the 'old' kernel. Here "create the mapping" means the guest tells the host about the physical addresses of the pages. Thanks, -- Dexuan