On Tue, Sep 26, 2017 at 12:01:03PM +0800, Ming Lei wrote:
> On Mon, Sep 25, 2017 at 11:09:15PM +0000, Bart Van Assche wrote:
> > On Tue, 2017-09-26 at 07:04 +0800, Ming Lei wrote:
> > > On Mon, Sep 25, 2017 at 01:29:18PM -0700, Bart Van Assche wrote:
> > > > Some people use the md driver on laptops and use the suspend and
> > > > resume functionality. Since it is essential that submitting of
> > > > new I/O requests stops before device quiescing starts, make the
> > > > md resync and reshape threads freezable.
> > >
> > > As I explained, if SCSI quiesce is safe, this patch shouldn't
> > > be needed.
> > >
> > > The issue isn't MD specific, and in theory can be triggered
> > > on all devices. And you can see the I/O hang report on BTRFS(RAID)
> > > without MD involved:
> > >
> > > https://marc.info/?l=linux-block&m=150634883816965&w=2
> >
> > What makes you think that this patch is not necessary once SCSI quiesce
> > has been made safe? Does this mean that you have not tested suspend and
>
> If we want to make SCSI quiesce safe, we have to drain up all submitted
> I/O and prevent new I/O from being submitted, that is enough to deal
> with MD's resync too.
>
> > resume while md RAID 1 resync was in progress? This patch is necessary
> > to avoid that suspend locks up while md RAID 1 resync is in progress.
>
> I tested my patchset on RAID10 when resync in progress, not see any
> issue during suspend/resume, without any MD's change. I will test
> RAID1's later, but I don't think there is difference compared with
> RAID10 because my patchset can make the queue being quiesced totally.

I am pretty sure that suspend/resume survives while a resync is in
progress on RAID1 with my patchset applied, without any MD change.

There are also reports of suspend/resume issues on btrfs(RAID) and in
the revalidate path of scsi_transport_spi devices, so again, the issue
isn't MD specific.

If your patchset depends on this MD change, something is probably wrong
in the following patches.
Now I need to take a close look.

-- 
Ming