On Fri Apr 30, 2021 at 3:51 PM CDT, Bjorn Helgaas wrote:
> Please make your subject line match ffb0863426eb ("PCI: Disable
> Samsung SM961/PM961 NVMe before FLR")

Understood, I will send a revision ASAP.

> There's always the possibility that we are doing something wrong in
> Linux *after* the FLR, e.g., not waiting long enough, not
> reinitializing something correctly, etc.

In my testing, I was not able to get my particular drive to enter this
state by issuing various types of resets purely from the Linux host.
The issue only appeared when I passed the device to a KVM guest *and
allowed that guest to shut down cleanly.* The last part is crucial: if
the guest is forcibly powered off, Linux is able to reset the drive
just fine.

So I suspect the issue is an interaction between whatever state the
guest leaves the NVMe drive in and the Linux kernel's own reset code,
which together trigger some pathological behavior in the controller.

FWIW, even a remove/rescan with an interim suspend to RAM was not
enough to unfreeze the controller. The only way I've found to get the
device back (apart from this patch) is a full reboot.