On Fri, Jun 25, 2021 at 05:30:27PM +0800, Huacai Chen wrote:
> Use separate remove()/shutdown() callback, and don't disable PCI device
> during shutdown. This can avoid some poweroff/reboot failures.
>
> The poweroff/reboot failures could easily be reproduced on Loongson
> platforms. I think this is not a Loongson-specific problem, instead, is
> a problem related to some specific PCI hosts. On some x86 platforms,
> radeon/amdgpu devices can cause the same problem [1][2], and commit
> faefba95c9e8ca3a ("drm/amdgpu: just suspend the hw on pci shutdown")
> can resolve it.
>
> As Tiezhu said, this occasionally shutdown or reboot failure is due to
> clear PCI_COMMAND_MASTER on the device in do_pci_disable_device() [3].
>
> static void do_pci_disable_device(struct pci_dev *dev)
> {
>         u16 pci_command;
>
>         pci_read_config_word(dev, PCI_COMMAND, &pci_command);
>         if (pci_command & PCI_COMMAND_MASTER) {
>                 pci_command &= ~PCI_COMMAND_MASTER;
>                 pci_write_config_word(dev, PCI_COMMAND, pci_command);
>         }
>
>         pcibios_disable_device(dev);
> }
>
> When remove "pci_command &= ~PCI_COMMAND_MASTER;", it can work well when
> shutdown or reboot. The root cause on Loongson platform is that CPU is
> still writing data to framebuffer while poweroff/reboot, and if we clear
> Bus Master Bit at this time, CPU will wait ack from device, but never
> return, so a hardware deadlock happens.

Doesn't make sense yet.  Bus Master enables the *device* to do DMA.
A CPU can do MMIO to a device, e.g., to write data to a framebuffer,
regardless of the state of Bus Master Enable.

Also, those MMIO writes done by a CPU are Memory Write transactions on
PCIe, which are "Posted" Requests, which means they do not receive
acks.  So this cannot be the root cause.

Bjorn
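
To illustrate the distinction drawn in the reply, here is a minimal
user-space sketch of the two PCI_COMMAND bits involved.  It is only a
model on a plain integer: it touches no hardware and is not kernel code.
The bit values match include/uapi/linux/pci_regs.h; the helper names
(cpu_mmio_decoded, device_dma_allowed, clear_bus_master, run_demo) are
hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Bit values match include/uapi/linux/pci_regs.h. */
#define PCI_COMMAND_MEMORY 0x2 /* device responds to Memory space accesses */
#define PCI_COMMAND_MASTER 0x4 /* device may initiate transactions (DMA) */

/* Hypothetical helper: does the device decode CPU MMIO accesses to its BARs? */
static bool cpu_mmio_decoded(uint16_t cmd)
{
	return (cmd & PCI_COMMAND_MEMORY) != 0;
}

/* Hypothetical helper: may the device itself issue DMA requests? */
static bool device_dma_allowed(uint16_t cmd)
{
	return (cmd & PCI_COMMAND_MASTER) != 0;
}

/* Same bit manipulation as do_pci_disable_device(), applied to a plain value. */
static uint16_t clear_bus_master(uint16_t cmd)
{
	return (uint16_t)(cmd & ~PCI_COMMAND_MASTER);
}

/* Hypothetical demo: show the model state before and after the clear. */
void run_demo(void)
{
	uint16_t cmd = PCI_COMMAND_MEMORY | PCI_COMMAND_MASTER;

	printf("before: mmio=%d dma=%d\n",
	       cpu_mmio_decoded(cmd), device_dma_allowed(cmd));

	cmd = clear_bus_master(cmd);

	printf("after:  mmio=%d dma=%d\n",
	       cpu_mmio_decoded(cmd), device_dma_allowed(cmd));

	/* Clearing Bus Master stops device DMA but leaves MMIO decode alone. */
	assert(cpu_mmio_decoded(cmd));
	assert(!device_dma_allowed(cmd));
}
```

Calling run_demo() from a main() shows that only the DMA side changes
when the Bus Master bit is cleared; in this model, CPU MMIO decode is
governed by the separate Memory Space Enable bit.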