On 01/15/20 at 02:17pm, Khalid Aziz wrote:
> On 1/15/20 11:05 AM, Kairui Song wrote:
> > On Thu, Jan 16, 2020 at 1:31 AM Khalid Aziz <khalid@xxxxxxxxxxxxxx> wrote:
> >>
> >> On 1/13/20 10:07 AM, Kairui Song wrote:
> >>> On Sun, Jan 12, 2020 at 2:33 AM Deepa Dinamani <deepa.kernel@xxxxxxxxx> wrote:
> >>>>
> >>>>> Hi, there are some previous works about this issue, reset PCI devices
> >>>>> in kdump kernel to stop ongoing DMA:
> >>>>>
> >>>>> [v7,0/5] Reset PCIe devices to address DMA problem on kdump with iommu
> >>>>> https://lore.kernel.org/patchwork/cover/343767/
> >>>>>
> >>>>> [v2] PCI: Reset PCIe devices to stop ongoing DMA
> >>>>> https://lore.kernel.org/patchwork/patch/379191/
> >>>>>
> >>>>> And didn't get merged, that patch are trying to fix some DMAR error
> >>>>> problem, but resetting devices is a bit too destructive, and the
> >>>>> problem is later fixed in IOMMU side. And in most case the DMA seems
> >>>>> harmless, as they targets first kernel's memory and kdump kernel only
> >>>>> live in crash memory.
> >>>>
> >>>> I was going to ask the same. If the kdump kernel had IOMMU on, would
> >>>> that still be a problem?
> >>>
> >>> It will still fail, doing DMA is not a problem, it only go wrong when
> >>> a device's upstream bridge is mistakenly shutdown before the device
> >>> shutdown.
> >>>
> >>>>
> >>>>> Also, by the time kdump kernel is able to scan and reset devices,
> >>>>> there are already a very large time window where things could go
> >>>>> wrong.
> >>>>>
> >>>>> The currently problem observed only happens upon kdump kernel
> >>>>> shutdown, as the upper bridge is disabled before the device is
> >>>>> disabledm so DMA will raise error. It's more like a problem of wrong
> >>>>> device shutting down order.
> >>>>
> >>>> The way it was described earlier "During this time, the SUT sometimes
> >>>> gets a PCI error that raises an NMI." suggests that it isn't really
> >>>> restricted to kexec/kdump.
> >>>> Any attached device without an active driver might attempt spurious or
> >>>> malicious DMA and trigger the same during normal operation.
> >>>> Do you have available some more reporting of what happens during the
> >>>> PCIe error handling?
> >>>
> >>> Let me add more info about this:
> >>>
> >>> On the machine where I can reproduce this issue, the first kernel
> >>> always runs fine, and kdump kernel works fine during dumping the
> >>> vmcore, even if I keep the kdump kernel running for hours, nothing
> >>> goes wrong. If there are DMA during normal operation that will cause
> >>> problem, this should have exposed it.
> >>>
> >>
> >> This is the part that is puzzling me. Error shows up only when kdump
> >> kernel is being shut down. kdump kernel can run for hours without this
> >> issue. What is the operation from downstream device that is resulting in
> >> uncorrectable error - is it indeed a DMA request? Why does that
> >> operation from downstream device not happen until shutdown?
> >>
> >> I just want to make sure we fix the right problem in the right way.
> >>
> >
> > Actually the device could keep sending request with no problem during
> > kdump kernel running. Eg. keep sending DMA, and all DMA targets first
> > kernel's system memory, so kdump runs fine as long as nothing touch
> > the reserved crash memory. And the error is reported by the port, when
> > shutdown it has bus master bit, and downstream request will cause
> > error.
> >
>
> Problem really is there are active devices while kdump kernel is
> running. You did say earlier - "And in most case the DMA seems
> harmless, as they targets first kernel's memory and kdump kernel only
> live in crash memory.". Even if this holds today, it is going to break
> one of these days. There is the "reset_devices" option but that does not
> work if driver is not loaded by kdump kernel. Can we try to shut down
> devices in machine_crash_shutdown() before we start kdump kernel?

It is not a good idea :) We do not add extra logic after a panic
because the kernel is not stable and we want a correct vmcore.

Similar suggestions have been rejected many times.

Thanks
Dave
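
For reference, the shutdown-ordering problem discussed in the quoted thread (an upstream bridge losing its bus master bit while a downstream device is still issuing DMA) can be pictured with a toy model. This is only a sketch under stated assumptions: it is plain Python, not kernel code, and the names Port, dma_request, shutdown and build are hypothetical and do not correspond to any kernel API. It assumes, as described in the thread, that a port with bus mastering turned off reports an error for requests arriving from downstream.

```python
# Toy model only: Port, dma_request, shutdown and build are hypothetical
# names used for illustration; none of them is a kernel interface.
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Port:
    name: str
    parent: Optional["Port"] = None
    bus_master: bool = True   # models the bus master bit of the port
    dma_active: bool = True   # only meaningful for endpoints in this model


def dma_request(dev: Port) -> bool:
    """Return True if a request from dev passes every upstream bridge."""
    bridge = dev.parent
    while bridge is not None:
        if not bridge.bus_master:
            return False  # assumed: bridge without bus mastering rejects it
        bridge = bridge.parent
    return True


def shutdown(order: List[Port], endpoints: List[Port]) -> int:
    """Disable ports in the given order and count rejected DMA requests."""
    errors = 0
    for port in order:
        port.bus_master = False
        port.dma_active = False
        # After each step, every endpoint that is still active attempts DMA.
        for ep in endpoints:
            if ep.dma_active and not dma_request(ep):
                errors += 1
    return errors


def build() -> Tuple[Port, Port]:
    """Create a minimal topology: one root port with one endpoint below it."""
    root = Port("root-port")
    nic = Port("nic", parent=root)
    return root, nic


if __name__ == "__main__":
    root, nic = build()
    print("bridge first:", shutdown([root, nic], [nic]), "rejected request(s)")
    root, nic = build()
    print("device first:", shutdown([nic, root], [nic]), "rejected request(s)")
```

In this model, disabling the bridge before the endpoint leaves a window in which the endpoint's request is rejected, while disabling the endpoint first leaves none. It says nothing about what happens before the kdump kernel has enumerated the devices, which is the separate concern raised by Khalid.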