On Thu, Apr 01, 2010 at 01:11:49PM -0400, Neil Horman wrote:
> On Thu, Apr 01, 2010 at 05:56:43PM +0200, Joerg Roedel wrote:
> > The possible fix will be to enable the hardware earlier in the
> > initialization path.
> >
> That sounds like a reasonable theory, I'll try hack something together
> shortly.

Great. So the problem might be already fixed when I am back in the
office ;-)

> > This would only prevent possible data corruption. When the IOMMU is off
> > the devices will not get a target abort but will only write to different
> > physical memory locations. The window where a target abort can happen
> > starts when the kdump kernel re-enables the IOMMU and ends when the new
> > driver for that device attaches. This is a small window but there is not
> > a lot we can do to avoid this small time window.
> >
> Can you explain this a bit further please? From what I read, when the iommu is
> disabled, AIUI it does no translations. That means that any dma addresses which
> the driver mapped via the iommu prior to a crash that are stored in devices will
> just get strobed on the bus without any translation. If those dma address do
> not lay on top of any physical ram, won't that lead to bus errors, and
> transaction aborts? Worse, if those dma addresses do lie on top of real
> physical addresses, won't we get corruption in various places? Or am I missing
> part of how that works?

Hm, the device address may not be a valid host physical address, that's
true. But the problem with the small time-window when the IOMMU hardware
is re-programmed from the kdump kernel still exists.

I need to think about other possible side-effects of leaving the IOMMU
enabled on shutdown^Wboot into a kdump kernel.

	Joerg
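
To make the three phases discussed above easier to follow, here is a rough toy model in Python. It is only a sketch and not kernel or AMD IOMMU driver code: `ToyIommu`, `DmaResult`, `kdump_timeline` and all addresses are hypothetical names and values invented for illustration. It assumes that a disabled IOMMU passes device addresses through untranslated, and that an enabled IOMMU with no mapping for an address answers with a target abort.

```python
from enum import Enum


class DmaResult(Enum):
    UNTRANSLATED = "untranslated"  # IOMMU off: device address hits the bus as-is
    TRANSLATED = "translated"      # IOMMU on and a mapping exists
    TARGET_ABORT = "target abort"  # IOMMU on but no mapping for this address


class ToyIommu:
    """Hypothetical, heavily simplified IOMMU: one flat table, no devices."""

    def __init__(self):
        self.enabled = False
        self.page_table = {}  # device address -> physical address

    def dma(self, dev_addr):
        if not self.enabled:
            # No translation at all; the access may land on RAM (corruption)
            # or on an address that is not valid host physical memory.
            return DmaResult.UNTRANSLATED, dev_addr
        if dev_addr in self.page_table:
            return DmaResult.TRANSLATED, self.page_table[dev_addr]
        return DmaResult.TARGET_ABORT, None


def kdump_timeline(stale_addr=0x1000, new_addr=0x2000, new_phys=0x200000):
    """Replay the phases after a crash for a device still doing DMA."""
    iommu = ToyIommu()
    events = []

    # Phase 1: kdump kernel is booting, IOMMU is off, the device still
    # uses a DMA address that the crashed kernel had mapped.
    events.append(("iommu off", iommu.dma(stale_addr)[0]))

    # Phase 2: kdump kernel re-enables the IOMMU with fresh, empty tables.
    # This is the window in which the stale address causes a target abort.
    iommu.enabled = True
    events.append(("iommu on, no driver", iommu.dma(stale_addr)[0]))

    # Phase 3: the new driver attaches and maps its own buffers; the device
    # is assumed to be reset and to use only the new address from now on.
    iommu.page_table[new_addr] = new_phys
    events.append(("driver attached", iommu.dma(new_addr)[0]))

    return events


if __name__ == "__main__":
    for phase, result in kdump_timeline():
        print(f"{phase:20s} -> {result.value}")
```

The model leaves out everything that makes the real problem hard (per-device tables, in-flight transactions, the ordering of hardware re-programming); it only shows why the abort window sits between re-enabling the IOMMU and the driver attaching.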