On Thu, Apr 01, 2010 at 10:47:36AM -0400, Neil Horman wrote:
> On Thu, Apr 01, 2010 at 04:29:02PM +0200, Joerg Roedel wrote:
> > I am back in the office next Tuesday and will look into this problem too.
> >
> Thank you.

Just took a look and I think the problem is that the devices are
attached to domains before the IOMMU hardware is enabled. This happens
in the function prealloc_protection_domains(). The attach code issues
the dte-invalidate commands, but they are not executed because the
hardware is off. I will verify this when I have access to hardware
again.
The possible fix will be to enable the hardware earlier in the
initialization path.

> > Right. The default for all devices is to forbid DMA.
> >
> Thanks, glad to know I read that right, took me a bit to understand it :)

I should probably add a comment :-)

> > That's indeed true. I have seen that with ixgbe cards for example. They
> > seem to be really confused after a target abort.
> >
> Yeah, this part worries me, target aborts lead to various brain dead hardware
> pieces. What are your thoughts on leaving the iommu on through a reboot to avoid
> this issue (possibly resetting any pci device that encounters a target abort, as
> noted in the error log on the iommu)?

This would only prevent possible data corruption. When the IOMMU is off,
the devices will not get a target abort but will only write to different
physical memory locations. The window where a target abort can happen
starts when the kdump kernel re-enables the IOMMU and ends when the new
driver for that device attaches. This is a small window, but there is not
a lot we can do to avoid it.

	Joerg
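
To make the suspected ordering problem concrete, here is a toy userspace
model of it. This is only a sketch and not the driver code: every name in
it (toy_iommu, toy_attach_device, toy_stale_entries, ...) is hypothetical,
and it assumes that enabling the hardware does not flush cached device
table entries by itself. It only shows that an invalidate command issued
while the hardware is off never takes effect.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define NUM_DEVICES 4

/* Toy model only: all names are hypothetical, not the kernel's. */
struct toy_iommu {
	bool enabled;               /* does the hardware process commands? */
	int  dte_mem[NUM_DEVICES];  /* device table in memory (domain per device) */
	int  dte_cache[NUM_DEVICES];/* entries the hardware has cached */
};

/* State left by the previous kernel: hardware off, cache holds old entries. */
static void toy_init_stale(struct toy_iommu *iommu)
{
	iommu->enabled = false;
	for (size_t i = 0; i < NUM_DEVICES; i++) {
		iommu->dte_mem[i] = 0;
		iommu->dte_cache[i] = 99; /* old domain from before the crash */
	}
}

/* The invalidate command is only executed while the hardware is on. */
static void toy_invalidate_dte(struct toy_iommu *iommu, size_t dev)
{
	if (!iommu->enabled)
		return; /* command is never executed */
	iommu->dte_cache[dev] = iommu->dte_mem[dev];
}

/* Attach: update the in-memory entry, then ask the hardware to re-read it. */
static void toy_attach_device(struct toy_iommu *iommu, size_t dev, int domain)
{
	assert(dev < NUM_DEVICES);
	iommu->dte_mem[dev] = domain;
	toy_invalidate_dte(iommu, dev);
}

/* Assumption of this model: enabling does not flush the cache. */
static void toy_enable(struct toy_iommu *iommu)
{
	iommu->enabled = true;
}

/*
 * Run the init sequence in one of the two orders and return how many
 * cached entries still disagree with the in-memory device table.
 */
int toy_stale_entries(bool enable_first)
{
	struct toy_iommu iommu;
	int stale = 0;

	toy_init_stale(&iommu);
	if (enable_first)
		toy_enable(&iommu);
	for (size_t dev = 0; dev < NUM_DEVICES; dev++)
		toy_attach_device(&iommu, dev, 1 + (int)dev);
	if (!enable_first)
		toy_enable(&iommu);

	for (size_t dev = 0; dev < NUM_DEVICES; dev++)
		if (iommu.dte_cache[dev] != iommu.dte_mem[dev])
			stale++;
	return stale;
}

#ifdef TOY_IOMMU_MAIN
int main(void)
{
	printf("attach before enable: %d stale\n", toy_stale_entries(false));
	printf("enable before attach: %d stale\n", toy_stale_entries(true));
	return 0;
}
#endif
```

Building with -DTOY_IOMMU_MAIN gives a standalone program that runs both
orders. In this model only the order "enable, then attach" leaves no
stale entries, which is the idea behind enabling the hardware earlier in
the initialization path.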