On Fri, 2010-07-30 at 13:08 -0700, Eric W. Biederman wrote:
> The issue is what happens if you take an IOMMU page fault during
> between shutdown and restart. I seem to remember an IOMMU page fault
> triggering a machine check on AMD cpus. So maybe it works but my gut
> impression is simply leaving the IOMMU in a state that is on but not
> responding could actually make a reboot or kexec less stable than having
> on-going DMAs stomping on memory. If you can leave it on, without
> translations and not trapping to software that is a different story.

Speaking of the Intel IOMMU, I know nothing of any 'on but not
responding' state. You have:

 - 'off', which gives a 1:1 mapping and thus if you do this during
   kexec any still-running devices could be scribbling *anywhere* in
   memory, using their previously-allocated virtual DMA addresses
   which are now interpreted as physical addresses.

 - 'on with page tables cleared', in which case you are safe but some
   devices might get upset when their DMA is aborted, so their driver
   needs not to be a pile of shit, and needs to recover from that.

 - 'on and we preserve the virt->phys mappings of the previous kernel',
   which is just crack-inspired. You'd have to find the physical pages
   which were mapped by the previous kernel and steal them away from
   the new kernel's memory map, just in case they get scribbled on by
   a device which hasn't been properly shut down by the previous
   kernel, through the still-extant DMA mappings.

I mention the latter only because it's been suggested by someone who
was dealing with a broken driver/hardware combination where it *didn't*
get properly reset after a fault, even when the driver was loaded anew.
Not because anyone in their right mind would ever *do* it.

-- 
dwmw2