Vivek Goyal <vgoyal at redhat.com> writes:

> On Fri, Aug 22, 2008 at 04:48:10PM -0700, Eric W. Biederman wrote:
>>
>> Hmm.  Thinking about this we actually have 2 problems.
>> - Communication about what is going on.
>> - How to handle an iommu in the event of a crash dump scenario.
>>
>> The current solution is to ignore the iommu, and use swiotlb.  This
>> solution does not look like it will work for future iommus.
>>
>
> Does setting up of swiotlb require iommu to be disabled in second kernel?

Not precisely.  But in a full iommu all accesses go through the iommu,
and the iommus start becoming per bus.  So in practice we either need to
disable full iommus or work with them.

> IOW, can swiotlb work reliably given the fact that iommu is active and
> there are some active mappings (as created by first kernel).
>
> I am thinking is there a possibility that I set a DMA using swiotlb and the
> physical address can overlap with IO address setup in IOMMU and that DMA might
> go to a different buffer altogether.

Yes.  Which is why I would very much prefer to reserve some IOMMU entries
instead of turning off an iommu altogether.

>> The original plan (and it still sounds like a good one) was to reserve
>> a section of the iommu (as we do for the physical memory).  So we
>> could have addresses that are only used for the crash dump kernel.  Then
>> have the crash dump kernel just use that section of the iommu.
>>
>
> This would also require that second kernel keeps using first kernel's
> iommu settings/tables and not try to initialize the iommu freshly.

Not completely anyway.

> One patch from Chandru is now mainline which seems to be solving the issue
> for calgary IOMMU. He seems to be re-using first kernel's iommu tables
> in second kernel hence avoiding re-initializing iommu and avoiding MCE.
>
> git commit 95b68dec0d52c7b8fea3698b3938cf3ab936436b
>
> This patch has the risk that second kernel might not find any free entries
> to setup DMA and that's why reserving a section of iommu will help.

Yes.  That, and we know there aren't any pending DMAs going to
mis-setup entries.

>> Either we need to do that or we need to disable the iommu, before we
>> use swiotlb.
>>
>
> I thought disabling iommu was not an option as it leads to MCE if there is
> a DMA going on.

Good point.  Looks like I oversimplified.

>> The problem is we can not reliably kill on-going DMA transactions
>> at the time of a kernel panic, and likely doing so would greatly
>> decrease our kernel reliability.
>
> Maybe re-using iommu tables in second kernel along with reserving some
> entries for kdump is the way to go..

That is the best plan we have been able to come up with.  Making AMD's iommu
look more like a full strength iommu should help reinforce that model.

Eric
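
For illustration only, the reservation idea discussed in this thread can be
modelled as a table whose tail is never handed out by the first kernel.  The
sketch below is a toy user-space model, not kernel code and not taken from
any posted patch: the struct, the function names, and both constants
(TABLE_ENTRIES, KDUMP_RESERVED) are hypothetical.  It assumes the crash
kernel inherits the first kernel's table unchanged and allocates only from
the reserved range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TABLE_ENTRIES   64  /* hypothetical size of the translation table */
#define KDUMP_RESERVED   8  /* hypothetical tail kept free for the crash kernel */

/* Toy model: one flag per translation entry, true once it is mapped. */
struct iommu_table_model {
    bool in_use[TABLE_ENTRIES];
};

/* Claim the first free entry in [lo, hi); return its index, or -1 if full. */
static int alloc_in_range(struct iommu_table_model *t, size_t lo, size_t hi)
{
    assert(lo <= hi && hi <= TABLE_ENTRIES);
    for (size_t i = lo; i < hi; i++) {
        if (!t->in_use[i]) {
            t->in_use[i] = true;
            return (int)i;
        }
    }
    return -1;
}

/* First kernel: allocates everywhere except the reserved tail. */
static int first_kernel_alloc(struct iommu_table_model *t)
{
    return alloc_in_range(t, 0, TABLE_ENTRIES - KDUMP_RESERVED);
}

/*
 * Crash kernel: leaves the inherited entries alone (DMA started by the
 * first kernel may still be in flight) and uses only the reserved tail.
 */
static int kdump_kernel_alloc(struct iommu_table_model *t)
{
    return alloc_in_range(t, TABLE_ENTRIES - KDUMP_RESERVED, TABLE_ENTRIES);
}
```

In this model, even when the first kernel has exhausted its share of the
table, kdump_kernel_alloc() still finds free entries, and none of them can
alias a mapping the first kernel created.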