* David Woodhouse (dwmw2@xxxxxxxxxxxxx) wrote:
> On Fri, 2011-11-11 at 15:49 -0700, Alex Williamson wrote:
> > To fix this, switch domain_update_iommu_coherency() to use the
> > safer, non-coherent default for domains not attached to iommus.
>
> That isn't a fix for the problem you described.
>
> The problem is that changing a domain from coherent to non-coherent is
> *broken*. It probably needs to flush the cache for the *entire* set of
> page tables -- not just the new context entry it adds.

For a guest domain, the page tables aren't actually changing. And for
the snoop mode change, we remap the pages.

> You might have removed the *common* case where we trigger that bug, but
> it certainly isn't a fix.
>
> However, I'd be receptive to an argument that the situation you describe
> is in fact the *only* time we'd have to switch from coherent to
> non-coherent at run time, because the coherency is an all-or-nothing
> characteristic of the chipset. Either all the IOMMUs are coherent, or
> none of them, right? This brain-damage only affects the first chipsets
> from before we worked out that cache incoherency was a *really* f*cking
> stupid idea, doesn't it?

Dunno if it exists going forward (I've stopped being surprised by the
brain damage in this area ;), but those machines are still out there.

> So if you were to ditch the whole idea of a per-domain runtime update,
> and instead calculate a global value for 'iommu_coherency' at boot time,
> by iterating over for_each_active_iommu()¹, I think that would be a
> better way to deal with the issue. And you *could* really call that a
> 'fix'.
>
> Make sense?

Ideally, yes. Not sure we can practically do it though. Would have to
be sure we force incoherent access mode for the busted hw.

thanks,
-chris
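
For reference, the boot-time calculation suggested above could look roughly like the sketch below. This is a userspace model only, not kernel code: struct model_iommu, model_ecap_coherent() and global_iommu_coherency() are hypothetical names made up for illustration, and a plain array loop stands in for for_each_active_iommu(). It assumes coherency is reported in bit 0 of each unit's extended capability register, and that the result should fall back to non-coherent when no active unit is found.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for per-IOMMU state; only the fields used here. */
struct model_iommu {
	uint64_t ecap;   /* extended capability register value */
	bool active;     /* false models a unit that is ignored/disabled */
};

/* Assumption: bit 0 of ecap reports coherent page-table walks. */
static bool model_ecap_coherent(uint64_t ecap)
{
	return (ecap & 0x1) != 0;
}

/*
 * Compute a single global coherency value once, instead of updating it
 * per domain at run time. The result is coherent only if every active
 * unit is coherent. With no active units, return the safer
 * non-coherent default.
 */
static bool global_iommu_coherency(const struct model_iommu *units, size_t n)
{
	bool seen_active = false;

	assert(units != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (!units[i].active)
			continue;	/* models skipping inactive units */
		seen_active = true;
		if (!model_ecap_coherent(units[i].ecap))
			return false;	/* one incoherent unit decides it */
	}
	return seen_active;
}
```

Because the value is computed once and never flips afterwards, the coherent to non-coherent transition discussed above would not arise at run time. Whether that is practical for the affected hardware is the open question in this thread.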