On 04/07/2015 05:08 PM, Dave Young wrote:
> On 04/07/15 at 11:46am, Dave Young wrote:
>> On 04/05/15 at 09:54am, Baoquan He wrote:
>>> On 04/03/15 at 05:21pm, Dave Young wrote:
>>>> On 04/03/15 at 05:01pm, Li, ZhenHua wrote:
>>>>> Hi Dave,
>>>>>
>>>>> There may be some possibilities that the old iommu data is corrupted by
>>>>> some other modules. Currently we do not have a better solution for the
>>>>> dmar faults.
>>>>>
>>>>> But I think when this happens, we need to fix the module that corrupted
>>>>> the old iommu data. I once met a similar problem in the normal kernel: the
>>>>> queue used by the qi_* functions was overwritten by another module.
>>>>> The fix was in that module, not in the iommu module.
>>>>
>>>> It is too late, there will be no chance to save vmcore then.
>>>>
>>>> Also, if using the old iommu tables can continue to corrupt other areas
>>>> of oldmem, then it will cause more problems.
>>>>
>>>> So I think the tables at least need some verification before being used.
>>>>
>>>
>>> Yes, it's good to think about this, and verification is also an
>>> interesting idea. kexec/kdump do a sha256 calculation on the loaded kernel
>>> and then verify it again in purgatory when a panic happens. This checks
>>> whether any code stomped into the region reserved for kexec/kernel and
>>> corrupted the loaded kernel.
>>>
>>> If it is decided to do this, it should be an enhancement to the current
>>> patchset, not a change of approach. Since this patchset is getting very
>>> close to what the maintainers expected, maybe it can be merged first,
>>> and then we can think about the enhancement. After all, without this
>>> patchset vt-d often raised error messages and hung.
>>
>> It does not convince me, we should do it right at the beginning instead of
>> introducing something wrong.
>>
>> I wonder why the old dma can not be remapped to a specific page in the kdump
>> kernel so that it will not corrupt more memory. But I may have missed
>> something, I will look for old threads and catch up.
>
> I have read the old discussion; the approach above was dropped because it
> could corrupt the filesystem. Apologies for commenting late.
>
> But the current solution sounds bad to me because it uses old memory, which
> is not reliable.
>
> Thanks
> Dave
>
It seems we do not have a better solution for the dmar faults. But I believe we
can find out how to verify the iommu data which is located in old memory.

Thanks
Zhenhua
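
For reference, the digest-then-verify check described above for kexec/purgatory can be illustrated with a minimal user-space sketch. This is not kernel code and not part of the patchset. The names record_digest, verify_region and table are hypothetical, and the byte buffer is only a stand-in for a table living in old memory. It assumes a point in time at which the data can still be trusted, which is the open question for iommu tables that change at runtime.

```python
import hashlib


def record_digest(region: bytes) -> bytes:
    """Compute a SHA-256 digest while the region is still trusted."""
    return hashlib.sha256(region).digest()


def verify_region(region: bytes, expected: bytes) -> bool:
    """Recompute the digest and compare it before the data is reused."""
    return hashlib.sha256(region).digest() == expected


# Hypothetical stand-in for a table located in old memory.
table = bytearray(4096)

# Digest recorded at a trusted point (for kexec: when the kernel is loaded).
saved = record_digest(bytes(table))
ok_before = verify_region(bytes(table), saved)

# Simulate a stray write from another module flipping one byte.
table[128] ^= 0xFF
ok_after = verify_region(bytes(table), saved)
```

A check like this only detects corruption that happens after the digest was recorded; it does not say which module did the write, and it does not repair the data.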