On 11/30/16 at 05:53pm, Baoquan He wrote:
> On 11/30/16 at 05:03pm, Baoquan He wrote:
> > On 11/30/16 at 04:15pm, Xunlei Pang wrote:
> > > On 11/29/2016 at 10:35 PM, Joerg Roedel wrote:
> > > > On Thu, Nov 17, 2016 at 10:47:28AM +0800, Xunlei Pang wrote:
> > > >> As per the comment, the code here only needs to flush context caches
> > > >> for the special domain 0 which is used to tag the
> > > >> non-present/erroneous caches, seems we should flush the old domain id
> > > >> of present entries for kdump according to the analysis, other than the
> > > >> new-allocated domain id. Let me ponder more on this.
> > > > Flushing the context entry only is fine. The old domain-id will not be
> > > > re-used anyway, so there is no point in reading it out of the context
> > > > table and flush it.
> > >
> > > Do you mean to flush the context entry using the new-allocated domain id?
> > >
> > > Yes, old domain-id will not be re-used as they were reserved when copy, but
> > > may still be cached by in-flight DMA access.
> >
> > Joerg is saying you have flushed context entry which is the ingress,
> > new DMA can't get an entrance to hit the iotlb accordingly. Since you
> > have bolted the ingress gate. I guess
>
> OK, talked with Xunlei. The old cache could be an entry with the present
> bit set. And please see the code comment at the bottom of
> iommu_init_domains(), you can see domain 0 is a special domain id.
>
> ~~~~~~~~~~~~~~~~~~~~~~~~~
>         /*
>          * If Caching mode is set, then invalid translations are tagged
>          * with domain-id 0, hence we need to pre-allocate it. We also
>          * use domain-id 0 as a marker for non-allocated domain-id, so
>          * make sure it is not used for a real domain.
>          */
>         set_bit(0, iommu->domain_ids);
> ~~~~~~~~~~~~~~~~~~~~~~~~~
>
> And in the VT-d spec, at the end of section 6.2.2 and the following
> sections, you can see domain 0 is used to tag the cached entry.
>
> I guess that's why it works with only domain 0 specified.
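
To make the reservation in the quoted comment concrete, here is a minimal userspace sketch of a domain-id bitmap in which id 0 is pre-allocated so it is never handed out to a real domain. It is not the kernel code: `struct toy_iommu` and all `toy_*` helpers are hypothetical names invented for this illustration, and only the idea of `set_bit(0, ...)` comes from the quoted snippet.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

#define TOY_BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical stand-in for the per-IOMMU state; not the kernel struct. */
struct toy_iommu {
	unsigned long *domain_ids;	/* one bit per domain-id, 1 = in use */
	unsigned int ndomains;		/* number of domain-ids supported */
};

static void toy_set_bit(unsigned int nr, unsigned long *map)
{
	map[nr / TOY_BITS_PER_WORD] |= 1UL << (nr % TOY_BITS_PER_WORD);
}

static int toy_test_bit(unsigned int nr, const unsigned long *map)
{
	return (int)((map[nr / TOY_BITS_PER_WORD] >> (nr % TOY_BITS_PER_WORD)) & 1UL);
}

/* Allocate the bitmap and reserve domain-id 0, mirroring set_bit(0, ...). */
static int toy_init_domains(struct toy_iommu *iommu, unsigned int ndomains)
{
	size_t words = (ndomains + TOY_BITS_PER_WORD - 1) / TOY_BITS_PER_WORD;

	assert(iommu != NULL && ndomains > 0);
	iommu->domain_ids = calloc(words, sizeof(unsigned long));
	if (!iommu->domain_ids)
		return -1;
	iommu->ndomains = ndomains;
	toy_set_bit(0, iommu->domain_ids);	/* id 0 is never a real domain */
	return 0;
}

/* Return the lowest free domain-id, or -1 if the bitmap is full. */
static int toy_alloc_domain_id(struct toy_iommu *iommu)
{
	unsigned int id;

	for (id = 0; id < iommu->ndomains; id++) {
		if (!toy_test_bit(id, iommu->domain_ids)) {
			toy_set_bit(id, iommu->domain_ids);
			return (int)id;
		}
	}
	return -1;
}
```

With this sketch, the first call to `toy_alloc_domain_id()` after `toy_init_domains()` yields 1 rather than 0, which is the property the comment asks for.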
> The simple thing to verify that is to specify another DID, e.g. 100, for
> your flushing and see if it still works.
>
> > So, if it's just as above, v1 should be good enough.
> > Besides, you should use translation_pre_enabled(). If the 1st kernel adds
> > intel_iommu=off, there is no need to do this.
> >
> > Thanks
> > Baoquan
> >
> > >
> > > Here is what the things seem to be from my understanding, and why I want to
> > > flush using the old domain id:
> > > 1) In kdump mode, old tables are copied, and all the iommu caches are flushed.
> > > 2) There comes some in-flight DMA before the device's new context is mapped,
> > >    so translation caches (context, iotlb, etc.) are created tagging the old
> > >    domain-id in the iommu hardware.
> > > 3) At the driver probe stage, the device is reset, and no in-flight DMA will
> > >    exist. Here I assumed that the device reset won't flush the old caches in
> > >    the iommu hardware related to this device. I haven't found any relevant
> > >    specification, please correct me if I am wrong.
> > > 4) Then the new context is set up, and new DMA is initiated, hitting the old
> > >    cache that was created in 2) as currently there's no such flush action,
> > >    so a DMAR fault happens.
> > >
> > > I already posted v2 to flush context/iotlb using the old domain-id:
> > > https://lkml.org/lkml/2016/11/18/514
> > >
> > > Regards,
> > > Xunlei
> > >
> > > > Also, please add a Fixes-tag when you re-post this patch.
> > > >
> > > >
> > > > 	Joerg
> > > >
> > >
> > > _______________________________________________
> > > kexec mailing list
> > > kexec at lists.infradead.org
> > > http://lists.infradead.org/mailman/listinfo/kexec
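
The four-step scenario quoted above can be pictured with a toy model of a cache whose entries are tagged with a domain-id. This is only a sketch of the reasoning, not a description of VT-d hardware: whether real hardware keeps such entries across a device reset is exactly the open question in the thread. `struct toy_cache`, the `toy_*` functions, and the `devfn` field are hypothetical names made up for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_CACHE_SLOTS 8

/* One cached translation, tagged with the domain-id in use when it was filled. */
struct toy_entry {
	bool valid;
	unsigned int did;	/* domain-id tag */
	unsigned int devfn;	/* hypothetical device identifier */
};

struct toy_cache {
	struct toy_entry slot[TOY_CACHE_SLOTS];
};

/* Step 2: in-flight DMA creates an entry tagged with the old domain-id. */
static bool toy_cache_fill(struct toy_cache *c, unsigned int devfn,
			   unsigned int did)
{
	size_t i;

	assert(c != NULL);
	for (i = 0; i < TOY_CACHE_SLOTS; i++) {
		if (!c->slot[i].valid) {
			c->slot[i].valid = true;
			c->slot[i].did = did;
			c->slot[i].devfn = devfn;
			return true;
		}
	}
	return false;	/* cache full */
}

/* Domain-selective flush: drops only entries tagged with 'did'. */
static unsigned int toy_flush_domain(struct toy_cache *c, unsigned int did)
{
	unsigned int dropped = 0;
	size_t i;

	assert(c != NULL);
	for (i = 0; i < TOY_CACHE_SLOTS; i++) {
		if (c->slot[i].valid && c->slot[i].did == did) {
			c->slot[i].valid = false;
			dropped++;
		}
	}
	return dropped;
}

/* Step 4: does new DMA from 'devfn' still find a cached entry? */
static bool toy_cache_hit(const struct toy_cache *c, unsigned int devfn)
{
	size_t i;

	assert(c != NULL);
	for (i = 0; i < TOY_CACHE_SLOTS; i++) {
		if (c->slot[i].valid && c->slot[i].devfn == devfn)
			return true;
	}
	return false;
}
```

In this model, an entry filled under an old domain-id survives a flush issued with a newly allocated id and is only dropped by a flush that names the old id. That is the behaviour Xunlei's analysis assumes, and it is also why Baoquan's suggested experiment (flushing with an unrelated DID such as 100) would help tell the explanations apart.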