On 26/09/17 13:21, Harsh Jain wrote:
> Find attached new set of log. After repeated tries it panics.

Thanks, that makes things a bit clearer - looks like fixing the physical
address/pteval calculation to not be off by a page in one direction
wasn't helping much because the returned DMA address is actually also
off by a page in the other direction, and thus overflowing past the
allocated IOVA into whoever else's mapping happened to be there;
complete carnage ensues.

After another look through the intel_map_sg() path, here's my second
(still completely untested) guess at a possible fix.

Robin.

----->8-----
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 6784a05dd6b2..d7f7def81613 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -2254,10 +2254,12 @@ static int __domain_mapping(struct dmar_domain *domain, unsigned long iov_pfn,
 		uint64_t tmp;
 
 		if (!sg_res) {
+			size_t off = sg->offset & ~PAGE_MASK;
+
 			sg_res = aligned_nrpages(sg->offset, sg->length);
-			sg->dma_address = ((dma_addr_t)iov_pfn << VTD_PAGE_SHIFT) + sg->offset;
+			sg->dma_address = ((dma_addr_t)iov_pfn << VTD_PAGE_SHIFT) + off;
 			sg->dma_length = sg->length;
-			pteval = page_to_phys(sg_page(sg)) | prot;
+			pteval = (page_to_phys(sg_page(sg)) + sg->offset - off) | prot;
 			phys_pfn = pteval >> VTD_PAGE_SHIFT;
 		}
 
>
>
> On 26-09-2017 09:16, Harsh Jain wrote:
>> On 26-09-2017 00:16, Casey Leedom wrote:
>>> | From: Raj, Ashok <ashok.raj@xxxxxxxxx>
>>> | Sent: Monday, September 25, 2017 8:54 AM
>>> |
>>> | Not sure how the page->offset would end up being greater than page-size?
>> Refer below
>>> |
>>> | If you have additional traces, please send them by.
>>> |
>>> | Is this a new driver? wondering how we didn't run into this?
>>>
>>> According to Herbert Xu and one of our own engineers, it's actually legal
>>> for Scatter/Gather Lists to have this. This isn't my area of expertise
>>> though so I'm just passing that on.
>>>
>>> I've asked our team to produce a detailed trace of the exact
>>> Scatter/Gather Lists they're seeing and what ends up coming out of the DMA
>>> Mappings, etc. They're in India, so I expect that they'll have this for you
>>> by tomorrow morning.
>> Below mentioned log was already there in 1st mail. Copied here for easy reference. Let me know if you need
>> additional traces.
>>
>> 1) IN esp_output() "__skb_to_sgvec()" convert skb frags to scatter gather list.
>> At that moment sg->offset was 4094.
>> 2) From esp_output control reaches to "crypto_authenc_encrypt()". Here in
>> "scatterwalk_ffwd()" sg->offset become 4110.
>> 3) Same sg list received by chelsio crypto driver(chcr). When chcr try to do
>> DMA mapping it starts giving DMA errors.
>>
>> Following error observed. first two prints are added for debugging in chcr.
>> Kernel version used to reproduce is 4.9.28 on x86_64 with Page size 4K.
>>
>> Sep 15 12:40:52 heptagon kernel: process_cipher req src ffff8803cb41f0a8
>> Sep 15 12:40:52 heptagon kernel: ========= issue hit offset:4110 =======
>> dma_addr f24b000e ==> DMA mapped address returned by dma_map_sg()
>>
>> Sep 15 12:40:52 heptagon kernel: DMAR: DRHD: handling fault status reg 2
>> Sep 15 12:40:52 heptagon kernel: DMAR: [DMA Write] Request device [02:00.4]
>> fault addr f24b0000 [fault reason 05] PTE Write access is not set
>>
>>> Casey
>
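
For anyone following the arithmetic, here is a rough userspace sketch of
the two calculations the patch changes, fed with sg->offset = 4110 from
the log above. It is not kernel code: map_before(), map_after() and
struct sg_map are hypothetical names made up for illustration, the IOVA
base 0xf24af000 is only inferred from the logged dma_addr (f24b000e minus
4110), and the physical address passed in is arbitrary.

```c
#include <assert.h>
#include <stdint.h>

/* 4K pages, matching the x86_64 setup in the report. */
#define PAGE_SHIFT 12
#define PAGE_SIZE  ((uint64_t)1 << PAGE_SHIFT)
#define PAGE_MASK  (~(PAGE_SIZE - 1))

/* Hypothetical container for the two values __domain_mapping() derives. */
struct sg_map {
	uint64_t dma_address; /* what the driver gets back from dma_map_sg() */
	uint64_t phys_base;   /* physical address the first PTE points at */
};

/* Pre-patch behaviour: the whole sg->offset is added to the IOVA and the
 * PTE points at the first page of sg_page(), even if offset >= PAGE_SIZE. */
static struct sg_map map_before(uint64_t iova_base, uint64_t page_phys,
				uint64_t sg_offset)
{
	struct sg_map m = { iova_base + sg_offset, page_phys };
	return m;
}

/* Patched behaviour: only the in-page part of the offset is added to the
 * IOVA; the whole-page part moves the physical base forward instead. */
static struct sg_map map_after(uint64_t iova_base, uint64_t page_phys,
			       uint64_t sg_offset)
{
	uint64_t off = sg_offset & ~PAGE_MASK; /* 4110 -> 14 */
	struct sg_map m = { iova_base + off, page_phys + sg_offset - off };

	/* Holds as long as page_phys itself is page aligned. */
	assert((m.phys_base & ~PAGE_MASK) == 0);
	return m;
}
```

With an assumed base of 0xf24af000, map_before() yields the logged
f24b000e, which lies in the page after the one that was allocated and
would explain the fault at f24b0000. map_after() keeps the address inside
the allocated page and points the PTE one page further into the buffer.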