Hi Shameer,
[ 414.700493] specified DMA range outside IOMMU capability <-- error here
[ 414.700496] Failed to set up IOMMU for device 0002:81:10.0; retaining platform DMA ops <-- error here
Looks like this triggers the start of the bug.
So the below check in iommu_dma_init_domain fails,
if (domain->geometry.force_aperture) {
if (base > domain->geometry.aperture_end ||
base + size <= domain->geometry.aperture_start) {
and everything else goes out of sync after that. Can you print out the base,
aperture_start and aperture_end values to see why the check fails?
dev_info(dev, "0x%llx 0x%llx, 0x%llx 0x%llx, 0x%llx 0x%llx\n", base, size, domain->geometry.aperture_start, domain->geometry.aperture_end, *dev->dma_mask, dev->coherent_dma_mask);
[ 183.752100] ixgbevf 0000:81:10.0: 0x0 0x100000000, 0x0 0xffffffffffff, 0xffffffff 0xffffffff
.....
[ 319.508037] vfio-pci 0000:81:10.0: 0x0 0x0, 0x0 0xffffffffffff, 0xffffffffffffffff 0xffffffffffffffff
Yes, size seems to be the problem here. When the VF device gets attached to vfio-pci,
dev->coherent_dma_mask is somehow set to 64 bits, so size becomes zero.
@@ -107,7 +107,7 @@ int of_dma_configure(struct device *dev, struct device_node *np)
ret = of_dma_get_range(np, &dma_addr, &paddr, &size);
if (ret < 0) {
dma_addr = offset = 0;
- size = dev->coherent_dma_mask + 1;
+ size = max(dev->coherent_dma_mask, dev->coherent_dma_mask + 1);
But without this series, size is still set to
dev->coherent_dma_mask + 1, so I am not sure how it works
fine in that case.
I remember I had this change in the V7 patchset, but later
dropped it because it is not relevant to this series. It
should still be sent separately to address the 64-bit
overflow.
@@ -1386,7 +1387,8 @@ int acpi_dma_configure(struct device *dev, enum dev_dma_attr attr)
* Assume dma valid range starts at 0 and covers the whole
* coherent_dma_mask.
*/
- arch_setup_dma_ops(dev, 0, dev->coherent_dma_mask + 1, iommu,
+ size = max(dev->coherent_dma_mask, dev->coherent_dma_mask + 1);
+ arch_setup_dma_ops(dev, 0, size, iommu,
attr == DEV_DMA_COHERENT);
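To make the overflow concrete, here is a minimal user-space sketch (not kernel code). The helper names are hypothetical and the range test is a simplified stand-in for the force_aperture check quoted earlier; it assumes size is a 64-bit unsigned value, as the logged values suggest.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: size as computed before the fix.
 * For a full 64-bit mask, mask + 1 wraps around to 0. */
static uint64_t size_from_mask_wrapping(uint64_t mask)
{
    return mask + 1;
}

/* Hypothetical helper: size as computed by the proposed fix,
 * i.e. max(mask, mask + 1). On wrap-around it falls back to mask. */
static uint64_t size_from_mask_clamped(uint64_t mask)
{
    uint64_t next = mask + 1;

    return mask > next ? mask : next;
}

/* Simplified stand-in for the force_aperture range test:
 * returns true when the range is rejected. */
static bool range_outside_aperture(uint64_t base, uint64_t size,
                                   uint64_t aperture_start,
                                   uint64_t aperture_end)
{
    return base > aperture_end || base + size <= aperture_start;
}
```

With the values from the logs above, a 32-bit mask gives size 0x100000000 and the range is accepted, while a 64-bit mask gives size 0, so base + size <= aperture_start holds (0 <= 0) and the range is rejected. With the clamped size the 64-bit case is accepted as well.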
With the above fixes, DT boot works fine. But we still get the below crash on ACPI:
[ 402.581445] kernel BUG at drivers/iommu/arm-smmu-v3.c:1064!
Looks like this happens when ste_live becomes true during
the initial attach, and later, without a detach, an attach
happens once again from vfio. I am wondering why
detach_dev is not happening only in the ACPI case. I do not
actually have any arm-smmu-v3 or ACPI based setup here.
Can I get some help in the form of logs from the arm-smmu-v3
driver?
[ 402.587007] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[ 402.592479] Modules linked in: vfio_iommu_type1 vfio_pci irqbypass
vfio_virqfd vfio ixgbevf ixgb
The change that this series does is trying to add the dma/iommu ops to the
device after the iommu is actually probed.
So in your working case, does the device initially get hooked to iommu_ops,
and does the same check above pass?
I believe so, because I didn't notice the "specified DMA range outside IOMMU capability"
message in the working case.
OK, as I said above, I am not sure why the overflow has no effect without
this series.
Regards,
Sricharan
Thanks,
Shameer
--
"QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a
member of Code Aurora Forum, hosted by The Linux Foundation