Re: IOMMU breakage on arm64

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Geert,

On 2018-11-06 7:44 pm, Geert Uytterhoeven wrote:
Hi Christoph et al,

On Tue, Oct 23, 2018 at 1:40 AM Linux Kernel Mailing List
<linux-kernel@xxxxxxxxxxxxxxx> wrote:
Commit:     b4ebe6063204da58e48600b810a97c29ae9e5d12
Parent:     7d21ee4c719f00896767ce19c4c01a56374c2ced
Refname:    refs/heads/master
Web:        https://git.kernel.org/torvalds/c/b4ebe6063204da58e48600b810a97c29ae9e5d12
Author:     Christoph Hellwig <hch@xxxxxx>
AuthorDate: Thu Sep 20 14:04:08 2018 +0200
Committer:  Christoph Hellwig <hch@xxxxxx>
CommitDate: Mon Oct 1 07:28:03 2018 -0700

     dma-direct: implement complete bus_dma_mask handling

     Instead of rejecting devices with a too small bus_dma_mask we can handle
     by taking the bus dma_mask into account for allocations and bounce
     buffering decisions.

     Signed-off-by: Christoph Hellwig <hch@xxxxxx>

I  have bisected the following crash to the above commit:

I think that commit mostly just changes the presentation of my underlying cockup - see here for what should fix it: https://patchwork.kernel.org/patch/10670177/

I have a feeling we've ironed out crash-on-early-domain-free bugs in the SMMU drivers already - arm-smmu certainly has an early return in arm_smmu_destroy_domain_context() which should behave exactly like your diff below, while I think arm-smmu-v3 gets away with it by virtue of smmu_domain->cfg being unset, but I'll double-check that when I'm fresh tomorrow (Jean-Philippe reported SMMUv3 hitting the DMA thing to me internally, but didn't mention any crash).

Thanks for the report,
Robin.

     ipmmu-vmsa e67b0000.mmu: Cannot accommodate DMA translation for
IOMMU page tables
     sata_rcar ee300000.sata: Unable to initialize IPMMU context
     iommu: Failed to add device ee300000.sata to group 0: -22
     Unable to handle kernel NULL pointer dereference at virtual
address 0000000000000038
     Mem abort info:
       ESR = 0x96000004
       Exception class = DABT (current EL), IL = 32 bits
       SET = 0, FnV = 0
       EA = 0, S1PTW = 0
     Data abort info:
       ISV = 0, ISS = 0x00000004
       CM = 0, WnR = 0
     [0000000000000038] user address but active_mm is swapper
     Internal error: Oops: 96000004 [#1] PREEMPT SMP
     CPU: 2 PID: 51 Comm: kworker/2:1 Not tainted
4.20.0-rc1-arm64-renesas-dirty #74
     Hardware name: Renesas Salvator-X 2nd version board based on
r8a7795 ES2.0+ (DT)
     Workqueue: events deferred_probe_work_func
     pstate: 60000005 (nZCv daif -PAN -UAO)
     pc : ipmmu_domain_free+0x1c/0xa0
     lr : ipmmu_domain_free+0x14/0xa0
     sp : ffff000009c9b990
     x29: ffff000009c9b990 x28: 0000000000000000
     x27: ffff000008dff000 x26: 0000000000000000
     x25: ffff000008dd9000 x24: 0000000000000014
     x23: ffff8006fffdbb20 x22: ffff000008dff000
     x21: ffff8006f898e680 x20: ffff8006f8fa6c00
     x19: ffff8006f8fa6c08 x18: 0000000000000037
     x17: 0000000000000020 x16: ffff000008a8f780
     x15: ffff8006fb096f10 x14: ffff8006f93731c8
     x13: 0000000000000000 x12: ffff8006fb096f10
     x11: ffff8006f9372fe8 x10: ffff000008dff708
     x9 : ffff8006fb096f50 x8 : ffff000008dff708
     x7 : ffff0000089f1858 x6 : 0000000000000000
     x5 : 0000000000000000 x4 : ffff8006f89a3000
     x3 : 00008006f7131000 x2 : ffff8006fb2b5700
     x1 : 0000000000000000 x0 : 0000000000000028
     Process kworker/2:1 (pid: 51, stack limit = 0x(____ptrval____))
     Call trace:
      ipmmu_domain_free+0x1c/0xa0
      iommu_group_release+0x48/0x68
      kobject_put+0x74/0xe8
      kobject_del.part.0+0x3c/0x50
      kobject_put+0x60/0xe8
      iommu_group_get_for_dev+0xa8/0x1f0
      ipmmu_add_device+0x1c/0x40
      of_iommu_configure+0x118/0x190
      of_dma_configure+0xcc/0x1f0
      platform_dma_configure+0x18/0x28
      really_probe+0x94/0x2a8
      driver_probe_device+0x58/0x100
      __device_attach_driver+0x90/0xd0
      bus_for_each_drv+0x64/0xc8
      __device_attach+0xd8/0x130
      device_initial_probe+0x10/0x18
      bus_probe_device+0x98/0xa0
      deferred_probe_work_func+0x74/0xb0
      process_one_work+0x294/0x6f0
      worker_thread+0x238/0x460
      kthread+0x120/0x128
      ret_from_fork+0x10/0x1c
     Code: aa0003f3 97ffeb1a d1002274 f85f8261 (f9401c20)
     ---[ end trace 4c46c7fd7cd07245 ]---

Reproducing on v4.20-rc1 requires CONFIG_IPMMU_VMSA=y, and whitelisting
SATA for IOMMU use (https://patchwork.kernel.org/patch/10544059/).
For older versions you also have to backport commit 3a0832d093693ede ("arm64:
dts: renesas: salvator-xs: enable SATA").

Actually there are two issues at hand:

1) drivers/iommu/io-pgtable-arm.c:__arm_lpae_alloc_pages() allocates
    memory (above 4 GiB) and maps it for DMA use, but it is rejected due
    to:

        dma_map_single(dev, pages, size, DMA_TO_DEVICE) != virt_to_phys(pages)

--- a/include/linux/dma-direct.h
+++ b/include/linux/dma-direct.h
@@ -27,7 +27,8 @@ static inline bool dma_capable(struct device *dev, dma_addr_t addr, size_t size)
         if (!dev->dma_mask)
                 return false;

-       return addr + size - 1 <= *dev->dma_mask;
+       return addr + size - 1 <=
+               min_not_zero(*dev->dma_mask, dev->bus_dma_mask);

        *dev->dma_mask = 0xffffffffff (40-bit)
        dev->bus_dma_mask = 0xffffffff (32-bit)

    Hence before, we had (in __arm_lpae_alloc_pages()):

        arm-lpae io-pgtable: pages = ffff8006f88f3000
        arm-lpae io-pgtable: dma = 0x00000007388f3000
        arm-lpae io-pgtable: virt_to_phys(pages) = 0x00000007388f3000

    After this change, we have:

        arm-lpae io-pgtable: pages = ffff8006f882b000
        arm-lpae io-pgtable: dma = 0x0000000074009000
        arm-lpae io-pgtable: virt_to_phys(pages) = 0x000000073882b000

    And SATA runs without using the IOMMU.

2) The Renesas IPMMU driver doesn't handle the above failure well,
    leading to a NULL pointer dereference.
    This can be fixed using the (gmail-whitespace-damaged) patch below:

--- a/drivers/iommu/ipmmu-vmsa.c
+++ b/drivers/iommu/ipmmu-vmsa.c
@@ -635,7 +635,8 @@ static void ipmmu_domain_free(struct iommu_domain
*io_domain)
          * been detached.
          */
         iommu_put_dma_cookie(io_domain);
-       ipmmu_domain_destroy_context(domain);
+       if (domain->mmu)
+               ipmmu_domain_destroy_context(domain);
         free_io_pgtable_ops(domain->iop);
         kfree(domain);
  }

    I expect drivers/iommu/arm-smmu-v3.c and drivers/iommu/arm-smmu.c
    need similar fixes.
    I didn't check all drivers, but e.g. drivers/iommu/amd_iommu.c has
a similar check.

Does the IOMMU work on other arm64 platforms in v4.20-rc1?

Thanks for your comments!

Gr{oetje,eeting}s,

                         Geert




[Index of Archives]     [Linux Samsung SOC]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Samba]     [Device Mapper]

  Powered by Linux