On 13.03.2025 15:12, Robin Murphy wrote:
> On 2025-03-13 1:06 pm, Robin Murphy wrote:
>> On 2025-03-13 12:23 pm, Marek Szyprowski wrote:
>>> On 13.03.2025 12:01, Robin Murphy wrote:
>>>> On 2025-03-13 9:56 am, Marek Szyprowski wrote:
>>>> [...]
>>>>> This patch landed in yesterday's linux-next as commit bcb81ac6ae3c
>>>>> ("iommu: Get DT/ACPI parsing into the proper probe path"). In my
>>>>> tests I found it breaks booting of ARM64 RK3568-based Odroid-M1 board
>>>>> (arch/arm64/boot/dts/rockchip/rk3568-odroid-m1.dts). Here is the
>>>>> relevant kernel log:
>>>>
>>>> ...and the bug-flushing-out begins!
>>>>
>>>>> Unable to handle kernel NULL pointer dereference at virtual address
>>>>> 00000000000003e8
>>>>> Mem abort info:
>>>>> ESR = 0x0000000096000004
>>>>> EC = 0x25: DABT (current EL), IL = 32 bits
>>>>> SET = 0, FnV = 0
>>>>> EA = 0, S1PTW = 0
>>>>> FSC = 0x04: level 0 translation fault
>>>>> Data abort info:
>>>>> ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
>>>>> CM = 0, WnR = 0, TnD = 0, TagAccess = 0
>>>>> GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
>>>>> [00000000000003e8] user address but active_mm is swapper
>>>>> Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
>>>>> Modules linked in:
>>>>> CPU: 3 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.14.0-rc3+ #15533
>>>>> Hardware name: Hardkernel ODROID-M1 (DT)
>>>>> pstate: 00400009 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>>>>> pc : devm_kmalloc+0x2c/0x114
>>>>> lr : rk_iommu_of_xlate+0x30/0x90
>>>>> ...
>>>>> Call trace:
>>>>> devm_kmalloc+0x2c/0x114 (P)
>>>>> rk_iommu_of_xlate+0x30/0x90
>>>>
>>>> Yeah, looks like this is doing something a bit questionable which
>>>> can't work properly. TBH the whole dma_dev thing could probably be
>>>> cleaned up now that we have proper instances, but for now does this
>>>> work?
>>>
>>> Yes, this patch fixes the problem I've observed.
>>>
>>> Reported-by: Marek Szyprowski <m.szyprowski@xxxxxxxxxxx>
>>> Tested-by: Marek Szyprowski <m.szyprowski@xxxxxxxxxxx>
>>>
>>> BTW, this dma_dev idea has been borrowed from my exynos_iommu driver
>>> and I doubt it can be cleaned up.
>>
>> On the contrary I suspect they both can - it all dates back to when
>> we had the single global platform bus iommu_ops and the SoC drivers
>> were forced to bodge their own notion of multiple instances, but with
>> the modern core code, ops are always called via a valid IOMMU
>> instance or domain, so in principle it should always be possible to
>> get at an appropriate IOMMU device now. IIRC it was mostly about
>> allocating and DMA-mapping the pagetables in domain_alloc, where the
>> private notion of instances didn't have enough information, but
>> domain_alloc_paging solves that.
>
> Bah, in fact I think I am going to have to do that now, since although
> it doesn't crash, rk_domain_alloc_paging() will also be failing for
> the same reason. Time to find a PSU for the RK3399 board, I guess...
>
> (Or maybe just move the dma_dev assignment earlier to match Exynos?)

Well, I just found that Exynos IOMMU is also broken on some of my test
boards. It looks like the runtime PM links are somehow not correctly
established. I will try to analyze this later in the afternoon.

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland