On Fri, Sep 23, 2022, at 2:56 PM, Linus Walleij wrote: > On Fri, Sep 23, 2022 at 1:58 PM Arnd Bergmann <arnd@xxxxxxxx> wrote: >> On Fri, Sep 23, 2022, at 1:38 PM, Linus Walleij wrote: > >> > A recent change affecting the behaviour of phys_to_dma() to >> > actually require the device tree ranges to work unmasked a >> > bug in the Integrator DMA ranges. >> > >> > The PL110 uses the CMA allocator to obtain coherent allocations >> > from a dedicated 1MB video memory, leading to the following >> > call chain: >> > >> > drm_gem_cma_create() >> > dma_alloc_attrs() >> > dma_alloc_from_dev_coherent() >> > __dma_alloc_from_coherent() >> > dma_get_device_base() >> > phys_to_dma() >> > translate_phys_to_dma() >> > >> > phys_to_dma() by way of translate_phys_to_dma() will nowadays not >> > provide 1:1 mappings unless the ranges are properly defined in >> > the device tree and reflected into the dev->dma_range_map. >> >> I don't understand this yet, what did the kernel previously >> do to that allowed the correct DMA mapping when a wrong >> address was set in the DT? > > Ah this goes way back. I think I need a different fixes tag. > I hadn't tested this platform in a while. > > What happens currently in translate_phys_to_dma() is that it returns > DMA_MAPPING_ERROR (0xffffffff) because of the buggy device tree > and that causes the problem. > > Before the dma direct rework we selected ARCH_HAS_PHYS_TO_DMA > and arch/arm/include/asm/dma-direct.h did this: > > static inline dma_addr_t pfn_to_dma(struct device *dev, unsigned long pfn) > { > if (dev) > pfn -= dev->dma_pfn_offset; > return (dma_addr_t)__pfn_to_bus(pfn); > } > (...) > static inline dma_addr_t __phys_to_dma(struct device *dev, phys_addr_t paddr) > { > unsigned int offset = paddr & ~PAGE_MASK; > return pfn_to_dma(dev, __phys_to_pfn(paddr)) + offset; > } > > As dma_pfn_offset was 0 this resulted in a 1:1 map. Ah, I see. So the offset was 0 because the dma-ranges did not actually point to RAM then, right? > Odd thing: this flag has no device tree bindings I could > find. Maybe it's in the really old DT docs? No idea. I do remember that in the old times, whether DMA was coherent or not was just a compile-time option, so ppc was usually assumed to be coherent while Arm was not, but then we needed something to identify machines that were different. > Here the special memory appears in the CPU address map. > > bus@c0000000 { > compatible = "arm,integrator-ap-lm"; > #address-cells = <1>; > #size-cells = <1>; > ranges = <0xc0000000 0xc0000000 0x40000000>; > dma-ranges; > (...) > lm0: bus@c0000000 { > compatible = "simple-bus"; > ranges = <0x00000000 0xc0000000 0x10000000>; > dma-ranges = <0x00000000 0xc0000000 0x10000000>; > reg = <0xc0000000 0x10000000>; > #address-cells = <1>; > #size-cells = <1>; > > display@1000000 { > compatible = "arm,pl110", "arm,primecell"; > reg = <0x01000000 0x1000>; > (...) > memory-region = <&impd1_ram>; > dma-ranges; > (...) > > Here we find the reg properly at physical address 0xc1000000 thanks to > the ranges: 0xc0000000 + 0x00000000 + 0x01000000 = 0xc1000000. > > But the special memory region is in the root of the device tree, > outside of the translation. > > Now dma-ranges will assume that when we translate the memory region > it is in the address space of the display controller, but it isn't, > because the phandle goes to something in the root of the device tree. > > 0xc2000000 + 0x00000000 + 0x00000000 = 0xc2000000 > > Whenever I do ranges my head start spinning :/ > > If you have a better way to accomodate the DMA ranges I am happy to > comply! I'm still not sure I understand it. Is that reserved memory area in RAM or on the logic bus? The 'ranges' property makes it appear that all of 0xc0000000-0xcfffffff is the logic bus, but then the vram itself is also in that address range, which would indicate that it is behind that bus. However, using the 'reserved-memory' property indicates that this is actually just normal RAM that is carved out from what the OS manages. Arnd