On 9/26/2019 4:20 AM, Robin Murphy wrote:
> On 2019-09-26 11:44 am, Nicolas Saenz Julienne wrote:
>>>>>> Robin, have you looked into supporting multiple dma-ranges? It's
>>>>>> the next thing we need for BCM STB's PCIe. I'll have a go at it
>>>>>> myself if nothing is in the works already.
>>>>>
>>>>> Multiple dma-ranges as far as configuring inbound windows should work
>>>>> already other than the bug when there's any parent translation. But if
>>>>> you mean supporting multiple DMA offsets and masks per device in the
>>>>> DMA API, there's nothing in the works yet.
>>
>> Sorry, I meant supporting multiple DMA offsets[1]. I think I could
>> still make it with a single DMA mask though.
>
> The main problem for supporting that case in general is the disgusting
> carving up of the physical memory map you may have to do to guarantee
> that a single buffer allocation cannot ever span two windows with
> different offsets. I don't think we ever reached a conclusion on whether
> that was even achievable in practice.

It is with the Broadcom STB SoCs, which have between 1 and 3 memory
controllers depending on the SoC, and multiple dma-ranges cells for PCIe
as a consequence.

Each memory controller has a different physical address aperture in the
CPU's physical address map (e.g.: MEMC0 is 0x0 - 0x3fff_ffff, MEMC1
0x4000_0000 - 0x7fff_ffff and MEMC2 0x8000_0000 - 0xbfff_ffff, not
counting the extension regions above 4GB). The CPU is scheduled and
arbitrated the same way across all memory controllers (thus making it
virtually UMA, almost), but having a buffer span two memory controllers
would be problematic because the memory controllers do not know how to
guarantee the transaction ordering and buffer data consistency in both
DRAM itself and for other memory controller clients, like PCIe.

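To illustrate the constraint, here is a rough sketch of the "buffer must
not span two controllers" check. It is not code from any posted patch:
the table layout and the helper names memc_of() and buf_within_one_memc()
are made up for this example, and the apertures are just the example
values quoted above (the extension regions above 4GB are left out).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical aperture table; values are the examples from this mail. */
struct memc_aperture {
	uint64_t start;	/* first CPU physical address */
	uint64_t end;	/* last CPU physical address, inclusive */
};

static const struct memc_aperture memc_apertures[] = {
	{ 0x00000000ULL, 0x3fffffffULL },	/* MEMC0 */
	{ 0x40000000ULL, 0x7fffffffULL },	/* MEMC1 */
	{ 0x80000000ULL, 0xbfffffffULL },	/* MEMC2 */
};

#define NUM_MEMC (sizeof(memc_apertures) / sizeof(memc_apertures[0]))

/* Return the index of the controller owning addr, or -1 if none does. */
static int memc_of(uint64_t addr)
{
	for (size_t i = 0; i < NUM_MEMC; i++)
		if (addr >= memc_apertures[i].start &&
		    addr <= memc_apertures[i].end)
			return (int)i;
	return -1;
}

/* True if [addr, addr + size) lies entirely within one aperture. */
static bool buf_within_one_memc(uint64_t addr, uint64_t size)
{
	int first;

	/* Reject empty buffers and ranges that wrap around. */
	if (size == 0 || addr + size - 1 < addr)
		return false;

	first = memc_of(addr);
	return first >= 0 && first == memc_of(addr + size - 1);
}
```

With this table, a 4KB buffer at 0x3fff_f000 passes the check (it ends
exactly at the top of MEMC0), while an 8KB buffer at the same address
fails because its last byte lands in MEMC1.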

We historically had to reserve the last 4KB of each memory controller to
prevent problematic controllers like EHCI from prefetching beyond the end
of a memory controller's populated memory, and that also incidentally
takes care of never having a buffer cross a controller boundary. Either
you can allocate the entire buffer on a given memory controller, or you
cannot allocate memory at all on that zone/region and another one must be
found (or there is no more memory and there is a genuine OOM).

The way we reserve memory right now is based on the first patch submitted
by Jim:

https://lore.kernel.org/patchwork/patch/988469/

whereby we read the memory node's "reg" property and map the physical
addresses to the memory controller configuration read from the specific
registers in the CPU's Bus Interface Unit (where the memory controller
apertures are architecturally defined), and then use that to call
memblock_reserve() (not part of that patch, it should be though).
--
Florian