Hi,

Thanks for the review!

On Mon, Mar 21, 2022, at 18:07, Arnd Bergmann wrote:
> On Mon, Mar 21, 2022 at 5:50 PM Sven Peter <sven@xxxxxxxxxxxxx> wrote:
>>
>> The NVMe co-processor on the Apple M1 uses a DMA address filter called
>> SART for some DMA transactions. This adds a simple driver used to
>> configure the memory regions from which DMA transactions are allowed.
>>
>> Co-developed-by: Hector Martin <marcan@xxxxxxxxx>
>> Signed-off-by: Hector Martin <marcan@xxxxxxxxx>
>> Signed-off-by: Sven Peter <sven@xxxxxxxxxxxxx>
>
> Can you add some explanation about why this uses a custom interface
> instead of hooking into the dma_map_ops?

Sure.
In a perfect world this would just be an IOMMU implementation, but since
SART can't create any real IOVA space using pagetables it doesn't fit
inside that subsystem.

In a slightly less perfect world I could just implement dma_map_ops here,
but that won't work either: not all DMA buffers of the NVMe device have to
go through SART, and those allocations happen inside the same device and
would use the same dma_map_ops.

The NVMe controller has two separate DMA filters:

   - NVMMU, which must be set up for any command that uses PRPs and
     ensures that the DMA transactions only touch the pages listed
     inside the PRP structure. NVMMU itself is tightly coupled
     to the NVMe controller: the list of allowed pages is configured
     based on the command's tag ID, and even commands that require no
     DMA transactions must be listed inside NVMMU before they are
     started.
   - SART, which must be set up for some shared memory buffers (e.g.
     log messages from the NVMe firmware) and for some NVMe debug
     commands that don't use PRPs.
     SART is only loosely coupled to the NVMe controller and could
     also be used together with other devices. It's also the only
     thing that changed between M1 and M1 Pro/Max/Ultra, and that's
     why I decided to separate it from the NVMe driver.

I'll add this explanation to the commit message.
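
To make the "filter, not IOMMU" point a bit more concrete, here is a rough
userspace model of an allow-list filter. It is only a sketch: the names
(model_sart_*), the table size of 16 and the entry layout are assumptions
made up for illustration and are not taken from the driver or the hardware.
The point it shows is that a transfer is either accepted or rejected at its
physical address and nothing is ever translated, so there is no IOVA space
to hand out.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table size, chosen for illustration only. */
#define MODEL_SART_MAX_ENTRIES 16

/* One allowed physical range: [paddr, paddr + size). */
struct model_sart_entry {
	bool valid;
	uint64_t paddr;
	uint64_t size;
};

struct model_sart {
	struct model_sart_entry entries[MODEL_SART_MAX_ENTRIES];
};

/* Claim the first free slot for a new allowed region. */
static int model_sart_add_allowed_region(struct model_sart *sart,
					 uint64_t paddr, uint64_t size)
{
	/* Reject empty regions and regions whose end wraps around. */
	if (size == 0 || paddr + size < paddr)
		return -EINVAL;

	for (size_t i = 0; i < MODEL_SART_MAX_ENTRIES; i++) {
		struct model_sart_entry *e = &sart->entries[i];

		if (e->valid)
			continue;
		e->valid = true;
		e->paddr = paddr;
		e->size = size;
		return 0;
	}

	return -ENOMEM; /* no free slot left */
}

/* Drop a region that was added with exactly the same base and size. */
static int model_sart_remove_allowed_region(struct model_sart *sart,
					    uint64_t paddr, uint64_t size)
{
	for (size_t i = 0; i < MODEL_SART_MAX_ENTRIES; i++) {
		struct model_sart_entry *e = &sart->entries[i];

		if (e->valid && e->paddr == paddr && e->size == size) {
			e->valid = false;
			return 0;
		}
	}

	return -ENOENT;
}

/*
 * A transfer passes only if it lies entirely inside one valid entry.
 * The address is checked as-is; it is never remapped.
 */
static bool model_sart_dma_allowed(const struct model_sart *sart,
				   uint64_t addr, uint64_t len)
{
	if (len == 0 || addr + len < addr)
		return false;

	for (size_t i = 0; i < MODEL_SART_MAX_ENTRIES; i++) {
		const struct model_sart_entry *e = &sart->entries[i];

		if (e->valid && addr >= e->paddr &&
		    addr + len <= e->paddr + e->size)
			return true;
	}

	return false;
}
```

In this model a consumer would add a region for a buffer before handing it
to the co-processor and remove it again afterwards, while buffers covered
by the other filter never touch the table at all. That mix is what makes a
per-device dma_map_ops a poor fit.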
>
>> +static void sart2_get_entry(struct apple_sart *sart, int index, u8 *flags,
>> +			    phys_addr_t *paddr, size_t *size)
>> +{
>> +	u32 cfg = readl_relaxed(sart->regs + APPLE_SART2_CONFIG(index));
>> +	u32 paddr_ = readl_relaxed(sart->regs + APPLE_SART2_PADDR(index));
>
> Why do you use the _relaxed() accessors here and elsewhere in the driver?

This device itself doesn't do any DMA transactions, so it needs no memory
synchronization barriers. Only the consumers (i.e. rtkit and nvme) read
from and write to these buffers (multiple times), and they have the
required barriers in place whenever the buffers are used.

So far these buffers are only allocated at probe time though, so even
using the normal writel/readl here won't hurt performance at all. I can
just use those if you prefer, or alternatively add a comment explaining
why _relaxed is fine here.

This is a bit similar to the discussion for the pinctrl series last year [1].

>
>> +struct apple_sart *apple_sart_get(struct device *dev)
>> +{
>> +	struct device_node *sart_node;
>> +	struct platform_device *sart_pdev;
>> +	struct apple_sart *sart;
>> +
>> +	sart_node = of_parse_phandle(dev->of_node, "apple,sart", 0);
>> +	if (!sart_node)
>> +		return ERR_PTR(ENODEV);
>
> The error pointers need to take negative values, like 'ERR_PTR(-ENODEV)',
> here and everywhere else in the driver.

Ouch, that's my second bug of that kind in the past few days.
I'll fix it here and check the other patches in this series as well.


Thanks,


Sven

[1] https://lore.kernel.org/lkml/87sfz8zdzb.wl-maz@xxxxxxxxxx/
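
As a footnote on the ERR_PTR sign issue, here is a small userspace sketch
of why a positive errno slips through. The model_* helpers below are
stand-ins written for this example; they mimic the idea behind the
kernel's ERR_PTR()/PTR_ERR()/IS_ERR() (error codes live in the top
MAX_ERRNO = 4095 values of the address range) and are not the kernel's
actual definitions.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Same limit the kernel uses for encodable error codes. */
#define MODEL_MAX_ERRNO 4095

/* Encode an errno value in a pointer (stand-in for ERR_PTR()). */
static inline void *model_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* Recover the errno value from a pointer (stand-in for PTR_ERR()). */
static inline long model_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/*
 * Stand-in for IS_ERR(): only the last MODEL_MAX_ERRNO values of the
 * address space count as errors. A negative errno such as -ENODEV lands
 * in that window once converted to an unsigned address; a positive
 * ENODEV is a tiny address near zero and does not.
 */
static inline bool model_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)(-MODEL_MAX_ERRNO);
}
```

With these helpers, model_is_err(model_err_ptr(-ENODEV)) is true, while
model_is_err(model_err_ptr(ENODEV)) is false, so a caller checking the
result of apple_sart_get() with IS_ERR() would treat the bogus pointer as
valid and dereference it.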