On 04/04/2022 23.58, Rob Herring wrote:
> On Sat, Apr 02, 2022 at 09:07:17PM +0200, Arnd Bergmann wrote:
>> On Sat, Apr 2, 2022 at 2:38 PM Sven Peter <sven@xxxxxxxxxxxxx> wrote:
>>> On Mon, Mar 21, 2022, at 18:07, Arnd Bergmann wrote:
>>>> On Mon, Mar 21, 2022 at 5:50 PM Sven Peter <sven@xxxxxxxxxxxxx> wrote:
>>>>> The NVMe co-processor on the Apple M1 uses a DMA address filter called
>>>>> SART for some DMA transactions. This adds a simple driver used to
>>>>> configure the memory regions from which DMA transactions are allowed.
>>>>>
>>>>> Co-developed-by: Hector Martin <marcan@xxxxxxxxx>
>>>>> Signed-off-by: Hector Martin <marcan@xxxxxxxxx>
>>>>> Signed-off-by: Sven Peter <sven@xxxxxxxxxxxxx>
>>>>
>>>> Can you add some explanation about why this uses a custom interface
>>>> instead of hooking into the dma_map_ops?
>>>
>>> Sure.
>>> In a perfect world this would just be an IOMMU implementation, but since
>>> SART can't create any real IOVA space using pagetables it doesn't fit
>>> inside that subsystem.
>>>
>>> In a slightly less perfect world I could just implement dma_map_ops here,
>>> but that won't work either because not all DMA buffers of the NVMe
>>> device have to go through SART and those allocations happen
>>> inside the same device and would use the same dma_map_ops.
>>>
>>> The NVMe controller has two separate DMA filters:
>>>
>>>  - NVMMU, which must be set up for any command that uses PRPs and
>>>    ensures that the DMA transactions only touch the pages listed
>>>    inside the PRP structure. NVMMU itself is tightly coupled
>>>    to the NVMe controller: the list of allowed pages is configured
>>>    based on the command's tag id, and even commands that require no DMA
>>>    transactions must be listed inside NVMMU before they are started.
>>>  - SART, which must be set up for some shared memory buffers (e.g.
>>>    log messages from the NVMe firmware) and for some NVMe debug
>>>    commands that don't use PRPs.
>>>    SART is only loosely coupled to the NVMe controller and could
>>>    also be used together with other devices. It's also the only
>>>    thing that changed between M1 and M1 Pro/Max/Ultra, and that's
>>>    why I decided to separate it from the NVMe driver.
>>>
>>> I'll add this explanation to the commit message.
>>
>> Ok, thanks.
>>
>>>>> +static void sart2_get_entry(struct apple_sart *sart, int index, u8 *flags,
>>>>> +                            phys_addr_t *paddr, size_t *size)
>>>>> +{
>>>>> +        u32 cfg = readl_relaxed(sart->regs + APPLE_SART2_CONFIG(index));
>>>>> +        u32 paddr_ = readl_relaxed(sart->regs + APPLE_SART2_PADDR(index));
>>>>
>>>> Why do you use the _relaxed() accessors here and elsewhere in the driver?
>>>
>>> This device itself doesn't do any DMA transactions so it needs no memory
>>> synchronization barriers. Only the consumers (i.e. rtkit and nvme) read/write
>>> from/to these buffers (multiple times) and they have the required barriers
>>> in place whenever they are used.
>>>
>>> These buffers so far are only allocated at probe time though, so even using
>>> the normal writel/readl here won't hurt performance at all. I can just use
>>> those if you prefer or alternatively add a comment why _relaxed is fine here.
>>>
>>> This is a bit similar to the discussion for the pinctrl series last year [1].
>>
>> I think it's better to only use the _relaxed version where it actually helps,
>> with a comment about it, and use the normal version elsewhere, in
>> particular in functions that you have copied from the normal nvme driver.
>> I had tried to compare some of your code with the other version and
>> was rather confused by that.
>
> Oh good, I tell folks the opposite (and others do too). We don't accept
> random explicit barriers without explanation, but implicit ones are
> okay? The resulting code on arm32 is also pretty horrible with the L2x0
> and OMAP sync hooks, not that that matters here.
>
> I don't really care too much which way we go, but we should document one
> rule and follow that. I'm sure maz@ has an opinion here too :-)

(3... 2... 1... fight!)

-- 
Hector Martin (marcan@xxxxxxxxx)
Public Key: https://mrcn.st/pub
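
[Editor's illustration] For readers following the accessor debate above, here is a
minimal sketch of the difference between the ordered and _relaxed MMIO helpers.
The device, struct, register offsets and function names are made up for
illustration and are not taken from the SART patch:

    #include <linux/io.h>
    #include <linux/kernel.h>
    #include <linux/types.h>

    /* Hypothetical register offsets, for illustration only. */
    #define EXAMPLE_REG_DMA_ADDR	0x00
    #define EXAMPLE_REG_DOORBELL	0x04
    #define EXAMPLE_REG_VERSION	0x08

    struct example_dev {
    	void __iomem *regs;
    	dma_addr_t buf_dma;	/* DMA buffer the device will read */
    };

    static void example_kick_dma(struct example_dev *dev)
    {
    	/*
    	 * writel() orders prior CPU stores (e.g. filling the DMA buffer)
    	 * before the MMIO write, so the device cannot observe the
    	 * doorbell before the buffer contents are visible.
    	 */
    	writel(lower_32_bits(dev->buf_dma), dev->regs + EXAMPLE_REG_DMA_ADDR);
    	writel(1, dev->regs + EXAMPLE_REG_DOORBELL);
    }

    static u32 example_read_version(struct example_dev *dev)
    {
    	/*
    	 * No DMA depends on this read, so the _relaxed accessor (no
    	 * extra barrier) is sufficient; per the discussion above, such
    	 * uses are the ones worth a comment, with readl()/writel() as
    	 * the default elsewhere.
    	 */
    	return readl_relaxed(dev->regs + EXAMPLE_REG_VERSION);
    }

On arm64 the ordered accessors only add a barrier instruction; on arm32 they can
expand through the outer-cache sync hooks (the L2x0/OMAP case Rob mentions),
which is where the generated code gets noticeably heavier.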