Re: [PATCH V3 2/3] PCI: rcar: Do not abort on too many inbound dma-ranges

Marek Vasut <marek.vasut@xxxxxxxxx> · Fri, 18 Oct 2019 20:44:26 +0200

On 10/18/19 7:35 PM, Robin Murphy wrote:
> On 18/10/2019 17:44, Marek Vasut wrote:
>> On 10/18/19 5:44 PM, Robin Murphy wrote:
>>> On 18/10/2019 15:26, Marek Vasut wrote:
>>>> On 10/18/19 2:53 PM, Robin Murphy wrote:
>>>>> On 18/10/2019 13:22, Marek Vasut wrote:
>>>>>> On 10/18/19 11:53 AM, Lorenzo Pieralisi wrote:
>>>>>>> On Thu, Oct 17, 2019 at 05:01:26PM +0200, Marek Vasut wrote:
>>>>>>>
>>>>>>> [...]
>>>>>>>
>>>>>>>>> Again, just handling the first N dma-ranges entries and
>>>>>>>>> ignoring the
>>>>>>>>> rest is not 'configure the controller correctly'.
>>>>>>>>
>>>>>>>> It's the best effort thing to do. It's well possible the next
>>>>>>>> generation
>>>>>>>> of the controller will have more windows, so could accommodate the
>>>>>>>> whole
>>>>>>>> list of ranges.
>>>>>
>>>>> In the context of DT describing the platform that doesn't make any
>>>>> sense. It's like saying it's fine for U-Boot to also describe a
>>>>> bunch of
>>>>> non-existent CPUs just because future SoCs might have them. Just
>>>>> because
>>>>> the system would probably still boot doesn't mean it's right.
>>>>
>>>> It's the exact opposite of what you just described -- the last release
>>>> of U-Boot currently populates a subset of the DMA ranges, not a
>>>> superset. The dma-ranges in the Linux DT currently are a superset of
>>>> available DRAM on the platform.
>>>
>>> I'm not talking about the overall coverage of addresses - I've already
>>> made clear what I think about that - I'm talking about the *number* of
>>> individual entries. If the DT binding defines that dma-ranges entries
>>> directly represent bridge windows, then the bootloader for a given
>>> platform should never generate more entries than that platform has
>>> actual windows, because to do otherwise would be bogus.
>>
>> I have a feeling that's not how Rob defined the dma-ranges in this
>> discussion though.
>>
>>>>>>>> Thinking about this further, this patch should be OK either way, if
>>>>>>>> there is a DT which defines more DMA ranges than the controller can
>>>>>>>> handle, handling some is better than failing outright -- a PCI
>>>>>>>> which
>>>>>>>> works with a subset of memory is better than PCI that does not work
>>>>>>>> at all.
>>>>>>>
>>>>>>> OK to sum it up, this patch is there to deal with u-boot adding
>>>>>>> multiple
>>>>>>> dma-ranges to DT.
>>>>>>
>>>>>> Yes, this patch was posted over two months ago, about the same time
>>>>>> this
>>>>>> functionality was posted for inclusion in U-Boot. It made it into
>>>>>> recent
>>>>>> U-Boot release, but there was no feedback on the Linux patch until
>>>>>> recently.
>>>>>>
>>>>>> U-Boot can be changed for the next release, assuming we agree on
>>>>>> how it
>>>>>> should behave.
>>>>>>
>>>>>>> I still do not understand the benefit given that for
>>>>>>> DMA masks they are useless as Rob pointed out and ditto for inbound
>>>>>>> windows programming (given that AFAICS the PCI controller filters
>>>>>>> out
>>>>>>> any transaction that does not fall within its inbound windows by
>>>>>>> default
>>>>>>> so adding dma-ranges has the net effect of widening the DMA'able
>>>>>>> address
>>>>>>> space rather than limiting it).
>>>>>>>
>>>>>>> In short, what's the benefit of adding more dma-ranges regions to
>>>>>>> the
>>>>>>> DT (and consequently handling them in the kernel) ?
>>>>>>
>>>>>> The benefit is programming the controller inbound windows correctly.
>>>>>> But if there is a better way to do that, I am open to implement that.
>>>>>> Are there any suggestions / examples of that ?
>>>>>
>>>>> The crucial thing is that once we improve the existing "dma-ranges"
>>>>> handling in the DMA layer such that it *does* consider multiple
>>>>> entries
>>>>> properly, platforms presenting ranges which don't actually exist will
>>>>> almost certainly start going wrong, and are either going to have to
>>>>> fix
>>>>> their broken bootloaders or try to make a case for platform-specific
>>>>> workarounds in core code.
>>>> Again, this is exactly the other way around, the dma-ranges
>>>> populated by
>>>> U-Boot cover only existing DRAM. The single dma-range in Linux DT
>>>> covers
>>>> even the holes without existing DRAM.
>>>>
>>>> So even if the Linux dma-ranges handling changes, there should be no
>>>> problem.
>>>
>>> Say you have a single hardware window, and this DT property (1-cell
>>> numbers for simplicity:
>>>
>>>      dma-ranges = <0x00000000 0x00000000 0x80000000>;
>>>
>>> Driver reads one entry and programs the window to 2GB@0, DMA setup
>>> parses the first entry and sets device masks to 0x7fffffff, and
>>> everything's fine.
>>>
>>> Now say we describe the exact same address range this way instead:
>>>
>>>      dma-ranges = <0x00000000 0x00000000 0x40000000,
>>>                0x40000000 0x40000000 0x40000000>;
>>>
>>> Driver reads one entry and programs the window to 1GB@0, DMA setup
>>> parses the first entry and sets device masks to 0x3fffffff, and *today*,
>>> things are suboptimal but happen to work.
>>>
>>> Now say we finally get round to fixing the of_dma code to properly
>>> generate DMA masks that actually include all usable address bits, a user
>>> upgrades their kernel package, and reboots with that same DT...
>>>
>>> Driver reads one entry and programs the window to 1GB@0, DMA setup
>>> parses all entries and sets device masks to 0x7fffffff, devices start
>>> randomly failing or throwing DMA errors half the time, angry user looks
>>> at the changelog to find that somebody decided their now-corrupted
>>> filesystem is less important than the fact that hey, at least the
>>> machine didn't refuse to boot because the DT was obviously wrong. Are
>>> you sure that shouldn't be a problem?
>>
>> I think you picked a rather special case here and arrived as a DMA mask
>> which just fails in this special case. Such special case doesn't happen
>> here, and even if it did, I would expect Linux to merge those two ranges
>> or do something sane ? If the DMA mask is set incorrectly, that's a bug
>> of the DMA code I would think.
> 
> The mask is not set incorrectly - DMA masks represent the number of
> address bits the device (or intermediate interconnect in the case of the
> bus mask) is capable of driving. Thus when DMA is limited to a specific
> address range, the masks should be wide enough to cover the topmost
> address of that range (unless the device's own capability is inherently
> narrower).

Then the mask should be 0x7fffffff in both cases I'd say.

>> What DMA mask would you get if those two entries had a gap inbetween
>> them ? E.g.:
>>
>>   dma-ranges = <0x00000000 0x00000000 0x20000000,
>>                 0x40000000 0x40000000 0x20000000>;
> 
> OK, here's an real non-simplified example

I would really like an answer to the simple example above before we
start inventing convoluted ones.

> (note that these windows are fixed and not programmed by Linux):
> 
>     dma-ranges = <0x02000000 0x0 0x2c1c0000 0x0 0x2c1c0000 0x0 0x00040000>,
>                  <0x02000000 0x0 0x80000000 0x0 0x80000000 0x0 0x80000000>,
>                  <0x43000000 0x8 0x80000000 0x8 0x80000000 0x2 0x00000000>;
> 
> The DMA masks for the devices behind this bridge *should* be 35 bits,
> because that's the size of the largest usable address. Currently,
> however, because of the of_dma code's deficiency they would end up being
> an utterly useless 30 bits, which isn't even enough to reach the start
> of RAM. Thus I can't actually have this property in my DT, and as a
> result I can't enable the IOMMU, because *that* also needs to know the
> ranges in order to reserve the unusable gaps between the windows once
> address translation is in play.

How is this related to this particular patch ? This looks more like some
DMA internal topic.

>>> Now, if you want to read the DT binding as less strict and let it just
>>> describe some arbitrarily-complex set of address ranges that should be
>>> valid for DMA, that's not insurmountable; you just need more complex
>>> logic in your driver capable of calculating how best to cover *all*
>>> those ranges using the available number of windows.
>>
>> That's what the driver does with this patchset, except it's not possible
>> to cover all those ranges. It covers them as well as it can.
> 
> Which means by definition it *doesn't* do what I suggested there...

But then it didn't do this before either, or ... ?

-- 
Best regards,
Marek Vasut