Re: [PATCH] ACPI/IORT: Do not blindly trust DMA masks from firmware

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Robin,

On Thu, Jan 21, 2021 at 08:08:42PM +0000, Robin Murphy wrote:
> On 2021-01-21 19:16, Moritz Fischer wrote:
> > Address issue observed on real world system with suboptimal IORT table
> > where DMA masks of PCI devices would get set to 0 as result.
> > 
> > iort_dma_setup() would query the root complex' IORT entry for a DMA
> > mask, and use that over the one the device has been configured with
> > earlier.
> > 
> > Ideally we want to use the minimum mask of what the IORT contains for
> > the root complex and what the device was configured with, but never 0.
> > 
> > Fixes: 5ac65e8c8941 ("ACPI/IORT: Support address size limit for root complexes")
> > Signed-off-by: Moritz Fischer <mdf@xxxxxxxxxx>
> > ---
> > Hi all,
> > 
> > not sure I'm doing this right, but I think the current behavior (while a
> > corner case) seems to also fail for 32 bit devices if the IORT specifies
> > 64 bit. It works on my test system now with a 32 bit device.
> 
> I suppose it could go wrong if it's an old driver that doesn't explicitly
> set its own masks and assumes they will always be 32-bit. Technically we'd
> consider that the driver's fault these days, but there's a lot of legacy
> around still.

Huh, ok :) That's news to me. On my system I had three devices running
into this, so yeah I think it's quite common.

If that's the official stance I can send patches for the drivers in
question :)

> 
> > Open to suggestions for better solutions (and maybe the
> > nc_dma_get_range() should have the same sanity check?)
> 
> Honestly the more I come back to this, the more I think we should give up
> trying to be clever and just leave the default masks alone beyond the
> initial "is anything set up at all?" sanity checks. Setting the bus limit is
> what really matters these days, and should be sufficient to encode any
> genuine restriction. There's certainly no real need to widen the default
> masks above 32 bits just because firmware suggests so, since the driver
> should definitely be calling dma_set_mask() and friends later if it's
> >32-bit capable anyway.
> 
> > Thanks,
> > Moritz
> > 
> > ---
> >   drivers/acpi/arm64/iort.c | 11 ++++++++---
> >   1 file changed, 8 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
> > index d4eac6d7e9fb..c48eabf8c121 100644
> > --- a/drivers/acpi/arm64/iort.c
> > +++ b/drivers/acpi/arm64/iort.c
> > @@ -1126,6 +1126,11 @@ static int rc_dma_get_range(struct device *dev, u64 *size)
> >   	rc = (struct acpi_iort_root_complex *)node->node_data;
> > +	if (!rc->memory_address_limit) {
> > +		dev_warn(dev, "Root complex has broken memory_address_limit\n");
> 
> Probably warrants a FW_BUG in there.
> 
> > +		return -EINVAL;
> > +	}
> > +
> >   	*size = rc->memory_address_limit >= 64 ? U64_MAX :
> >   			1ULL<<rc->memory_address_limit;
> > @@ -1172,9 +1177,9 @@ void iort_dma_setup(struct device *dev, u64 *dma_addr, u64 *dma_size)
> >   		 */
> >   		end = dmaaddr + size - 1;
> >   		mask = DMA_BIT_MASK(ilog2(end) + 1);
> > -		dev->bus_dma_limit = end;
> > -		dev->coherent_dma_mask = mask;
> > -		*dev->dma_mask = mask;
> > +		dev->bus_dma_limit = min_not_zero(dev->bus_dma_limit, end);
> 
> This doesn't need to change, since the default bus limit is 0 anyway (and
> that means "no limit").
Ok, I'll drop this.
> 
> > +		dev->coherent_dma_mask = min_not_zero(dev->coherent_dma_mask, mask);
> > +		*dev->dma_mask = min_not_zero(*dev->dma_mask, mask);

I'll keep those two?
> AFAICS the only way an empty mask could get here now is from
> nc_dma_get_range(), so I'd rather see a consistent warning there than just
> silently start working around that too.

In my case the empty mask came from the pci dev branch returning a size
of 1. (1 << 0).

I'll replace the dev_warn() with a pr_warn(FW_BUG ...) for both
{nc,rc}_dma_get_range() cases then?

> 
> Of course IORT doesn't say these fields are optional (other than the lack of
> a root complex limit in older table versions), so we're giving bad firmware
> a pass to never be fixed, ho hum...

I think if we yell loud enough (like FW_BUG) that'll get people's
attention?

Thanks for the quick reply
- Moritz



[Index of Archives]     [Linux IBM ACPI]     [Linux Power Management]     [Linux Kernel]     [Linux Laptop]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Video 4 Linux]     [Device Mapper]     [Linux Resources]

  Powered by Linux