Re: Adreno crash on i.MX53 running 5.3-rc6

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 05/09/2019 20:05, Rob Clark wrote:
On Thu, Sep 5, 2019 at 10:03 AM Rob Clark <robdclark@xxxxxxxxx> wrote:

On Wed, Sep 4, 2019 at 11:06 AM Robin Murphy <robin.murphy@xxxxxxx> wrote:

On 04/09/2019 01:12, Rob Clark wrote:
On Tue, Sep 3, 2019 at 12:31 PM Fabio Estevam <festevam@xxxxxxxxx> wrote:

Hi Jonathan,

On Tue, Sep 3, 2019 at 4:25 PM Jonathan Marek <jonathan@xxxxxxxx> wrote:

Hi,

I tried this and it works with patches 4+5 from Rob's series and
changing gpummu to use sg_phys(sg) instead of sg->dma_address
(dma_address isn't set now that dma_map_sg isn't used).

Thanks for testing it. I haven't had a chance to test it yet.

Rob,

I assume your series is targeted to 5.4, correct?

maybe, although Christoph Hellwig didn't seem like a big fan of
exposing cache ops, and would rather add a new allocation API for
uncached pages.. so I'm not entirely sure what the way forward will
be.

TBH, the use of map/unmap looked reasonable in the context of
"start/stop using these pages for stuff which may include DMA", so even
if it was cheekily ignoring sg->dma_address I'm not sure I'd really
consider it "abuse" - in comparison, using sync without a prior map
unquestionably violates the API, and means that CONFIG_DMA_API_DEBUG
will be rendered useless with false positives if this driver is active
while trying to debug something else.

The warning referenced in 0036bc73ccbe represents something being
unmapped which didn't match a corresponding map - from what I can make
of get_pages()/put_pages() it looks like that would need msm_obj->flags
or msm_obj->sgt to change during the lifetime of the object, neither of
which sounds like a thing that should legitimately happen. Are you sure
this isn't all just hiding a subtle bug elsewhere? After all, if what
was being unmapped wasn't right, who says that what's now being synced is?


Correct, msm_obj->flags/sgt should not change.

I reverted the various patches, and went back to the original setup
that used dma_{map,unmap}_sg() to reproduce the original issue that
prompted the change in the first place.  It is a pretty massive flood
of splats, which pretty quickly overflowed the dmesg ring buffer, so I
might be missing some things, but I'll poke around some more.

The one thing I wonder about, what would happen if the buffer is
allocated and dma_map_sg() called before drm/msm attaches it's own
iommu_domains, and then dma_unmap_sg() afterwards.  We aren't actually
ever using the iommu domain that DMA API is creating for the device,
so all the extra iommu_map/unmap (and tlb flush) is at best
unnecessary.  But I'm not sure if it could be having some unintended
side effects that cause this sort of problem.

Right, one of the semi-intentional side-effects of 43c5bf11a610 is that iommu-dma no longer interferes with unmanaged domains - it will still go and make its own redundant mappings in the unattached default domain, but as long as the DMA API usage is fundamentally sound then it shouldn't actually get in the way.
it seems like every time (or at least every time we splat), we end up
w/ iova=fffffffffffff000 .. which doesn't sound likely to be right.
Although from just looking at the dma-iommu.c code, I'm not sure how
this happens.  And adding some printk's results in enough traces that
I can't boot for some reason..

Yeah, that's a bogus IOVA for sure, so regardless of how we actually make Adreno happy it would still be interesting to figure out how it came about. Do you see any WARNs from io-pgtable-arm before the one from __iommu_dma_unmap()?

Robin.
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel




[Index of Archives]     [Linux DRI Users]     [Linux Intel Graphics]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux