Re: [PATCH 1/5] drm/vmwgfx: unbind in vmw_ttm_unpopulate

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Am 26.07.21 um 22:03 schrieb Dave Airlie:
On Tue, 27 Jul 2021 at 05:35, Christian König
<ckoenig.leichtzumerken@xxxxxxxxx> wrote:
Am 26.07.21 um 02:03 schrieb Dave Airlie:
[SNIP]
But you know, normal human: Only equipped with one head and two hands
and not cloneable.
I'm the same, but I'm not seeing where this problem happens at all, do
we have any backtraces or demos for this?
I've stumbled over this while working on some patches which accidentally
broke delayed delete and caused random memory corruption and was
wondering how that even happened in the first place.

Having stale PTEs in the GART isn't a problem unless you break other
things. Which is also the reason why I haven't added a CC stable yet.

I split bind/unbind into the driver, but the driver should still
always be moving things to unbound states before an unpopulate is
called, is there a driver doing something unexpected here?
Possible, I was only testing with amdgpu and the GART handling is rather
special there (which was one of the reasons why we did that in the first
place).

at worst I'd like to see a WARN_ON put in first and a test in igt that
triggers it, because right now I'm not see that path through the
drivers/ttm that leads to unpopulated pages ending up happening while
bound.

  From 5.14-rc3 unpopulate is called from ttm_bo_pipeline_gutting in
non-ghost path and there is no unbind,
pipeline gutting is called from evict/validate, when there is no
placement suggested for the object, is this case getting hit somewhere
without the driver having previously unbound things?
Yes, that will hit absolutely. Most likely with VM page tables for
amdgpu which uses this.
My thing is here we moved binding/unbinding to the driver, if the
driver has a bug
I'd expect the fix to be in the driver side here. I think giving
drivers control over something
and enforcing it in the core/helpers is fine, but I don't think we
should be adding back
the midlayering.

Ok, then we are pretty much already on the same page here.

I've just reverted the patch because I thought it would be cleaner for eventually backporting it.




ttm_tt_destroy_common: calls unpopulate, everyone calls this directly
after unbinding
ttm_tt_swapout: calls unpopulate, we don't swapout objects from TT
directly, we should always unbind first, this used to have an assert
against that,
ttm_tt_populate: call unpopulate in failure path
unbinding was moved into the driver, so ttm_tt_swapout() won't unbind
anything before calling unpopulate as far as I can see.
But we won't call swapout on BOs that aren't in SYSTEM and to be in SYSTEM,
the bo would have to go through the driver move function which will unbind it.

Ah, good point.


So any place I can see unpopulate getting called with a bound TT
should be a bug, and fixed, we could protect against it better but I'm
not seeing the need for this series to outright revert things back as
helping.
I'm not reverting this because it is offhand wrong, but making sure the
GART is clear before unpopulating the TT object sounds like the much
more natural thing to do and the state machine is something I certainly
don't see in the backend.

I don't think that's the core's responsibility anymore, I'm fine with
adding code and
even an flag that says if the the TT is bound/unbound that unpopulate
can WARN_ON()
against so we get a backtrace and track down where something is
getting unpopulated
without going through the driver move function to be unbound.

I was not talking about bound/unbound, but rather unpopulating before destroying.

The requirement we have is that the tt object is unpopulated before it is destroyed and the state machine of that object is essentially controlled by its BO and not the object itself.

I will prepare a patch making that cleaner.

Thanks,
Christian.


Dave.




[Index of Archives]     [Linux DRI Users]     [Linux Intel Graphics]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux