Re: [PATCH 1/6] drm/ttm: move the LRU into resource handling v4

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Am 23.03.22 um 14:36 schrieb Daniel Vetter:
On Wed, 23 Mar 2022 at 13:20, Christian König
<ckoenig.leichtzumerken@xxxxxxxxx> wrote:
Am 23.03.22 um 12:59 schrieb Daniel Vetter:
On Mon, Mar 21, 2022 at 02:25:56PM +0100, Christian König wrote:
This way we finally fix the problem that new resource are
not immediately evict-able after allocation.

That has caused numerous problems including OOM on GDS handling
and not being able to use TTM as general resource manager.

v2: stop assuming in ttm_resource_fini that res->bo is still valid.
v3: cleanup kerneldoc, add more lockdep annotation
v4: consistently use res->num_pages

Signed-off-by: Christian König <christian.koenig@xxxxxxx>
Tested-by: Bas Nieuwenhuizen <bas@xxxxxxxxxxxxxxxxxxx>
+/**
+ * struct ttm_lru_bulk_move
+ *
+ * @tt: first/last lru entry for resources in the TT domain
+ * @vram: first/last lru entry for resources in the VRAM domain
+ *
+ * Helper structure for bulk moves on the LRU list.
+ */
+struct ttm_lru_bulk_move {
+    struct ttm_lru_bulk_move_pos tt[TTM_MAX_BO_PRIORITY];
+    struct ttm_lru_bulk_move_pos vram[TTM_MAX_BO_PRIORITY];
Not really needed, just a thought: Should we track the associated dma_resv
object here to make sure the locking is all done correctly (and also check
that the bulk move bo have the same dma_resv)? It wouldn't really be any
overhead for the !CONFIG_LOCKDEP case and we could sprinkle a lot more
dma_resv_held all over the place.
You made a similar comment on the last revision and I already tried to
play around with that idea a bit.

But I've completely abandoned that idea after realizing that the BOs in
the bulk move actually don't need to have the same dma_resv object, nor
do they all need to be locked.
Uh how does that work? If you evict that bo while someone else is
doing a bulk move, then at least the result might be rather funny, and
I have no idea how it could work.

The LRU is still protected by the common spinlock.

So that will synchronize any modification to both the bulk move structure as well as the individual list_heads making up the LRU.


Like even if you then make the rule that you have to lock all bos for
the bulk move, the bo could still be moved independently, and that
would again break the bulk lru properties.

And if you do none of the above, there's no reason for that bo to have
a distinct dma_resv.

Like maybe the data structure wont fall apart, but semantically it
just doesn't make any sense to me to allow this. What would you want
to use this for?

Yeah, that's a good point.

It's not technically necessary as far as I can see, but I'm not sure if there is a valid use case either.

It just happens that amdgpu is currently using it that way, but I can't
see any technical necessarily to restrict the bulk move like that.
Yeah we can do that later on in a follow up patch, or I figure out why
it's not a good idea :-) Just figured this might be good to lock down
before other drivers start adopting this.

I'm just wondering if it's really more defensive to restrict the handling like that.

On the other hand we can still lift the restriction when anybody really comes along with a valid use case.

Christian.

-Daniel

Regards,
Christian.


-Daniel





[Index of Archives]     [Linux DRI Users]     [Linux Intel Graphics]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux