Re: [PATCH 4/7] drm/radeon: Pin buffers while they are vmap'ed


 



Hi

On 25.11.20 11:36, Daniel Vetter wrote:
On Wed, Nov 25, 2020 at 11:13:13AM +0100, Christian König wrote:
On 25.11.20 09:37, Thomas Zimmermann wrote:
Hi

On 24.11.20 15:09, Daniel Vetter wrote:
On Tue, Nov 24, 2020 at 02:56:51PM +0100, Thomas Zimmermann wrote:
Hi

On 24.11.20 14:36, Christian König wrote:
On 24.11.20 13:15, Thomas Zimmermann wrote:
[SNIP]
First I wanted to put this into drm_gem_ttm_vmap/vunmap(), but then wondered why ttm_bo_vmap() does not acquire the lock internally? I'd expect that vmap/vunmap are close together and do not overlap for the same BO.

We have use cases like the following during command submission:

1. lock
2. map
3. copy parts of the BO content somewhere else or patch
it with additional information
4. unmap
5. submit BO to the hardware
6. add hardware fence to the BO to make sure it doesn't move
7. unlock

That use case won't be possible with vmap/vunmap if we move the lock/unlock into them, and I hope to replace the kmap/kunmap functions with them in the near term.

Otherwise, acquiring the reservation lock would
require another ref-counting variable or per-driver
code.

Huh, why that? Just put this into the drm_gem_ttm_vmap/vunmap() helper as you initially planned.

Given your example above, step one would acquire the lock, and step two would also acquire the lock as part of the vmap implementation. Wouldn't this fail, at least during the unmap or unlock steps?

Oh, so you want to nest them? No, that is a rather bad no-go.

I don't want to nest/overlap them. My question was whether that
would be required. Apparently not.

While the console's BO is being set for scanout, it's protected from
movement via the pin/unpin implementation, right?

Yes, correct.

The driver does not acquire the resv lock for longer periods. I'm
asking because this would prevent any console-buffer updates while
the console is being displayed.

Correct as well, we only hold the lock for things like command
submission, pinning, unpinning etc etc....


Thanks for answering my questions.



You need to make sure that the lock is only taken from the FB
path which wants to vmap the object.

Why don't you lock the GEM object from the caller in the generic
FB implementation?

With the current blitter code, it breaks the abstraction. If vmap/vunmap held the lock implicitly, things would be easier.

Do you have a link to the code?

It's the damage blitter in the fbdev code. [1] While it flushes the shadow buffer into the BO, the BO has to be kept in place. I already changed it to lock struct drm_fb_helper.lock, but I don't think this is enough. TTM could still evict the BO concurrently.

So I'm not sure this is actually a problem: ttm could try to concurrently evict the buffer we pinned into vram, and then just skip to the next one.

Plus atm generic fbdev isn't used on any chip where we really care about the last few MB of VRAM being usable for command submission (well, atm there's no driver using it).

Well, this is the patchset for radeon. If it works out, amdgpu and
nouveau are natural next choices. Especially radeon and nouveau support
cards with low- to medium-sized VRAM. The MiBs wasted on fbdev certainly
matter.


Having the buffer pinned into system memory and trying to do a concurrent modeset that tries to pull it in is the hard failure mode. And holding fb_helper.lock fully prevents that.

So not really clear on what failure mode you're seeing here?

Imagine the fbdev BO is in VRAM, but not pinned. (Maybe Xorg or Wayland
is running.) The fbdev BO is a few MiBs and not in use, so TTM would
want to evict it if memory gets tight.

What I have in mind is a concurrent modeset that requires the memory. If
we do a concurrent damage blit without protecting against eviction,
things go boom. Same for concurrent 3d graphics with textures, model
data, etc.

Completely agree.

This needs proper lock protection of the memory-mapped buffer. Relying on some other code not running because we hold some third-party locks is not sufficient here.

We are still protected by the pin count in this scenario. Plus, with
current drivers we always pin the fbdev buffer into vram, so occasionally
failing to move it out isn't a regression.

Why is this protected by the pin count? The counter should be zero in this scenario. Otherwise, we could not evict the fbdev BO on all those systems where that's a hard requirement (e.g., ast).

The pin count is currently maintained by the vmap implementation in vram helpers. Calling vmap is an implicit pin; calling vunmap is an implicit unpin. This prevents eviction in the damage worker. But now I was told that pinning is only for BOs that are controlled by userspace, and that internal users should acquire the resv lock. So vram helpers have to be fixed, actually.

In vram helpers, unmapping does not mean eviction. The unmap operation only marks the BO as unmappable. The real unmap happens when the eviction takes place. This avoids many modifications to the page tables. IOW an unpinned, unmapped BO will remain in VRAM until the memory is actually needed.

Best regards
Thomas


So I'm still not seeing how this can go boom.

Now long term it'd be nice to cut everything over to dma_resv locking, but
the issue there is that beyond ttm, none of the helpers (and few of the
drivers) use dma_resv. So this is a fairly big uphill battle. Quick
interim fix seems like the right solution to me.
-Daniel


Regards,
Christian.


Best regards
Thomas


There's no recursion taking place, so I guess the reservation lock could be acquired/released in drm_client_buffer_vmap/vunmap(), or a separate pair of DRM client functions could do the locking.

Given how this "do the right locking" is a can of worms (and I think it's worse than what you dug out already), I think the fb_helper.lock hack is perfectly good enough.

I'm also somewhat worried that starting to use dma_resv lock in generic
code, while many helpers/drivers still have their hand-rolled locking,
will make conversion over to dma_resv needlessly more complicated.
-Daniel


Best regards
Thomas

[1] https://cgit.freedesktop.org/drm/drm-tip/tree/drivers/gpu/drm/drm_fb_helper.c?id=ac60f3f3090115d21f028bffa2dcfb67f695c4f2#n394


Please note that the reservation lock you need to take here is part of
the GEM object.

Usually we design things in the way that the code needs to take a lock
which protects an object, then do some operations with the object and
then release the lock again.

Taking the lock inside the operation can be done as well, but returning with it still held is a rather unusual design.

Sorry for the noob questions. I'm still trying to understand the
implications of acquiring these locks.

Well, this is the reservation lock of the GEM object we are talking about here. We need to take that for a couple of different operations; vmap/vunmap doesn't sound like a special case to me.

Regards,
Christian.


Best regards
Thomas

_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel

--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer



