On Tue, Jun 18, 2024 at 12:48:13PM +0300, Dmitry Baryshkov wrote: > On Tue, 18 Jun 2024 at 12:33, Daniel Vetter <daniel@xxxxxxxx> wrote: > > > > On Mon, Jun 17, 2024 at 10:52:27PM +0300, Dmitry Baryshkov wrote: > > > On Mon, Jun 17, 2024 at 11:28:35AM GMT, Abhinav Kumar wrote: > > > > Hi > > > > > > > > On 6/17/2024 9:54 AM, Brian Starkey wrote: > > > > > Hi, > > > > > > > > > > On Mon, Jun 17, 2024 at 05:16:36PM +0200, Daniel Vetter wrote: > > > > > > On Mon, Jun 17, 2024 at 01:41:59PM +0000, Hoosier, Matt wrote: > > > > > > > If/when we have hardware and driver support where you can use the > > > > > > writeback connector as a real-time streamout kind of thing, then we need > > > > > > to change all this, because with the current implementation, there's > > > > > > indeed the possibility that funny things can happen if you ignore the > > > > > > notice (funny as in data corruption, not funny as the kernel crashes of > > > > > > course). > > > > > > > > > > Indeed, the wording was added (from what I remember from so long > > > > > ago...) because it sounded like different HW made very different > > > > > guarantees/non-guarantees about what data would be written when, so > > > > > perhaps you'd end up with some pixels from the next frame in your > > > > > buffer or something. > > > > > > > > > > Taking Mali-DP/Komeda again, the writeback configuration is latched > > > > > along with everything else, and writeback throughput permitting, it > > > > > should "just work" if you submit a new writeback every frame. It > > > > > drains out the last of the data during vblank, before starting on the > > > > > next frame. That doesn't help the "general case" though. > > > > > > > > > > > > > Would it be fair to summarize it like below: > > > > > > > > 1) If the same CRTC is shared with the real time display, then the hardware > > > > is expected to fire this every frame so userspace should wait till this is > > > > signaled. > > > > > > As I wrote in response to another email in this thread, IMO existing > > > uAPI doesn't fully allow this. There is no way to enforce 'vblank' > > > handling onto the userspace. So userspace should be able to supply at > > > least two buffers and then after the vblank it should be able to enqueue > > > the next buffer, while the filled buffer is automatically dequeued by > > > the driver and is not used for further image output. > > > > Yeah if you want streaming writeback we need a queue depth of at least 2 > > in the kms api. Will help a lot on all hardware, but on some it's required > > because the time when the writeback buffer is fully flushed is after the > > point of no return for the next frame (which is when the vblank event is > > supposed to go out). > > > > I think over the years we've slowly inched forward to make at least the > > drm code safe for a queue depth of 2 in the atomic machinery, but the > > writeback and driver code probably needs a bunch of work. > > Do you mean handling the queue by allowing userspace to commit 'next' FB_ID? > > I was leaning towards extending the uAPI with something like explicit > WRITEBACK_FB_ID_QUEUED and WRITEBACK_OUT_FENCE_PTR_QUEUED properties. > This way once the fence has been reached, the drm_writeback might > automatically put the old framebuffer, move _QUEUED to normal props > and then signal the userspace. This way the single-frame writeback > drivers can support the old API, while allowing cloned-writeback > drivers to implement the streaming approach. Also, this allows drivers > to do clever tricks, like forbidding the _QUEUED operation if the > refresh rate for the writeback connector is too high. Eh I think we should just allow atomic commits with a queue depth > 1. I think that's both the cleanest uapi, and also the cleanest on the driver side, since I just don't want to think about what happens when we have multiple commits going on at the same time on the same crtc. On the core/helper side we've tried to get there slowly by explicitly accessing old/new state and never accessing plane/crtc->state directly. That's really all that should be needed. I guess to make things easier for drivers we could do an intermediate uapi where we allow queue depth, but only for commits that don't require a modeset, since those tend to be much simpler. Or maybe even allow the driver to fully control where it can handle a queue depth > 1. Essentially roughly this, minus all the safety checks we'll probably need to add in various places. Note: extremely incomplete :-) diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c index 63ee81e478b9..31c4e124eb5a 100644 --- a/drivers/gpu/drm/drm_atomic_helper.c +++ b/drivers/gpu/drm/drm_atomic_helper.c @@ -2136,7 +2136,7 @@ static int stall_checks(struct drm_crtc *crtc, bool nonblock) spin_lock(&crtc->commit_lock); i = 0; list_for_each_entry(commit, &crtc->commit_list, commit_entry) { - if (i == 0) { + if (i == 1) { completed = try_wait_for_completion(&commit->flip_done); /* * Userspace is not allowed to get ahead of the previous @@ -2150,7 +2150,7 @@ static int stall_checks(struct drm_crtc *crtc, bool nonblock) return -EBUSY; } - } else if (i == 1) { + } else if (i == 2) { stall_commit = drm_crtc_commit_get(commit); break; } @@ -3015,7 +3015,7 @@ int drm_atomic_helper_swap_state(struct drm_atomic_state *state, struct drm_private_obj *obj; struct drm_private_state *old_obj_state, *new_obj_state; - if (stall) { + if (stall && 0) { /* * We have to stall for hw_done here before * drm_atomic_helper_wait_for_dependencies() because flip But I hope it sketches the idea at least. Cheers, Sima > > > > -Sima > > > > > > > > > > > > > 2) If a different CRTC is used for the writeback, then the composition loop > > > > for the real time display should not block on this unless its a mirroring > > > > use-case, then we will be throttled by the lowest refresh rate anyway. > > > > > > what is mirroring in this case? You have specified that a different CRTC > > > is being used. > > > > > > > > > > > > > > > > > > > If we already have devices where you can use writeback together with real > > > > > > outputs, then I guess that counts as an oopsie :-/ > > > > > > > > > > Well "works fine" fits into the "undefined behaviour" bucket, just as > > > > > well as "corrupts your fb" does :-) > > > -- > With best wishes > Dmitry -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch