On Wed, 15 Jan 2025 02:37:58 -0800, Greg Kroah-Hartman wrote: > Hi Greg, > 6.12-stable review patch. If anyone has any objections, please let me know. > This is a note to let you know that I've just added the patch titled > > drm/xe/oa: Separate batch submission from waiting for completion > > to the 6.12-stable tree which can be found at: > http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary > > The filename of the patch is: > drm-xe-oa-separate-batch-submission-from-waiting-for.patch > and it can be found in the queue-6.12 subdirectory. > > If you, or anyone else, feels it should not be added to the stable tree, > please let <stable@xxxxxxxxxxxxxxx> know about it. I am writing about 3 emails I received (including this one) about 3 commits being added to stable. These are the 3 commits I am referring to (all commit SHA's refer to Linus' tree and first commit is at the bottom, everywhere in this email): 2fb4350a283af drm/xe/oa: Add input fence dependencies c8507a25cebd1 drm/xe/oa/uapi: Define and parse OA sync properties dddcb19ad4d4b drm/xe/oa: Separate batch submission from waiting for completion Apparently these are added to stable to avoid conflicts with this commit: f0ed39830e606 xe/oa: Fix query mode of operation for OAR/OAC However, the 3 commits are part of a 7 commit series and are incomplete in themeselves and will break userspace. So we should add the remaining 4 commits of the series to stable too. Thes are the ones: 85d3f9e84e062 drm/xe/oa: Allow only certain property changes from config 9920c8b88c5cf drm/xe/oa: Add syncs support to OA config ioctl cc4e6994d5a23 drm/xe/oa: Move functions up so they can be reused for config ioctl 343dd246fd9b5 drm/xe/oa: Signal output fences The above list can be generated by running the following in Linus' tree: git log --oneline -- drivers/gpu/drm/xe/xe_oa.c Thanks. -- Ashutosh > > ------------------ > > From: Ashutosh Dixit <ashutosh.dixit@xxxxxxxxx> > > [ Upstream commit dddcb19ad4d4bbe943a72a1fb3266c6e8aa8d541 ] > > When we introduce xe_syncs, we don't wait for internal OA programming > batches to complete. That is, xe_syncs are signaled asynchronously. In > anticipation for this, separate out batch submission from waiting for > completion of those batches. > > v2: Change return type of xe_oa_submit_bb to "struct dma_fence *" (Matt B) > v3: Retain init "int err = 0;" in xe_oa_submit_bb (Jose) > > Reviewed-by: Jonathan Cavitt <jonathan.cavitt@xxxxxxxxx> > Signed-off-by: Ashutosh Dixit <ashutosh.dixit@xxxxxxxxx> > Link: https://patchwork.freedesktop.org/patch/msgid/20241022200352.1192560-2-ashutosh.dixit@xxxxxxxxx > Stable-dep-of: f0ed39830e60 ("xe/oa: Fix query mode of operation for OAR/OAC") > Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx> > --- > drivers/gpu/drm/xe/xe_oa.c | 57 +++++++++++++++++++++++++++++--------- > 1 file changed, 44 insertions(+), 13 deletions(-) > > diff --git a/drivers/gpu/drm/xe/xe_oa.c b/drivers/gpu/drm/xe/xe_oa.c > index 78823f53d290..4962c9eb9a81 100644 > --- a/drivers/gpu/drm/xe/xe_oa.c > +++ b/drivers/gpu/drm/xe/xe_oa.c > @@ -567,11 +567,10 @@ static __poll_t xe_oa_poll(struct file *file, poll_table *wait) > return ret; > } > > -static int xe_oa_submit_bb(struct xe_oa_stream *stream, struct xe_bb *bb) > +static struct dma_fence *xe_oa_submit_bb(struct xe_oa_stream *stream, struct xe_bb *bb) > { > struct xe_sched_job *job; > struct dma_fence *fence; > - long timeout; > int err = 0; > > /* Kernel configuration is issued on stream->k_exec_q, not stream->exec_q */ > @@ -585,14 +584,9 @@ static int xe_oa_submit_bb(struct xe_oa_stream *stream, struct xe_bb *bb) > fence = dma_fence_get(&job->drm.s_fence->finished); > xe_sched_job_push(job); > > - timeout = dma_fence_wait_timeout(fence, false, HZ); > - dma_fence_put(fence); > - if (timeout < 0) > - err = timeout; > - else if (!timeout) > - err = -ETIME; > + return fence; > exit: > - return err; > + return ERR_PTR(err); > } > > static void write_cs_mi_lri(struct xe_bb *bb, const struct xe_oa_reg *reg_data, u32 n_regs) > @@ -656,6 +650,7 @@ static void xe_oa_store_flex(struct xe_oa_stream *stream, struct xe_lrc *lrc, > static int xe_oa_modify_ctx_image(struct xe_oa_stream *stream, struct xe_lrc *lrc, > const struct flex *flex, u32 count) > { > + struct dma_fence *fence; > struct xe_bb *bb; > int err; > > @@ -667,7 +662,16 @@ static int xe_oa_modify_ctx_image(struct xe_oa_stream *stream, struct xe_lrc *lr > > xe_oa_store_flex(stream, lrc, bb, flex, count); > > - err = xe_oa_submit_bb(stream, bb); > + fence = xe_oa_submit_bb(stream, bb); > + if (IS_ERR(fence)) { > + err = PTR_ERR(fence); > + goto free_bb; > + } > + xe_bb_free(bb, fence); > + dma_fence_put(fence); > + > + return 0; > +free_bb: > xe_bb_free(bb, NULL); > exit: > return err; > @@ -675,6 +679,7 @@ static int xe_oa_modify_ctx_image(struct xe_oa_stream *stream, struct xe_lrc *lr > > static int xe_oa_load_with_lri(struct xe_oa_stream *stream, struct xe_oa_reg *reg_lri) > { > + struct dma_fence *fence; > struct xe_bb *bb; > int err; > > @@ -686,7 +691,16 @@ static int xe_oa_load_with_lri(struct xe_oa_stream *stream, struct xe_oa_reg *re > > write_cs_mi_lri(bb, reg_lri, 1); > > - err = xe_oa_submit_bb(stream, bb); > + fence = xe_oa_submit_bb(stream, bb); > + if (IS_ERR(fence)) { > + err = PTR_ERR(fence); > + goto free_bb; > + } > + xe_bb_free(bb, fence); > + dma_fence_put(fence); > + > + return 0; > +free_bb: > xe_bb_free(bb, NULL); > exit: > return err; > @@ -914,15 +928,32 @@ static int xe_oa_emit_oa_config(struct xe_oa_stream *stream, struct xe_oa_config > { > #define NOA_PROGRAM_ADDITIONAL_DELAY_US 500 > struct xe_oa_config_bo *oa_bo; > - int err, us = NOA_PROGRAM_ADDITIONAL_DELAY_US; > + int err = 0, us = NOA_PROGRAM_ADDITIONAL_DELAY_US; > + struct dma_fence *fence; > + long timeout; > > + /* Emit OA configuration batch */ > oa_bo = xe_oa_alloc_config_buffer(stream, config); > if (IS_ERR(oa_bo)) { > err = PTR_ERR(oa_bo); > goto exit; > } > > - err = xe_oa_submit_bb(stream, oa_bo->bb); > + fence = xe_oa_submit_bb(stream, oa_bo->bb); > + if (IS_ERR(fence)) { > + err = PTR_ERR(fence); > + goto exit; > + } > + > + /* Wait till all previous batches have executed */ > + timeout = dma_fence_wait_timeout(fence, false, 5 * HZ); > + dma_fence_put(fence); > + if (timeout < 0) > + err = timeout; > + else if (!timeout) > + err = -ETIME; > + if (err) > + drm_dbg(&stream->oa->xe->drm, "dma_fence_wait_timeout err %d\n", err); > > /* Additional empirical delay needed for NOA programming after registers are written */ > usleep_range(us, 2 * us); > -- > 2.39.5 > > >