On Mon, 13 Jan 2025 06:03:53 -0800, Sasha Levin wrote: > Hello Sasha, > This is a note to let you know that I've just added the patch titled > > drm/xe/oa: Separate batch submission from waiting for completion > > to the 6.12-stable tree which can be found at: > http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary > > The filename of the patch is: > drm-xe-oa-separate-batch-submission-from-waiting-for.patch > and it can be found in the queue-6.12 subdirectory. > > If you, or anyone else, feels it should not be added to the stable tree, > please let <stable@xxxxxxxxxxxxxxx> know about it. I am writing about 3 emails I received (including this one) about 3 commits being added to stable. These are the 3 commits I am referring to (all commit SHA's refer to Linus' tree and first commit is at the bottom, everywhere in this email): 2fb4350a283af drm/xe/oa: Add input fence dependencies c8507a25cebd1 drm/xe/oa/uapi: Define and parse OA sync properties dddcb19ad4d4b drm/xe/oa: Separate batch submission from waiting for completion So none of these commits had any "Fixes:" tag or "Cc: stable" so not sure why they are being added to stable. Also, they are part of a 7 commit series so not sure why only 3 of the 7 commits are being added to stable? In addition, for this commit which is also added to stable: f0ed39830e606 xe/oa: Fix query mode of operation for OAR/OAC We modified this commit for stable because it will otherwise with the previous 3 commits mentioned above, which we were assuming would not be in stable. Now, if we want all these commits in stable (I actually prefer it), we should just pick them straight from Linus' tree. This would be all these commits: f0ed39830e606 xe/oa: Fix query mode of operation for OAR/OAC c0403e4ceecae drm/xe/oa: Fix "Missing outer runtime PM protection" warning 85d3f9e84e062 drm/xe/oa: Allow only certain property changes from config 9920c8b88c5cf drm/xe/oa: Add syncs support to OA config ioctl cc4e6994d5a23 drm/xe/oa: Move functions up so they can be reused for config ioctl 343dd246fd9b5 drm/xe/oa: Signal output fences 2fb4350a283af drm/xe/oa: Add input fence dependencies c8507a25cebd1 drm/xe/oa/uapi: Define and parse OA sync properties dddcb19ad4d4b drm/xe/oa: Separate batch submission from waiting for completion So: we should either drop the 3 patches at the top, or just pick all 9 patches above. Doing the latter will probably be the simplest and I don't expect any conflicts, or if there are, I can help to resolve those. The above list can be generated by running the following in Linus' tree: git log --oneline -- drivers/gpu/drm/xe/xe_oa.c Thanks. -- Ashutosh > > > > commit 9aeced687e728b9de067a502a0780f8029e61763 > Author: Ashutosh Dixit <ashutosh.dixit@xxxxxxxxx> > Date: Tue Oct 22 13:03:46 2024 -0700 > > drm/xe/oa: Separate batch submission from waiting for completion > > [ Upstream commit dddcb19ad4d4bbe943a72a1fb3266c6e8aa8d541 ] > > When we introduce xe_syncs, we don't wait for internal OA programming > batches to complete. That is, xe_syncs are signaled asynchronously. In > anticipation for this, separate out batch submission from waiting for > completion of those batches. > > v2: Change return type of xe_oa_submit_bb to "struct dma_fence *" (Matt B) > v3: Retain init "int err = 0;" in xe_oa_submit_bb (Jose) > > Reviewed-by: Jonathan Cavitt <jonathan.cavitt@xxxxxxxxx> > Signed-off-by: Ashutosh Dixit <ashutosh.dixit@xxxxxxxxx> > Link: https://patchwork.freedesktop.org/patch/msgid/20241022200352.1192560-2-ashutosh.dixit@xxxxxxxxx > Stable-dep-of: f0ed39830e60 ("xe/oa: Fix query mode of operation for OAR/OAC") > Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx> > > diff --git a/drivers/gpu/drm/xe/xe_oa.c b/drivers/gpu/drm/xe/xe_oa.c > index 78823f53d290..4962c9eb9a81 100644 > --- a/drivers/gpu/drm/xe/xe_oa.c > +++ b/drivers/gpu/drm/xe/xe_oa.c > @@ -567,11 +567,10 @@ static __poll_t xe_oa_poll(struct file *file, poll_table *wait) > return ret; > } > > -static int xe_oa_submit_bb(struct xe_oa_stream *stream, struct xe_bb *bb) > +static struct dma_fence *xe_oa_submit_bb(struct xe_oa_stream *stream, struct xe_bb *bb) > { > struct xe_sched_job *job; > struct dma_fence *fence; > - long timeout; > int err = 0; > > /* Kernel configuration is issued on stream->k_exec_q, not stream->exec_q */ > @@ -585,14 +584,9 @@ static int xe_oa_submit_bb(struct xe_oa_stream *stream, struct xe_bb *bb) > fence = dma_fence_get(&job->drm.s_fence->finished); > xe_sched_job_push(job); > > - timeout = dma_fence_wait_timeout(fence, false, HZ); > - dma_fence_put(fence); > - if (timeout < 0) > - err = timeout; > - else if (!timeout) > - err = -ETIME; > + return fence; > exit: > - return err; > + return ERR_PTR(err); > } > > static void write_cs_mi_lri(struct xe_bb *bb, const struct xe_oa_reg *reg_data, u32 n_regs) > @@ -656,6 +650,7 @@ static void xe_oa_store_flex(struct xe_oa_stream *stream, struct xe_lrc *lrc, > static int xe_oa_modify_ctx_image(struct xe_oa_stream *stream, struct xe_lrc *lrc, > const struct flex *flex, u32 count) > { > + struct dma_fence *fence; > struct xe_bb *bb; > int err; > > @@ -667,7 +662,16 @@ static int xe_oa_modify_ctx_image(struct xe_oa_stream *stream, struct xe_lrc *lr > > xe_oa_store_flex(stream, lrc, bb, flex, count); > > - err = xe_oa_submit_bb(stream, bb); > + fence = xe_oa_submit_bb(stream, bb); > + if (IS_ERR(fence)) { > + err = PTR_ERR(fence); > + goto free_bb; > + } > + xe_bb_free(bb, fence); > + dma_fence_put(fence); > + > + return 0; > +free_bb: > xe_bb_free(bb, NULL); > exit: > return err; > @@ -675,6 +679,7 @@ static int xe_oa_modify_ctx_image(struct xe_oa_stream *stream, struct xe_lrc *lr > > static int xe_oa_load_with_lri(struct xe_oa_stream *stream, struct xe_oa_reg *reg_lri) > { > + struct dma_fence *fence; > struct xe_bb *bb; > int err; > > @@ -686,7 +691,16 @@ static int xe_oa_load_with_lri(struct xe_oa_stream *stream, struct xe_oa_reg *re > > write_cs_mi_lri(bb, reg_lri, 1); > > - err = xe_oa_submit_bb(stream, bb); > + fence = xe_oa_submit_bb(stream, bb); > + if (IS_ERR(fence)) { > + err = PTR_ERR(fence); > + goto free_bb; > + } > + xe_bb_free(bb, fence); > + dma_fence_put(fence); > + > + return 0; > +free_bb: > xe_bb_free(bb, NULL); > exit: > return err; > @@ -914,15 +928,32 @@ static int xe_oa_emit_oa_config(struct xe_oa_stream *stream, struct xe_oa_config > { > #define NOA_PROGRAM_ADDITIONAL_DELAY_US 500 > struct xe_oa_config_bo *oa_bo; > - int err, us = NOA_PROGRAM_ADDITIONAL_DELAY_US; > + int err = 0, us = NOA_PROGRAM_ADDITIONAL_DELAY_US; > + struct dma_fence *fence; > + long timeout; > > + /* Emit OA configuration batch */ > oa_bo = xe_oa_alloc_config_buffer(stream, config); > if (IS_ERR(oa_bo)) { > err = PTR_ERR(oa_bo); > goto exit; > } > > - err = xe_oa_submit_bb(stream, oa_bo->bb); > + fence = xe_oa_submit_bb(stream, oa_bo->bb); > + if (IS_ERR(fence)) { > + err = PTR_ERR(fence); > + goto exit; > + } > + > + /* Wait till all previous batches have executed */ > + timeout = dma_fence_wait_timeout(fence, false, 5 * HZ); > + dma_fence_put(fence); > + if (timeout < 0) > + err = timeout; > + else if (!timeout) > + err = -ETIME; > + if (err) > + drm_dbg(&stream->oa->xe->drm, "dma_fence_wait_timeout err %d\n", err); > > /* Additional empirical delay needed for NOA programming after registers are written */ > usleep_range(us, 2 * us);