On 11/15/2024 3:28 PM, Victor Zhao wrote:
> In a consecutive packet submission, for example unmap and query status,
> when CP is reading wptr caused by unmap packet doorbell ring, if in some
> case CP operates slower (e.g. doorbell_mode=1) and wptr has been updated
> to next packet (query status), but the query status packet content has
> not been flushed to memory yet, it will cause CP to fetch stale data.
>
> Adding mb to ensure ring buffer has been updated before updating wptr.
> Also adding a mb to ensure wptr updated before doorbell ring.
>
> Signed-off-by: Victor Zhao <Victor.Zhao@xxxxxxx>
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
> index 4843dcb9a5f7..55d18aed257b 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
> @@ -306,12 +306,17 @@ int kq_submit_packet(struct kernel_queue *kq)
>  	if (amdgpu_amdkfd_is_fed(kq->dev->adev))
>  		return -EIO;
>
> +	/* Make sure ring buffer is updated before wptr updated */
> +	mb();
> +

Maybe add a specific comment here to indicate that this is especially
needed in DOORBELL_MODE=1, when CP fetches the value from WPTR memory
instead of from the doorbell packet.

Reviewed-by: Lijo Lazar <lijo.lazar@xxxxxxx>

Thanks,
Lijo

>  	if (kq->dev->kfd->device_info.doorbell_size == 8) {
>  		*kq->wptr64_kernel = kq->pending_wptr64;
> +		mb(); /* Make sure wptr updated before ring doorbell */
>  		write_kernel_doorbell64(kq->queue->properties.doorbell_ptr,
>  					kq->pending_wptr64);
>  	} else {
>  		*kq->wptr_kernel = kq->pending_wptr;
> +		mb(); /* Make sure wptr updated before ring doorbell */
>  		write_kernel_doorbell(kq->queue->properties.doorbell_ptr,
>  					kq->pending_wptr);
>  	}
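
For reference, the ordering the patch enforces (ring contents, then wptr,
then doorbell) can be sketched in user-space C. This is only an analogy
under stated assumptions: struct model_queue, model_acquire_and_fill() and
model_submit() are hypothetical names, not amdkfd code; the C11
atomic_thread_fence() merely stands in for the kernel's mb(); and the
sketch does not track a read pointer, so it does not guard against
overwriting unconsumed packets.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define RING_DWORDS 64u	/* hypothetical ring size */

/* Hypothetical user-space model of a kernel queue; not the amdkfd layout. */
struct model_queue {
	uint32_t ring[RING_DWORDS];	/* packet memory read by the consumer */
	_Atomic uint32_t wptr;		/* write pointer kept in memory */
	_Atomic uint32_t doorbell;	/* stands in for the doorbell register */
	uint32_t pending_wptr;		/* producer-private, not yet published */
};

/* Copy a packet into the ring without publishing it. */
static int model_acquire_and_fill(struct model_queue *q,
				  const uint32_t *pkt, uint32_t ndw)
{
	uint32_t i;

	if (ndw == 0 || ndw >= RING_DWORDS)
		return -1;
	for (i = 0; i < ndw; i++)
		q->ring[(q->pending_wptr + i) % RING_DWORDS] = pkt[i];
	q->pending_wptr = (q->pending_wptr + ndw) % RING_DWORDS;
	return 0;
}

/* Publish in the order the patch enforces: ring, then wptr, then doorbell. */
static void model_submit(struct model_queue *q)
{
	assert(q->pending_wptr < RING_DWORDS);

	/* Stand-in for mb(): ring contents must be visible before wptr. */
	atomic_thread_fence(memory_order_seq_cst);
	atomic_store_explicit(&q->wptr, q->pending_wptr, memory_order_relaxed);

	/* Stand-in for mb(): wptr must be visible before the doorbell. */
	atomic_thread_fence(memory_order_seq_cst);
	atomic_store_explicit(&q->doorbell, q->pending_wptr,
			      memory_order_relaxed);
}
```

A consumer that reads wptr from memory (the DOORBELL_MODE=1 case raised
above) relies on the first fence; one that takes the value from the
doorbell write relies on both.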