Re: [PATCH 0/5] NVKM GSP RPC message handling policy

On 8/2/25 03:58, Zhi Wang wrote:

Ben reported that the patch [1] breaks suspend/resume.

After digging for a while, I noticed that this problem had been there
before that patch was introduced, but was not exposed because
r535_gsp_rpc_push() doesn't respect the caller's requirement when handling
a large RPC command: it won't wait for the reply even if the caller
requires it. (Small RPCs are fine.)

With that patch series applied, r535_gsp_rpc_push() really waits
for the reply and receives the entire GSP message, which is required
by the large vGPU RPC command.

There are currently two GSP RPC message handling policies:

- a. Don't care: discard the message before returning to the caller.
- b. Receive the entire message: wait for and receive the entire message
   before returning to the caller.
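
As a rough illustration of the two policies above, here is a minimal
userspace sketch. It is not the driver code: the enum, the struct, and
sketch_handle_reply() are hypothetical names made up for this example,
and the real identifiers in the series may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical policy names, for illustration only. */
enum sketch_reply_policy {
	SKETCH_REPLY_NOWAIT,	/* a. don't care: discard the message */
	SKETCH_REPLY_RECV,	/* b. receive the entire message */
};

/* Hypothetical stand-in for a GSP RPC reply message. */
struct sketch_msg {
	size_t len;		/* payload length */
	const char *payload;	/* payload bytes */
};

/*
 * Return what the caller gets back under each policy: NULL when the
 * message is discarded, the payload when the whole message is received.
 */
static const char *sketch_handle_reply(enum sketch_reply_policy policy,
				       const struct sketch_msg *msg)
{
	assert(msg != NULL);

	switch (policy) {
	case SKETCH_REPLY_NOWAIT:
		return NULL;		/* message dropped before returning */
	case SKETCH_REPLY_RECV:
		return msg->payload;	/* full message handed to the caller */
	}
	return NULL;
}
```

The point of the sketch is only that the policy is chosen by the caller
and decides what, if anything, is handed back.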

On the suspend/resume path, there is a large GSP command,
NV_VGPU_MSG_FUNCTION_ALLOC_MEMORY, which returns only a GSP RPC message
header to tell the driver that the request has been handled. The policy in
the driver is to receive the entire message, which ends in a timeout
and an error when r535_gsp_rpc_push() tries to receive the message. That
breaks the suspend/resume path.

This series factors out the current GSP RPC message handling policy and
introduces a new policy for NV_VGPU_MSG_FUNCTION_ALLOC_MEMORY and a
kernel doc to illustrate the policies.
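
To make the failure mode and the fix concrete, here is a toy userspace
model. It is a sketch under stated assumptions, not the driver's logic:
the toy_* names, the queue layout, and the retry loop are all
hypothetical. It only shows why "receive the entire message" cannot
complete on a header-only reply, while a poll-style policy that waits
for the header alone can.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical policies; only the "poll" idea comes from the series. */
enum toy_policy { TOY_REPLY_NOWAIT, TOY_REPLY_RECV, TOY_REPLY_POLL };

/* Hypothetical snapshot of what the firmware has posted so far. */
struct toy_queue {
	bool header_seen;	/* a reply header has arrived */
	size_t payload_avail;	/* payload bytes that followed it */
};

/* Return 0 once the policy is satisfied, -ETIMEDOUT after max_tries. */
static int toy_wait_reply(enum toy_policy policy,
			  const struct toy_queue *q, int max_tries)
{
	assert(q != NULL);

	if (policy == TOY_REPLY_NOWAIT)
		return 0;		/* caller does not wait at all */

	for (int i = 0; i < max_tries; i++) {
		if (!q->header_seen)
			continue;	/* nothing posted yet, retry */
		if (policy == TOY_REPLY_POLL)
			return 0;	/* the header alone is enough */
		if (q->payload_avail > 0)
			return 0;	/* full message can be received */
	}
	return -ETIMEDOUT;		/* a payload never showed up */
}
```

With a header-only reply (header_seen set, payload_avail zero), the
receive policy runs out of retries in this model, whereas the poll
policy returns as soon as the header is observed.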

With this patchset, the problem can't be reproduced and suspend/resume
works on my L40.

This seems to fix the issue here on top of current drm-misc-next.

Tested-by: Ben Skeggs <bskeggs@xxxxxxxxxx>


[1] https://lore.kernel.org/nouveau/7eb31f1f-fc3a-4fb5-86cf-4bd011d68ff1@xxxxxxxxxx/T/#t

Zhi Wang (5):
   drm/nouveau/nvkm: factor out r535_gsp_rpc_handle_reply()
   drm/nouveau/nvkm: factor out the current RPC command reply policies
   drm/nouveau/nvkm: introduce new GSP reply policy
     NVKM_GSP_RPC_REPLY_POLL
   drm/nouveau/nvkm: use the new policy for
     NV_VGPU_MSG_FUNCTION_ALLOC_MEMORY
   drm/nouveau/nvkm: introduce a kernel doc for GSP message handling

  Documentation/gpu/nouveau.rst                 |  3 +
  .../gpu/drm/nouveau/include/nvkm/subdev/gsp.h | 34 ++++++--
  .../gpu/drm/nouveau/nvkm/subdev/bar/r535.c    |  2 +-
  .../gpu/drm/nouveau/nvkm/subdev/gsp/r535.c    | 80 +++++++++++--------
  .../drm/nouveau/nvkm/subdev/instmem/r535.c    |  2 +-
  5 files changed, 78 insertions(+), 43 deletions(-)



