Re: [PATCH v3 0/9] AMDGPU Usermode queues

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Apr 10, 2023 at 3:40 PM Sharma, Shashank
<Shashank.Sharma@xxxxxxx> wrote:
>
> [AMD Official Use Only - General]
>
> Hello Bas,
>
> This is not the correct interpretation of the code, the USERQ_IOCTL has both the OPs (create and destroy), but th euser has to exclusively call  it.
>
> Please see the sample test program in the existing libDRM series (userq_test.c, it specifically calls amdgpu_free_userq, which does the destroy_OP
>
> for the IOCTL.

In the presence of crashes the kernel should always be able to clean
this up no? Otherwise there is a resource leak?

>
> - Shashank
>
> -----Original Message-----
> From: Bas Nieuwenhuizen <bas@xxxxxxxxxxxxxxxxxxx>
> Sent: 10 April 2023 11:26
> To: Sharma, Shashank <Shashank.Sharma@xxxxxxx>
> Cc: amd-gfx@xxxxxxxxxxxxxxxxxxxxx; Deucher, Alexander <Alexander.Deucher@xxxxxxx>; Kuehling, Felix <Felix.Kuehling@xxxxxxx>; Koenig, Christian <Christian.Koenig@xxxxxxx>; Yadav, Arvind <Arvind.Yadav@xxxxxxx>
> Subject: Re: [PATCH v3 0/9] AMDGPU Usermode queues
>
> Hi Shashank,
>
> I think I found the issue: I wasn't destroying the user queue in my program and the kernel doesn't clean up any remaining user queues in the postclose hook. I think we need something like
> https://github.com/BNieuwenhuizen/linux/commit/e90c8d1185da7353c12837973ceddf55ccc85d29
> ?
>
> While running things multiple times now works, I still have problems doing multiple submissions from the same queue. Looking forward to the updated test/sample
>
> Thanks,
> Bas
>
> On Mon, Apr 10, 2023 at 9:32 AM Sharma, Shashank <Shashank.Sharma@xxxxxxx> wrote:
> >
> > [AMD Official Use Only - General]
> >
> > Hello Bas,
> > Thanks for trying this out.
> >
> > This could be due to the doorbell as you mentioned, Usermode queue uses doorbell manager internally.
> > This week, we are planning to publis the latest libDRM sample code which uses a doorbell object (instead of the doorbell hack IOCTL), adapting to that should fix your problem in my opinion.
> > We have tested this full stack (libDRM test + Usermode queue + doorbell manager) for 500+ consecutive runs, and it worked well for us.
> >
> > You can use this integrated kernel stack (1+2) from my gitlab to build
> > your kernel:
> > https://gitlab.freedesktop.org/contactshashanksharma/userq-amdgpu/-/tr
> > ee/integrated-db-and-uq-v3 Please stay tuned for updated libDRM
> > changes with doorbell objects.
> >
> > Regards
> > Shashank
> > -----Original Message-----
> > From: Bas Nieuwenhuizen <bas@xxxxxxxxxxxxxxxxxxx>
> > Sent: 10 April 2023 02:37
> > To: Sharma, Shashank <Shashank.Sharma@xxxxxxx>
> > Cc: amd-gfx@xxxxxxxxxxxxxxxxxxxxx; Deucher, Alexander
> > <Alexander.Deucher@xxxxxxx>; Kuehling, Felix <Felix.Kuehling@xxxxxxx>;
> > Koenig, Christian <Christian.Koenig@xxxxxxx>
> > Subject: Re: [PATCH v3 0/9] AMDGPU Usermode queues
> >
> > Hi Shashank,
> >
> > I tried writing a program to experiment with usermode queues and I found some weird behavior: The first run of the program works as expected, while subsequent runs don't seem to do anything (and I allocate everything in GTT, so it should be zero initialized consistently). Is this a known issue?
> >
> > The linked libdrm code for the uapi still does a doorbell ioctl so it could very well be that I do the doorbell wrong (especially since the ioctl implementation was never shared AFAICT?), but it seems like the kernel submissions (i.e. write wptr in dwords to the wptr va and to the doorbell). Is it possible to update the test in libdrm?
> >
> > Code: https://gitlab.freedesktop.org/bnieuwenhuizen/usermode-queue
> >
> > Thanks,
> > Bas
> >
> > On Wed, Mar 29, 2023 at 6:05 PM Shashank Sharma <shashank.sharma@xxxxxxx> wrote:
> > >
> > > This patch series introduces AMDGPU usermode queues for gfx workloads.
> > > Usermode queues is a method of GPU workload submission into the
> > > graphics hardware without any interaction with kernel/DRM schedulers.
> > > In this method, a userspace graphics application can create its own
> > > workqueue and submit it directly in the GPU HW.
> > >
> > > The general idea of how this is supposed to work:
> > > - The application creates the following GPU objetcs:
> > >   - A queue object to hold the workload packets.
> > >   - A read pointer object.
> > >   - A write pointer object.
> > >   - A doorbell page.
> > > - The application picks a 32-bit offset in the doorbell page for this queue.
> > > - The application uses the usermode_queue_create IOCTL introduced in
> > >   this patch, by passing the the GPU addresses of these objects (read
> > >   ptr, write ptr, queue base address and 32-bit doorbell offset from
> > >   the doorbell page)
> > > - The kernel creates the queue and maps it in the HW.
> > > - The application can start submitting the data in the queue as soon as
> > >   the kernel IOCTL returns.
> > > - After filling the workload data in the queue, the app must write the
> > >   number of dwords added in the queue into the doorbell offset, and the
> > >   GPU will start fetching the data.
> > >
> > > libDRM changes for this series and a sample DRM test program can be
> > > found in the MESA merge request here:
> > > https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/287
> > >
> > > This patch series depends on the doorbell-manager changes, which are
> > > being reviewed here:
> > > https://patchwork.freedesktop.org/series/115802/
> > >
> > > Alex Deucher (1):
> > >   drm/amdgpu: UAPI for user queue management
> > >
> > > Arvind Yadav (2):
> > >   drm/amdgpu: add new parameters in v11_struct
> > >   drm/amdgpu: map wptr BO into GART
> > >
> > > Shashank Sharma (6):
> > >   drm/amdgpu: add usermode queue base code
> > >   drm/amdgpu: add new IOCTL for usermode queue
> > >   drm/amdgpu: create GFX-gen11 MQD for userqueue
> > >   drm/amdgpu: create context space for usermode queue
> > >   drm/amdgpu: map usermode queue into MES
> > >   drm/amdgpu: generate doorbell index for userqueue
> > >
> > >  drivers/gpu/drm/amd/amdgpu/Makefile           |   3 +
> > >  drivers/gpu/drm/amd/amdgpu/amdgpu.h           |  10 +-
> > >  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c       |   2 +
> > >  drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c       |   6 +
> > >  drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c | 298
> > > ++++++++++++++++++  .../drm/amd/amdgpu/amdgpu_userqueue_gfx_v11.c |
> > > ++++++++++++++++++ 230 ++++++++++++++
> > >  .../gpu/drm/amd/include/amdgpu_userqueue.h    |  66 ++++
> > >  drivers/gpu/drm/amd/include/v11_structs.h     |  16 +-
> > >  include/uapi/drm/amdgpu_drm.h                 |  55 ++++
> > >  9 files changed, 677 insertions(+), 9 deletions(-)  create mode
> > > 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
> > >  create mode 100644
> > > drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue_gfx_v11.c
> > >  create mode 100644 drivers/gpu/drm/amd/include/amdgpu_userqueue.h
> > >
> > > --
> > > 2.40.0
> > >




[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux