Re: [PATCH v3 0/7] Support fdinfo runtime and memory stats on Panthor

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Steven,

On 13.06.2024 16:28, Steven Price wrote:
> On 06/06/2024 01:49, Adrián Larumbe wrote:
> > This patch series enables userspace utilities like gputop and nvtop to
> > query a render context's fdinfo file and figure out rates of engine
> > and memory utilisation.
> > 
> > Previous discussion can be found at
> > https://lore.kernel.org/dri-devel/20240423213240.91412-1-adrian.larumbe@xxxxxxxxxxxxx/
> > 
> > Changelog:
> > v3:
> >  - Fixed some nits and removed useless bounds check in panthor_sched.c
> >  - Added support for sysfs profiling knob and optional job accounting
> >  - Added new patches for calculating size of internal BO's
> > v2:
> >  - Split original first patch in two, one for FW CS cycle and timestamp
> >  calculations and job accounting memory management, and a second one
> >  that enables fdinfo.
> >  - Moved NUM_INSTRS_PER_SLOT to the file prelude
> >  - Removed nelem variable from the group's struct definition.
> >  - Precompute size of group's syncobj BO to avoid code duplication.
> >  - Some minor nits.
> > 
> > 
> > Adrián Larumbe (7):
> >   drm/panthor: introduce job cycle and timestamp accounting
> >   drm/panthor: add DRM fdinfo support
> >   drm/panthor: enable fdinfo for memory stats
> >   drm/panthor: add sysfs knob for enabling job profiling
> >   drm/panthor: support job accounting
> >   drm/drm_file: add display of driver's internal memory size
> >   drm/panthor: register size of internal objects through fdinfo
> 
> The general shape of what you end up with looks correct, but these
> patches are now in a bit of a mess. It's confusing to review when the
> accounting is added unconditionally and then a sysfs knob is added which
> changes it all to be conditional. Equally that last patch (register size
> of internal objects through fdinfo) includes a massive amount of churn
> moving everything into an 'fdinfo' struct which really should be in a
> separate patch.

I do agree with you in that perhaps too many things change across successive
patches in the series. I think I can explain this because of the way the series
has evolved thorugh successive revisions.

In the last one of them, only the first three patches were present, and both
Liviu and Boris seemed happy with the shape they had taken, but then Boris
suggested adding the sysfs knob and optional profiling support rather than
submitting them as part of a different series like I had done in Panfrost. In
that spirit, I decided to keep the first three patches intact.

The last two patches are a bit more of an afterthought, and because they touch
on the drm fdinfo core, I understood they were more likely to be rejected for
now, at least until consensus with Tvrtko and other people involved in the
development of fdinfo had agreed on a way to report internal bo sizes.  However,
being also part of fdinfo, I thought this series was a good place to spark a
debate about them, even if they don't seem as seamlessly linked with the rest
of the work.

> Ideally this needs to be reworked into a logical series of patches with
> knowledge of what's coming next. E.g. the first patch could introduce
> the code for cycle/timestamp accounting but leave it disabled to be then
> enabled by the sysfs knob patch.
> 
> One thing I did notice though is that I wasn't seeing the GPU frequency
> change, looking more closely at this it seems like there's something
> dodgy going on with the devfreq code. From what I can make out I often
> end up in a situation where all contexts are idle every time tick_work()
> is called - I think this is simply because tick_work() is scheduled with
> a delay and by the time the delay has hit the work is complete. Nothing
> to do with this series, but something that needs looking into. I'm on
> holiday for a week but I'll try to look at this when I'm back.

Would you mind sharing what you do in UM to trigger this behaviour and also
maybe the debug traces you've written into the driver to confirm this?

> Steve
> 
> >  Documentation/gpu/drm-usage-stats.rst     |   4 +
> >  drivers/gpu/drm/drm_file.c                |   9 +-
> >  drivers/gpu/drm/msm/msm_drv.c             |   2 +-
> >  drivers/gpu/drm/panfrost/panfrost_drv.c   |   2 +-
> >  drivers/gpu/drm/panthor/panthor_devfreq.c |  10 +
> >  drivers/gpu/drm/panthor/panthor_device.c  |   2 +
> >  drivers/gpu/drm/panthor/panthor_device.h  |  21 ++
> >  drivers/gpu/drm/panthor/panthor_drv.c     |  83 +++++-
> >  drivers/gpu/drm/panthor/panthor_fw.c      |  16 +-
> >  drivers/gpu/drm/panthor/panthor_fw.h      |   5 +-
> >  drivers/gpu/drm/panthor/panthor_gem.c     |  67 ++++-
> >  drivers/gpu/drm/panthor/panthor_gem.h     |  16 +-
> >  drivers/gpu/drm/panthor/panthor_heap.c    |  23 +-
> >  drivers/gpu/drm/panthor/panthor_heap.h    |   6 +-
> >  drivers/gpu/drm/panthor/panthor_mmu.c     |   8 +-
> >  drivers/gpu/drm/panthor/panthor_mmu.h     |   3 +-
> >  drivers/gpu/drm/panthor/panthor_sched.c   | 304 +++++++++++++++++++---
> >  include/drm/drm_file.h                    |   7 +-
> >  18 files changed, 522 insertions(+), 66 deletions(-)
> > 
> > 
> > base-commit: 310ec03841a36e3f45fb528f0dfdfe5b9e84b037

Adrian Larumbe



[Index of Archives]     [Linux DRI Users]     [Linux Intel Graphics]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux