On 24/06/2024 12:23, Adrián Larumbe wrote:
> Hi Steven,
>
> On 13.06.2024 16:28, Steven Price wrote:
>> On 06/06/2024 01:49, Adrián Larumbe wrote:
>>> This patch series enables userspace utilities like gputop and nvtop to
>>> query a render context's fdinfo file and figure out rates of engine
>>> and memory utilisation.
>>>
>>> Previous discussion can be found at
>>> https://lore.kernel.org/dri-devel/20240423213240.91412-1-adrian.larumbe@xxxxxxxxxxxxx/
>>>
>>> Changelog:
>>> v3:
>>>  - Fixed some nits and removed useless bounds check in panthor_sched.c
>>>  - Added support for sysfs profiling knob and optional job accounting
>>>  - Added new patches for calculating size of internal BO's
>>> v2:
>>>  - Split original first patch in two, one for FW CS cycle and timestamp
>>>    calculations and job accounting memory management, and a second one
>>>    that enables fdinfo.
>>>  - Moved NUM_INSTRS_PER_SLOT to the file prelude
>>>  - Removed nelem variable from the group's struct definition.
>>>  - Precompute size of group's syncobj BO to avoid code duplication.
>>>  - Some minor nits.
>>>
>>>
>>> Adrián Larumbe (7):
>>>   drm/panthor: introduce job cycle and timestamp accounting
>>>   drm/panthor: add DRM fdinfo support
>>>   drm/panthor: enable fdinfo for memory stats
>>>   drm/panthor: add sysfs knob for enabling job profiling
>>>   drm/panthor: support job accounting
>>>   drm/drm_file: add display of driver's internal memory size
>>>   drm/panthor: register size of internal objects through fdinfo
>>
>> The general shape of what you end up with looks correct, but these
>> patches are now in a bit of a mess. It's confusing to review when the
>> accounting is added unconditionally and then a sysfs knob is added which
>> changes it all to be conditional. Equally that last patch (register size
>> of internal objects through fdinfo) includes a massive amount of churn
>> moving everything into an 'fdinfo' struct which really should be in a
>> separate patch.
>
> I do agree with you in that perhaps too many things change across successive
> patches in the series. I think I can explain this because of the way the series
> has evolved through successive revisions.
>
> In the last one of them, only the first three patches were present, and both
> Liviu and Boris seemed happy with the shape they had taken, but then Boris
> suggested adding the sysfs knob and optional profiling support rather than
> submitting them as part of a different series like I had done in Panfrost. In
> that spirit, I decided to keep the first three patches intact.
>
> The last two patches are a bit more of an afterthought, and because they touch
> on the drm fdinfo core, I understood they were more likely to be rejected for
> now, at least until consensus with Tvrtko and other people involved in the
> development of fdinfo had agreed on a way to report internal bo sizes. However,
> being also part of fdinfo, I thought this series was a good place to spark a
> debate about them, even if they don't seem as seamlessly linked with the rest
> of the work.
>
>> Ideally this needs to be reworked into a logical series of patches with
>> knowledge of what's coming next. E.g. the first patch could introduce
>> the code for cycle/timestamp accounting but leave it disabled to be then
>> enabled by the sysfs knob patch.
>>
>> One thing I did notice though is that I wasn't seeing the GPU frequency
>> change, looking more closely at this it seems like there's something
>> dodgy going on with the devfreq code. From what I can make out I often
>> end up in a situation where all contexts are idle every time tick_work()
>> is called - I think this is simply because tick_work() is scheduled with
>> a delay and by the time the delay has hit the work is complete. Nothing
>> to do with this series, but something that needs looking into. I'm on
>> holiday for a week but I'll try to look at this when I'm back.
>
> Would you mind sharing what you do in UM to trigger this behaviour and also
> maybe the debug traces you've written into the driver to confirm this?

Debugging is tricky as adding a printk() completely changes the timing. My
hack was just to record the count of calls to
panthor_devfreq_record_{busy,idle}() and output that along with the debug
message. See below.

With that change I could see that when running glmark I was seeing a number
of calls to idle(), but rarely any calls to busy(). This obviously causes
devfreq to sit at the lowest possible frequency.

A possible fix is as simple as:

----8<----
diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
index 79ffcbc41d78..42929e147107 100644
--- a/drivers/gpu/drm/panthor/panthor_sched.c
+++ b/drivers/gpu/drm/panthor/panthor_sched.c
@@ -2926,6 +2926,7 @@ queue_run_job(struct drm_sched_job *sched_job)
 			pm_runtime_get(ptdev->base.dev);
 			sched->pm.has_ref = true;
 		}
+		panthor_devfreq_record_busy(sched->ptdev);
 	}
 
 	done_fence = dma_fence_get(job->done_fence);
----8<----

With that I see roughly as many calls to busy() as idle() and devfreq
scales the GPU frequency up as expected.

Steve

----8<----
diff --git a/drivers/gpu/drm/panthor/panthor_devfreq.c b/drivers/gpu/drm/panthor/panthor_devfreq.c
index c6d3c327cc24..bfc06e58fff5 100644
--- a/drivers/gpu/drm/panthor/panthor_devfreq.c
+++ b/drivers/gpu/drm/panthor/panthor_devfreq.c
@@ -31,6 +31,8 @@ struct panthor_devfreq {
 	/** @time_last_update: Last update time. */
 	ktime_t time_last_update;
 
+	int counta, counti;
+
 	/**
 	 * @last_busy_state: True if the GPU was busy last time we updated the
 	 * state.
 	 */
 	bool last_busy_state;
@@ -76,6 +78,8 @@ static void panthor_devfreq_reset(struct panthor_devfreq *pdevfreq)
 {
 	pdevfreq->busy_time = 0;
 	pdevfreq->idle_time = 0;
+	pdevfreq->counta = 0;
+	pdevfreq->counti = 0;
 	pdevfreq->time_last_update = ktime_get();
 }
@@ -97,14 +101,17 @@ static int panthor_devfreq_get_dev_status(struct device *dev,
 	status->busy_time = ktime_to_ns(pdevfreq->busy_time);
 
+	int counta = pdevfreq->counta;
+	int counti = pdevfreq->counti;
+
 	panthor_devfreq_reset(pdevfreq);
 
 	spin_unlock_irqrestore(&pdevfreq->lock, irqflags);
 
-	drm_dbg(&ptdev->base, "busy %lu total %lu %lu %% freq %lu MHz\n",
+	printk("busy %lu total %lu %lu %% freq %lu MHz count=%da,%di\n",
 	       status->busy_time, status->total_time,
 	       status->busy_time / (status->total_time / 100),
-	       status->current_frequency / 1000 / 1000);
+	       status->current_frequency / 1000 / 1000, counta,counti);
 
 	return 0;
 }
@@ -262,6 +269,7 @@ void panthor_devfreq_record_busy(struct panthor_device *ptdev)
 	panthor_devfreq_update_utilization(pdevfreq);
 	pdevfreq->last_busy_state = true;
+	pdevfreq->counta++;
 
 	spin_unlock_irqrestore(&pdevfreq->lock, irqflags);
 }
@@ -278,6 +286,7 @@ void panthor_devfreq_record_idle(struct panthor_device *ptdev)
 	panthor_devfreq_update_utilization(pdevfreq);
 	pdevfreq->last_busy_state = false;
+	pdevfreq->counti++;
 
 	spin_unlock_irqrestore(&pdevfreq->lock, irqflags);
 }