Re: [PATCH v2 1/3] drm/panthor: introduce job cycle and timestamp accounting

On Tue, 23 Apr 2024 22:32:34 +0100
Adrián Larumbe <adrian.larumbe@xxxxxxxxxxxxx> wrote:

> Enable calculations of job submission times in clock cycles and wall
> time. This is done by expanding the boilerplate command stream when running
> a job to include instructions that compute said times right before and after
> a user CS.
> 
> Those numbers are stored in the queue's group's sync objects BO, right
> after them. Because the queues in a group might have a different number of
> slots, one must keep track of the overall slot tally when reckoning the
> offset of a queue's time sample structs, one for each slot.
> 
> NUM_INSTRS_PER_SLOT had to be increased to 32 because of adding new FW
> instructions for storing and subtracting the cycle counter and timestamp
> register, and it must always remain a power of two.
> 
> This commit is done in preparation for enabling DRM fdinfo support in the
> Panthor driver, which depends on the numbers calculated herein.
> 
> Signed-off-by: Adrián Larumbe <adrian.larumbe@xxxxxxxxxxxxx>
> ---
>  drivers/gpu/drm/panthor/panthor_sched.c | 158 ++++++++++++++++++++----
>  1 file changed, 134 insertions(+), 24 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
> index b3a51a6de523..320dfa0388ba 100644
> --- a/drivers/gpu/drm/panthor/panthor_sched.c
> +++ b/drivers/gpu/drm/panthor/panthor_sched.c
> @@ -93,6 +93,9 @@
>  #define MIN_CSGS				3
>  #define MAX_CSG_PRIO				0xf
>  
> +#define NUM_INSTRS_PER_SLOT			32
> +#define SLOTSIZE				(NUM_INSTRS_PER_SLOT * sizeof(u64))

Given everyone agreed on the profiling sysfs knob for Panfrost, I'm
tempted to make the profiling optional here as well, so we can save
space on the CS ring buffers when profiling is disabled. This means
adjusting the 'credits' parameter we pass to drm_sched_job_init()
accordingly, with one credit counting for an instruction (or a block of
16 instructions to keep things naturally cache-line aligned). You'll
also need to change the 'credit_limit' passed to drm_sched_init().
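
To make the accounting concrete, here is a rough sketch of the arithmetic I
have in mind. It is standalone C rather than driver code, and the per-job
instruction counts (JOB_INSTRS_BASE, JOB_INSTRS_PROFILING) and all helper
names are made up for illustration; the real numbers depend on what the
final command stream looks like. It assumes one credit per block of 16
instructions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One credit covers a block of 16 64-bit instructions (128 bytes). */
#define INSTRS_PER_CREDIT	16u
#define INSTR_SIZE		sizeof(uint64_t)
#define CREDIT_SIZE		(INSTRS_PER_CREDIT * INSTR_SIZE)

/* Hypothetical instruction counts, for illustration only. */
#define JOB_INSTRS_BASE		16u	/* boilerplate without profiling */
#define JOB_INSTRS_PROFILING	12u	/* extra cycle/timestamp instructions */

/* Number of ring buffer instructions a job emits. */
static inline uint32_t job_instr_count(bool profiling)
{
	return JOB_INSTRS_BASE + (profiling ? JOB_INSTRS_PROFILING : 0);
}

/* Value for the 'credits' argument: instruction count rounded up to blocks. */
static inline uint32_t job_credits(bool profiling)
{
	return (job_instr_count(profiling) + INSTRS_PER_CREDIT - 1) /
	       INSTRS_PER_CREDIT;
}

/* Value for 'credit_limit': how many blocks fit in the ring buffer. */
static inline uint32_t ring_credit_limit(size_t ringbuf_size)
{
	/* The ring buffer is expected to be a whole number of blocks. */
	assert(ringbuf_size % CREDIT_SIZE == 0);
	return (uint32_t)(ringbuf_size / CREDIT_SIZE);
}
```

With these made-up counts, a job costs one credit when profiling is off and
two when it is on, so a ring buffer of a given size holds twice as many
in-flight jobs in the non-profiling case.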
>  




