Re: [PATCH v5 06/10] drm/i915/guc: Update GuC's log-buffer-state access for error capture.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Matt, thank you for taking the time to review the codes.
Answering your question inline below.


On Fri, 2022-02-04 at 10:19 -0800, Matthew Brost wrote:
> On Wed, Jan 26, 2022 at 02:48:18AM -0800, Alan Previn wrote:
> > GuC log buffer regions for debug-log-events, crash-dumps and
> > error-state-capture are all a single bo allocation that includes
> > the guc_log_buffer_state structures.
> > 
> > Since the error-capture region is accessed with high priority at non-
> > deterministic times (as part of gpu coredump) while the debug-log-event
> > region is populated and accessed with different priorities, timings and
> > consumers, let's split out separate locks for buffer-state accesses
> > of each region.
> > 
> > Also, ensure a global mapping is made up front for the entire bo
> > throughout GuC operation so that dynamic mapping and unmapping isn't
> > required for error capture log access if relay-logging isn't running.
> > 
> > Additionally, while here, make some readibility improvements:
> > 1. change previous function names with "capture_logs" to
> >    "copy_debug_logs" to help make the distinction clearer.
> > 2. Update the guc log region mapping comments to order them
> >    according to the enum definition as per the GuC interface.
> > 
> > Signed-off-by: Alan Previn <alan.previn.teres.alexis@xxxxxxxxx>
> > ---
> >  drivers/gpu/drm/i915/gt/uc/intel_guc.h        |   2 +
> >  .../gpu/drm/i915/gt/uc/intel_guc_capture.c    |  47 ++++++
> >  .../gpu/drm/i915/gt/uc/intel_guc_capture.h    |   1 +
> >  drivers/gpu/drm/i915/gt/uc/intel_guc_log.c    | 135 +++++++++++-------
> >  drivers/gpu/drm/i915/gt/uc/intel_guc_log.h    |  16 ++-
> >  5 files changed, 141 insertions(+), 60 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> > index 4e819853ec2e..be1ad7fa2bf8 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> > @@ -34,6 +34,8 @@ struct intel_guc {
> >  	struct intel_uc_fw fw;
> >  	/** @log: sub-structure containing GuC log related data and objects */
> >  	struct intel_guc_log log;
> > +	/** @log_state: states and locks for each subregion of GuC's log buffer */
> > +	struct intel_guc_log_stats log_state[GUC_MAX_LOG_BUFFER];
> 
> Why move this? This still probably should be sub-structure of
> intel_guc_log. Most of the access is from functions that pass in
> intel_guc_log, then retrieve the GuC object, only to access this new
> intel_guc_log_stats object. That layering seems wrong, if the argument
> to a function is intel_guc_log it should really try to access members
> within that object or below. Obv some exceptions exist but here it seems
> clear to me this is in the wrong place.
> 
So the reasoning i had was because because intel_guc_log module only managed
guc-relay-logging and guc-log-dumping for log-events but allocates the buffer
for 3 separate subregion/usages (i.e. log-events, crash-dump and error-capture).
That said, I did not want intel_guc_capture and relay-logging (or log-dumping)
to have to contend with the same lock because these two subregions are completely
independant of each other in terms of when they are being accessed and why.

However, after the redesign of rev5 (this rev), I now believe the intel_guc_capture
module does not require a lock because its only ever accessing its log
subregion in response to the ctb handler functions that run out of the same queue.
As we know intel_guc_capture reacts to G2H-error-capture-notification and G2H-context-reset
(that trickles into i915_gpu_coredump). At the point of i915_error_state dump,
intel_guc_capture module does not access the buffer - it merely dumps the already-parsed
and engine-dump-node (that was detached from error-capture's internal linked-list
and attached to the gpu_coredump structure in the second G2H above).

And of course, intel_guc_log only ever accesses the log-events subregion
and never the error-capture regions.

That said, i could revert the lock structure to the original and not have
intel_guc_capture using locks. What do you think?

...alan

> Another nit, I'd personally break this out into multiple patches.
> 
> e.g. 1 to rename relay log functions, 1 introducing intel_guc_log_stats
> + lock, and 1 adding intel_guc_capture_output_min_size_est. Maybe I'm
> missing another patch or two in there.
> 
> Not a blocker but it does make reviews easier.
> 
Will do.

> Matt
> 
> >  	/** @ct: the command transport communication channel */
> >  	struct intel_guc_ct ct;
> >  	/** @slpc: sub-structure containing SLPC related data and objects */
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c
> > index 70d2ee841289..e7f99d051636 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c
> > @@ -651,6 +651,53 @@ int intel_guc_capture_prep_lists(struct intel_guc *guc, struct guc_ads *blob, u3
> >  	return PAGE_ALIGN(alloc_size);
> >  }
> >  
> > 2.25.1
> > 





[Index of Archives]     [AMD Graphics]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux