-----Original Message----- From: Wajdeczko, Michal <Michal.Wajdeczko@xxxxxxxxx> Sent: Saturday, March 15, 2025 7:45 AM To: Cavitt, Jonathan <jonathan.cavitt@xxxxxxxxx>; intel-xe@xxxxxxxxxxxxxxxxxxxxx Cc: Gupta, saurabhg <saurabhg.gupta@xxxxxxxxx>; Zuo, Alex <alex.zuo@xxxxxxxxx>; joonas.lahtinen@xxxxxxxxxxxxxxx; Brost, Matthew <matthew.brost@xxxxxxxxx>; Zhang, Jianxun <jianxun.zhang@xxxxxxxxx>; Lin, Shuicheng <shuicheng.lin@xxxxxxxxx>; dri-devel@xxxxxxxxxxxxxxxxxxxxx; michal.mzorek@xxxxxxxxx Subject: Re: [PATCH v8 2/6] drm/xe/xe_gt_pagefault: Move pagefault struct to header > > On 14.03.2025 23:06, Cavitt, Jonathan wrote: > > -----Original Message----- > > From: Wajdeczko, Michal <Michal.Wajdeczko@xxxxxxxxx> > > Sent: Friday, March 14, 2025 10:02 AM > > To: Cavitt, Jonathan <jonathan.cavitt@xxxxxxxxx>; intel-xe@xxxxxxxxxxxxxxxxxxxxx > > Cc: Gupta, saurabhg <saurabhg.gupta@xxxxxxxxx>; Zuo, Alex <alex.zuo@xxxxxxxxx>; joonas.lahtinen@xxxxxxxxxxxxxxx; Brost, Matthew <matthew.brost@xxxxxxxxx>; Zhang, Jianxun <jianxun.zhang@xxxxxxxxx>; Lin, Shuicheng <shuicheng.lin@xxxxxxxxx>; dri-devel@xxxxxxxxxxxxxxxxxxxxx; michal.mzorek@xxxxxxxxx > > Subject: Re: [PATCH v8 2/6] drm/xe/xe_gt_pagefault: Move pagefault struct to header > >> > >> On 13.03.2025 19:34, Jonathan Cavitt wrote: > >>> Move the pagefault struct from xe_gt_pagefault.c to the > >>> xe_gt_pagefault_types.h header file, along with the associated enum values. > >>> > >>> v2: > >>> - Normalize names for common header (Matt Brost) > >>> > >>> v3: > >>> - s/Migrate/Move (Michal W) > >>> - s/xe_pagefault/xe_gt_pagefault (Michal W) > >>> - Create new header file, xe_gt_pagefault_types.h (Michal W) > >>> - Add kernel docs (Michal W) > >>> > >>> Signed-off-by: Jonathan Cavitt <jonathan.cavitt@xxxxxxxxx> > >>> Cc: Michal Wajdeczko <Michal.Wajdeczko@xxxxxxxxx> > >>> --- > >> > >> ... > >> > >>> diff --git a/drivers/gpu/drm/xe/xe_gt_pagefault.h b/drivers/gpu/drm/xe/xe_gt_pagefault.h > >>> index 839c065a5e4c..69b700c4915a 100644 > >>> --- a/drivers/gpu/drm/xe/xe_gt_pagefault.h > >>> +++ b/drivers/gpu/drm/xe/xe_gt_pagefault.h > >>> @@ -8,6 +8,8 @@ > >>> > >>> #include <linux/types.h> > >>> > >>> +#include "xe_gt_pagefault_types.h" > >> > >> it's not needed here, move it to .c > >> > >>> + > >>> struct xe_gt; > >>> struct xe_guc; > >>> > >>> diff --git a/drivers/gpu/drm/xe/xe_gt_pagefault_types.h b/drivers/gpu/drm/xe/xe_gt_pagefault_types.h > >>> new file mode 100644 > >>> index 000000000000..90b7085d4b8e > >>> --- /dev/null > >>> +++ b/drivers/gpu/drm/xe/xe_gt_pagefault_types.h > >>> @@ -0,0 +1,67 @@ > >>> +/* SPDX-License-Identifier: MIT */ > >>> +/* > >>> + * Copyright (c) 2022-2025 Intel Corporation > >>> + */ > >>> + > >>> +#ifndef _XE_GT_PAGEFAULT_TYPES_H_ > >>> +#define _XE_GT_PAGEFAULT_TYPES_H_ > >>> + > >> > >> don't forget to > >> > >> #include <linux/types.h> > > > > That explains why the kernel failed to compile on CI. It was compiling just > > fine locally, so that's why I missed this. > > > >> > >>> +/** > >>> + * struct xe_gt_pagefault - Structure of pagefaults returned by the > >>> + * pagefault handler > >>> + */ > >>> +struct xe_gt_pagefault { > >>> + /** @page_addr: faulted address of this pagefault */ > >>> + u64 page_addr; > >>> + /** @asid: ASID of this pagefault */ > >>> + u32 asid; > >>> + /** @pdata: PDATA of this pagefault */ > >>> + u16 pdata; > >>> + /** @vfid: VFID of this pagefault */ > >>> + u8 vfid; > >> > >> btw, IIRC the VFID from the descriptor will be zero'ed > >> does it make sense to keep it here? > > > > Is the argument that every time pf->vfid is accessed, it's guaranteed to be > > it's FIELD_GET(PFD_VFID, desc->dw2) guaranteed to be zero > > > zero? I can't counter that claim, but wouldn't it be safer to keep reporting > > the VFID in case we ever hit a case where it's no longer zero? > > it can't be non-zero, look at GuC ABI spec that says: > > ".. with the VF number being zero’d out in the descriptor." > > > > > Also, did we know it would always be zero when we were making the > > pagefault struct originally? If so, why did we include the vfid originally? > > dunno, ask authors, it was pushed part of the initial commit > dd08ebf6c352 that was *not* going through normal review cycle > > > > >> > >>> + /** > >>> + * @access_type: access type of this pagefault, as a value > >>> + * from xe_gt_pagefault_access_type > >>> + */ > >>> + u8 access_type; > >>> + /** > >>> + * @fault_type: fault type of this pagefault, as a value > >>> + * from xe_gt_pagefault_fault_type > >>> + */ > >>> + u8 fault_type; > >>> + /** @fault_level: fault level of this pagefault */ > >>> + u8 fault_level; > >>> + /** @engine_class: engine class this pagefault was reported on */ > >>> + u8 engine_class; > >>> + /** @engine_instance: engine instance this pagefault was reported on */ > >>> + u8 engine_instance; > >>> + /** @fault_unsuccessful: flag for if the pagefault recovered or not */ > >>> + u8 fault_unsuccessful; > >>> + /** @prefetch: unused */ > >>> + bool prefetch; > >>> + /** @trva_fault: is set if this is a TRTT fault */ > >>> + bool trva_fault; > >>> +}; > >>> + > >>> +/** > >>> + * enum xe_gt_pagefault_access_type - Access type reported to the xe_gt_pagefault > >>> + * struct. Saved to xe_gt_pagefault@access_type > >> > >> this seems to be copied from G2H descriptor as-is. > >> so shouldn't this be part of the GuC ABI? > >> or based on HW ABI if GuC is just a proxy > > > > What information should I be including in the kernel docs for these enums? > > are you asking about xe enums or GuC ABI defs/enums? > > for the xe enums, we can say this is what HW/GuC provides > for the HW defs, we can likely drop kernel-doc > we can just point to the Bspec in the commit msg Could you please provide the bspec number for the pagefault access and fault types? I tried searching for them on my end but wasn't able to find them. -Jonathan Cavitt > > > -Jonathan Cavitt > > > >> > >>> + */ > >>> +enum xe_gt_pagefault_access_type { > >>> + XE_GT_PAGEFAULT_ACCESS_TYPE_READ = 0, > >>> + XE_GT_PAGEFAULT_ACCESS_TYPE_WRITE = 1, > >>> + XE_GT_PAGEFAULT_ACCESS_TYPE_ATOMIC = 2, > >>> + XE_GT_PAGEFAULT_ACCESS_TYPE_RESERVED = 3, > >>> +}; > >>> + > >>> +/** > >>> + * enum xe_gt_pagefault_fault_type - Fault type reported to the xe_gt_pagefault > >>> + * struct. Saved to xe_gt_pagefault@fault_type > >> > >> ditto > >> > >>> + */ > >>> +enum xe_gt_pagefault_fault_type { > >>> + XE_GT_PAGEFAULT_TYPE_NOT_PRESENT = 0, > >>> + XE_GT_PAGEFAULT_TYPE_WRITE_ACCESS_VIOLATION = 1, > >>> + XE_GT_PAGEFAULT_TYPE_ATOMIC_ACCESS_VIOLATION = 2, > >>> +}; > >>> + > >>> +#endif > >> > >> > >