On Wed, Apr 26, 2023 at 04:56:59PM -0400, Rodrigo Vivi wrote:
> Xe needs to align with other drivers on the way that the error states are
> dumped, avoiding a Xe only error_state solution. The goal is to use devcoredump
> infrastructure to report error states, since it produces a standardized way
> by exposing a virtual and temporary /sys/class/devcoredump device.
>
> The initial goal is to have the simple_error_state in the devcoredump
> so we start using the infrastructure.
>
> But this is just a start point to start building a useful and
> organized crash dump, using standard infrastructure. Later this
> will be changed to have output that can be parsed by tools and
> used for error replay.

We are certainly missing the GuC log, and it would also be really nice to
get the ftrace included too. Not sure if the latter is easy; I know I
looked into this on the i915 and couldn't figure it out, but this was a
while ago and I admittedly didn't try all that hard.

Matt

>
> Later, when we are in-tree, the goal is to collaborate with devcoredump
> infrastructure with overall possible improvements, like multiple file support
> for better organization of the dumps, snapshot support, dmesg extra print,
> and whatever may make sense and help the overall infrastructure.
>
> Cc: Daniel Vetter <daniel.vetter@xxxxxxxx>
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@xxxxxxxxx>
>
> Rodrigo Vivi (14):
>   drm/xe: Fix print of RING_EXECLIST_SQ_CONTENTS_HI
>   drm/xe: Introduce the dev_coredump infrastructure.
>   drm/xe: Do not take any action if our device was removed.
>   drm/xe: Extract non mapped regions out of GuC CTB into its own struct.
>   drm/xe: Convert GuC CT print to snapshot capture and print.
>   drm/xe: Add GuC CT snapshot to xe_devcoredump.
>   drm/xe: Introduce guc_submit_types.h with relevant structs.
>   drm/xe: Convert GuC Engine print to snapshot capture and print.
>   drm/xe: Add GuC Submit Engine snapshot to xe_devcoredump.
>   drm/xe: Convert Xe HW Engine print to snapshot capture and print.
>   drm/xe: Add HW Engine snapshot to xe_devcoredump.
>   drm/xe: Limit CONFIG_DRM_XE_SIMPLE_ERROR_CAPTURE to itself.
>   drm/xe: Convert VM print to snapshot capture and print.
>   drm/xe: Add VM snapshot to xe_devcoredump.
>
>  drivers/gpu/drm/xe/Kconfig                |   1 +
>  drivers/gpu/drm/xe/Makefile               |   1 +
>  drivers/gpu/drm/xe/regs/xe_engine_regs.h  |   3 +-
>  drivers/gpu/drm/xe/xe_devcoredump.c       | 227 ++++++++++++++++++
>  drivers/gpu/drm/xe/xe_devcoredump.h       |  22 ++
>  drivers/gpu/drm/xe/xe_devcoredump_types.h |  60 +++++
>  drivers/gpu/drm/xe/xe_device_types.h      |   4 +
>  drivers/gpu/drm/xe/xe_execlist.c          |   4 +-
>  drivers/gpu/drm/xe/xe_gt_debugfs.c        |   2 +-
>  drivers/gpu/drm/xe/xe_guc_ct.c            | 275 +++++++++++++++-------
>  drivers/gpu/drm/xe/xe_guc_ct.h            |   7 +-
>  drivers/gpu/drm/xe/xe_guc_ct_types.h      |  46 +++-
>  drivers/gpu/drm/xe/xe_guc_fwif.h          |  29 ---
>  drivers/gpu/drm/xe/xe_guc_submit.c        | 258 ++++++++++++++------
>  drivers/gpu/drm/xe/xe_guc_submit.h        |  10 +-
>  drivers/gpu/drm/xe/xe_guc_submit_types.h  | 155 ++++++++++++
>  drivers/gpu/drm/xe/xe_hw_engine.c         | 210 ++++++++++++-----
>  drivers/gpu/drm/xe/xe_hw_engine.h         |   8 +-
>  drivers/gpu/drm/xe/xe_hw_engine_types.h   |  78 ++++++
>  drivers/gpu/drm/xe/xe_pci.c               |   2 +
>  drivers/gpu/drm/xe/xe_vm.c                | 140 +++++++++--
>  drivers/gpu/drm/xe/xe_vm.h                |   6 +-
>  drivers/gpu/drm/xe/xe_vm_types.h          |  18 ++
>  23 files changed, 1288 insertions(+), 278 deletions(-)
>  create mode 100644 drivers/gpu/drm/xe/xe_devcoredump.c
>  create mode 100644 drivers/gpu/drm/xe/xe_devcoredump.h
>  create mode 100644 drivers/gpu/drm/xe/xe_devcoredump_types.h
>  create mode 100644 drivers/gpu/drm/xe/xe_guc_submit_types.h
>
> --
> 2.39.2