Hi, This is continuation of the first and second submission of Intel Xe GPU debug support: v1: https://lists.freedesktop.org/archives/intel-xe/2024-July/043605.html v2: https://lists.freedesktop.org/archives/intel-xe/2024-October/052260.html New features in v3: - EXEC_QUEUE_PLACEMENT events providing detailed information about engines participating on exec queue. (Dominik Grzegorzek) - EU thread page fault support (Gwan-gyeong Mun) - Fixed access to VRAM backed storage (Matthew Brost) Essential for BMG enabling. This work was already merged into xe driver and eudebug takes advantage of that (ttm_bo_access). [8]. - Support for Pantherlake (Dominik Grzegorzek) v3 supports: - Lunarlake (LNL) - Battlemage (BMG) - Pantherlake (PTL) Thanks to all contributors! Latest code can be found in: [1] https://gitlab.freedesktop.org/miku/kernel/-/tree/eudebug-dev Branch for this submission: [2] https://gitlab.freedesktop.org/miku/kernel/-/tree/eudebug-v3 README/instructions: [3] https://gitlab.freedesktop.org/miku/kernel IGT tests (needs config switch 'xe_eudebug' to be set) [4] https://gitlab.freedesktop.org/drm/igt-gpu-tools.git https://gitlab.freedesktop.org/gfx-ci/i915-infra/-/blob/master/kconfig/debug.kconfig The user for this uapi: [5] https://github.com/intel/compute-runtime Event loop and thread control interaction can be found at: https://github.com/intel/compute-runtime/tree/master/level_zero/tools/source/debug/linux/xe And the wrappers in: https://github.com/intel/compute-runtime/tree/master/shared/source/os_interface/linux/xe https://github.com/intel/compute-runtime/blob/master/shared/source/os_interface/linux/xe/ioctl_helper_xe_debugger.cpp Note that the XE support is disabled by default and you will need NEO_ENABLE_XE_EU_DEBUG_SUPPORT enabled in order to test. GDB support: [6] https://github.com/intel/gdb/tree/upstream/intelgt-mvp [7] https://github.com/intel/gdb/tree/upstream/intelgt-mvp-plus GDB is preparing their own mailing list submission with above and based on v3. I will reply to this cover letter and update README when it happens. [8]: https://lists.freedesktop.org/archives/intel-xe/2024-November/060247.html Fix non-contiguous VRAM BO access in Xe Thanks, Mika Andrzej Hajda (2): drm/xe: add system memory page iterator support to xe_res_cursor drm/xe/eudebug: implement userptr_vma access Christoph Manszewski (3): drm/xe/eudebug: Add vm bind and vm bind ops drm/xe/eudebug: Dynamically toggle debugger functionality drm/xe/eudebug_test: Introduce xe_eudebug wa kunit test Dominik Grzegorzek (11): drm/xe/eudebug: Introduce exec_queue events drm/xe/eudebug: Introduce exec queue placements event drm/xe/eudebug: hw enablement for eudebug drm/xe: Add EUDEBUG_ENABLE exec queue property drm/xe/eudebug: Introduce per device attention scan worker drm/xe/eudebug: Introduce EU control interface drm/xe: Debug metadata create/destroy ioctls drm/xe: Attach debug metadata to vma drm/xe/eudebug: Add debug metadata support for xe_eudebug drm/xe/eudebug/ptl: Add support for extra attention register drm/xe/eudebug/ptl: Add RCU_DEBUG_1 register support for xe3 Gwan-gyeong Mun (4): drm/xe/eudebug: Add read/count/compare helper for eu attention drm/xe/eudebug: Introduce EU pagefault handling interface drm/xe/vm: Support for adding null page VMA to VM on request drm/xe/eudebug: Enable EU pagefault handling Mika Kuoppala (6): ptrace: export ptrace_may_access drm/xe/eudebug: Introduce eudebug support drm/xe/eudebug: Introduce discovery for resources drm/xe/eudebug: Add UFENCE events with acks drm/xe/eudebug: vm open/pread/pwrite drm/xe/eudebug: Implement vm_bind_op discovery drivers/gpu/drm/xe/Kconfig | 10 + drivers/gpu/drm/xe/Makefile | 4 + drivers/gpu/drm/xe/regs/xe_engine_regs.h | 7 + drivers/gpu/drm/xe/regs/xe_gt_regs.h | 43 + drivers/gpu/drm/xe/tests/xe_eudebug.c | 176 + drivers/gpu/drm/xe/tests/xe_live_test_mod.c | 5 + drivers/gpu/drm/xe/xe_debug_metadata.c | 233 + drivers/gpu/drm/xe/xe_debug_metadata.h | 102 + drivers/gpu/drm/xe/xe_debug_metadata_types.h | 25 + drivers/gpu/drm/xe/xe_device.c | 25 +- drivers/gpu/drm/xe/xe_device.h | 36 + drivers/gpu/drm/xe/xe_device_types.h | 54 + drivers/gpu/drm/xe/xe_eudebug.c | 4451 ++++++++++++++++++ drivers/gpu/drm/xe/xe_eudebug.h | 128 + drivers/gpu/drm/xe/xe_eudebug_types.h | 448 ++ drivers/gpu/drm/xe/xe_exec.c | 2 +- drivers/gpu/drm/xe/xe_exec_queue.c | 56 +- drivers/gpu/drm/xe/xe_exec_queue.h | 2 + drivers/gpu/drm/xe/xe_exec_queue_types.h | 7 + drivers/gpu/drm/xe/xe_execlist.c | 2 +- drivers/gpu/drm/xe/xe_gt_debug.c | 212 + drivers/gpu/drm/xe/xe_gt_debug.h | 46 + drivers/gpu/drm/xe/xe_gt_pagefault.c | 87 +- drivers/gpu/drm/xe/xe_gt_pagefault.h | 2 + drivers/gpu/drm/xe/xe_hw_engine.c | 1 + drivers/gpu/drm/xe/xe_lrc.c | 16 +- drivers/gpu/drm/xe/xe_lrc.h | 4 +- drivers/gpu/drm/xe/xe_oa.c | 3 +- drivers/gpu/drm/xe/xe_reg_sr.c | 21 +- drivers/gpu/drm/xe/xe_reg_sr.h | 4 +- drivers/gpu/drm/xe/xe_res_cursor.h | 51 +- drivers/gpu/drm/xe/xe_rtp.c | 2 +- drivers/gpu/drm/xe/xe_sync.c | 45 +- drivers/gpu/drm/xe/xe_sync.h | 8 +- drivers/gpu/drm/xe/xe_sync_types.h | 28 +- drivers/gpu/drm/xe/xe_vm.c | 196 +- drivers/gpu/drm/xe/xe_vm.h | 5 + drivers/gpu/drm/xe/xe_vm_types.h | 40 + drivers/gpu/drm/xe/xe_wa_oob.rules | 2 + include/uapi/drm/xe_drm.h | 96 +- include/uapi/drm/xe_drm_eudebug.h | 256 + kernel/ptrace.c | 1 + 42 files changed, 6869 insertions(+), 73 deletions(-) create mode 100644 drivers/gpu/drm/xe/tests/xe_eudebug.c create mode 100644 drivers/gpu/drm/xe/xe_debug_metadata.c create mode 100644 drivers/gpu/drm/xe/xe_debug_metadata.h create mode 100644 drivers/gpu/drm/xe/xe_debug_metadata_types.h create mode 100644 drivers/gpu/drm/xe/xe_eudebug.c create mode 100644 drivers/gpu/drm/xe/xe_eudebug.h create mode 100644 drivers/gpu/drm/xe/xe_eudebug_types.h create mode 100644 drivers/gpu/drm/xe/xe_gt_debug.c create mode 100644 drivers/gpu/drm/xe/xe_gt_debug.h create mode 100644 include/uapi/drm/xe_drm_eudebug.h -- 2.43.0