Hi Dave and Sima,

Early drm-xe-next pull request for 6.12. The main reason for being much
earlier than usual is to have the additional SIMD16 EU reported, as it's
a needed UAPI for Lunar Lake and Battlemage. It has been sitting in
drm-xe-next for a few weeks and userspace is already testing with it.

There's also a minor uapi change to an error return code. The other 2
UAPI changes already propagated via -fixes.

Other changes bring general improvements and cleanups to the driver,
further support for SR-IOV, and move Lunar Lake and Battlemage closer to
the finish line of being officially exposed by the driver. Some bits are
still in flux, so we are not there yet.

thanks
Lucas De Marchi

drm-xe-next-2024-07-30:
drm-xe-next for 6.12

UAPI Changes:
- Rename xe perf layer as xe observation layer; this was also made
  available via fixes to the previous version (Ashutosh)
- Use write-back caching mode for system memory on DGFX; this was also
  made available via fixes to the previous version (Thomas)
- Expose SIMD16 EU mask in topology query for userspace to know
  the type of EU, as available in PVC, Lunar Lake and Battlemage
  (Lucas)
- Return ENOBUFS instead of ENOMEM in vm_bind if failure is tied
  to an array of binds (Matthew Brost)

Driver Changes:
- Log cleanup moving messages to debug priority (Michal Wajdeczko)
- Add timeout to fences to adhere to dma_buf rules (Matthew Brost)
- Rename old engine nomenclature to exec_queue (Matthew Brost)
- Convert multiple bind ops to 1 job (Matthew Brost)
- Add error injection for vm bind to help testing error path
  (Matthew Brost)
- Fix error handling in page table to propagate correctly
  to userspace (Matthew Brost)
- Re-organize and cleanup SR-IOV related registers (Michal Wajdeczko)
- Make the device write barrier compatible with VF (Michal Wajdeczko)
- New display workarounds for Battlemage (Matthew Auld)
- New media workarounds for Lunar Lake and Battlemage (Ngai-Mint Kwan)
- New graphics workarounds for Lunar Lake (Bommu Krishnaiah)
- Tracepoint updates (Matthew Brost, Nirmoy Das)
- Cleanup the header generation for OOB workarounds (Lucas De Marchi)
- Fix leaking HDCP-related object (Nirmoy Das)
- Serialize L2 flushes to avoid races (Tejas Upadhyay)
- Log pid and comm on job timeout (José Roberto de Souza)
- Simplify boilerplate code for live kunit (Michal Wajdeczko)
- Improve kunit skips for live kunit (Michal Wajdeczko)
- Fix xe_sync cleanup when handling xe_exec ioctl (Ashutosh Dixit)
- Limit fair VF LMEM provisioning (Michal Wajdeczko)
- New workaround to fence mmio writes in Lunar Lake (Tejas Upadhyay)
- Warn on writes to inaccessible registers in VF (Michal Wajdeczko)
- Fix register lookup in VF (Michal Wajdeczko)
- Add GSC support for Battlemage (Alexander Usyskin)
- Fix wedging only the GT in which the timeout occurred; wedge the
  entire device instead (Matthew Brost)
- Block device suspend when wedging (Matthew Brost)
- Handle compression and migration changes for Battlemage
  (Akshata Jahagirdar)
- Limit access of stolen memory for Lunar Lake (Uma Shankar)
- Fail invalid addresses during user fence creation (Matthew Brost)
- Refcount xe_file to safely and accurately store fdinfo stats
  (Umesh Nerlige Ramappa)
- Cleanup and fix PM reference for TLB invalidation code
  (Matthew Brost)
- Fix PM reference handling when communicating with GuC (Matthew Brost)
- Add new BO flag for 2 MiB alignment and use in VF (Michal Wajdeczko)
- Simplify MMIO setup for multi-tile platforms (Lucas De Marchi)
- Add check for uninitialized access to OOB workarounds
  (Lucas De Marchi)
- New GSC and HuC firmware blobs for Lunar Lake and Battlemage
  (Daniele Ceraolo Spurio)
- Unify mmio wait logic (Gustavo Sousa)
- Fix off-by-one when processing RTP rules (Lucas De Marchi)
- Future-proof migrate logic with compressed PAT flag (Matt Roper)
- Add WA kunit tests for Battlemage (Lucas De Marchi)
- Test active tracking for workarounds with kunit (Lucas De Marchi)
- Add kunit tests for RTP with no actions (Lucas De Marchi)
- Unify parse of OR rules in RTP (Lucas De Marchi)
- Add performance tuning for Battlemage (Sai Teja Pottumuttu)
- Make bit masks unsigned (Geert Uytterhoeven)

The following changes since commit aaa08078e7251131f045ba248a68671db7f7bdf7:

  drm/xe/bmg: Apply Wa_22019338487 (2024-07-02 12:14:00 -0400)

are available in the Git repository at:

  https://gitlab.freedesktop.org/drm/xe/kernel.git tags/drm-xe-next-2024-07-30

for you to fetch changes up to f2881dfdaaa9ec873dbd383ef5512fc31e576cbb:

  drm/xe/oa/uapi: Make bit masks unsigned (2024-07-30 13:45:38 -0700)

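As a rough illustration of the SIMD16 EU mask item under UAPI Changes, the sketch below shows how userspace might count EUs per DSS from a topology bitmask and note which kind of mask it came from. Everything here is an assumption made for the example: the `example_*` names, the enum values and the simplified struct are hypothetical and do not reproduce the definitions in include/uapi/drm/xe_drm.h, and preferring the SIMD16 mask when both are present is only one possible policy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical mask types, for illustration only. The real values live in
 * include/uapi/drm/xe_drm.h and are not reproduced here.
 */
enum example_topo_type {
	EXAMPLE_TOPO_EU_PER_DSS = 1,
	EXAMPLE_TOPO_SIMD16_EU_PER_DSS = 2,
};

/* Hypothetical, simplified view of one mask returned by a topology query. */
struct example_topo_mask {
	enum example_topo_type type;
	size_t num_bytes;
	const uint8_t *mask;
};

/* Count the bits set in a byte array (one bit per enabled unit). */
static unsigned int count_enabled(const uint8_t *mask, size_t num_bytes)
{
	unsigned int n = 0;
	size_t i;

	assert(mask != NULL || num_bytes == 0);

	for (i = 0; i < num_bytes; i++) {
		uint8_t b = mask[i];

		while (b) {
			b &= (uint8_t)(b - 1);	/* clear the lowest set bit */
			n++;
		}
	}
	return n;
}

/*
 * Return the number of EUs per DSS, preferring the SIMD16 mask when the
 * query reported one. *is_simd16 tells the caller which mask was used.
 * Returns -1 if neither mask type is present.
 */
static int eus_per_dss(const struct example_topo_mask *masks, size_t count,
		       int *is_simd16)
{
	const struct example_topo_mask *fallback = NULL;
	size_t i;

	for (i = 0; i < count; i++) {
		if (masks[i].type == EXAMPLE_TOPO_SIMD16_EU_PER_DSS) {
			*is_simd16 = 1;
			return (int)count_enabled(masks[i].mask,
						  masks[i].num_bytes);
		}
		if (masks[i].type == EXAMPLE_TOPO_EU_PER_DSS && !fallback)
			fallback = &masks[i];
	}
	if (!fallback)
		return -1;

	*is_simd16 = 0;
	return (int)count_enabled(fallback->mask, fallback->num_bytes);
}
```

A caller would fill the array from whatever the topology query returned and then branch on `is_simd16` to pick the right per-EU assumptions.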
----------------------------------------------------------------
Akshata Jahagirdar (7):
      drm/xe/migrate: Handle clear ccs logic for xe2 dgfx
      drm/xe/migrate: Add kunit to test clear functionality
      drm/xe/migrate: Add helper function to program identity map
      drm/xe/xe2: Introduce identity map for compressed pat for vram
      drm/xe/xe_migrate: Handle migration logic for xe2+ dgfx
      drm/xe/migrate: Add kunit to test migration functionality for BMG
      drm/xe/xe2: Do not run xe_bo_test for xe2+ dgfx

Alexander Usyskin (1):
      drm/xe/gsc: add Battlemage support

Ashutosh Dixit (2):
      drm/xe/uapi: Rename xe perf layer as xe observation layer
      drm/xe/exec: Fix minor bug related to xe_sync_entry_cleanup

Bommu Krishnaiah (1):
      drm/xe/xe2lpg: Extend workaround 14021402888

Daniele Ceraolo Spurio (3):
      drm/xe/huc: Define HuC binary for LNL
      drm/xe/gsc: Define GSC binary for LNL
      drm/xe/huc: Define HuC binary for BMG

Geert Uytterhoeven (1):
      drm/xe/oa/uapi: Make bit masks unsigned

Gustavo Sousa (2):
      drm/xe: Remove stale declaration of xe_mmio_probe_vram()
      drm/xe/mmio: Use single logic for waiting functions

Himal Prasad Ghimiray (1):
      drm/xe: Delete unused register from xe_regs.h

José Roberto de Souza (1):
      drm/xe: Add process name and PID to job timedout message

Lucas De Marchi (15):
      drm/xe/gt: Remove double include
      drm/xe: Generate oob before compiling anything
      drm/xe/uapi: Expose SIMD16 EU mask in topology query
      drm/xe: Fix warning on unreachable statement
      drm/xe: Refactor mmio setup for multi-tile
      drm/xe: Add assert for XE_WA() usage
      drm/xe/rtp: Fix off-by-one when processing rules
      drm/xe/kunit: Test WAs for BMG
      drm/xe/kunit: Rename count to count_sr_entries
      drm/xe/kunit: Test active rtp entries
      drm/xe/kunit: Rename rtp test cases
      drm/xe/kunit: Test rtp with no actions
      drm/xe/rtp: Simplify marking active workarounds
      drm/xe/rtp: Expand max rules/actions per entry again
      drm/xe: Migrate OOB WAs to OR rules

Matt Roper (1):
      drm/xe/migrate: Future-proof compressed PAT check

Matthew Auld (2):
      drm/xe/bmg: implement Wa_16023588340
      drm/i915: disable fbc due to Wa_16023588340

Matthew Brost (23):
      drm/xe: Add timeout to preempt fences
      drm/xe: s/xe_tile_migrate_engine/xe_tile_migrate_exec_queue
      drm/xe: Add xe_vm_pgtable_update_op to xe_vma_ops
      drm/xe: Add xe_exec_queue_last_fence_test_dep
      drm/xe: Convert multiple bind ops into single job
      drm/xe: Update VM trace events
      drm/xe: Update PT layer with better error handling
      drm/xe: Add VM bind IOCTL error injection
      drm/xe: Drop trace_xe_hw_fence_free
      drm/xe: Wedge the entire device
      drm/xe: Don't suspend device upon wedge
      drm/xe: Validate user fence during creation
      drm/xe: Remove unused xe_sync_entry_wait
      drm/xe: Add xe_gt_tlb_invalidation_fence_init helper
      drm/xe: Drop xe_gt_tlb_invalidation_wait
      drm/xe: Hold a PM ref when GT TLB invalidations are inflight
      drm/xe: Build PM into GuC CT layer
      drm/xe: Fix xe_pt_abort_unbind
      drm/xe: Return -ENOBUFS if a kmalloc fails which is tied to an array of binds
      drm/xe: Store process name and pid in xe file
      drm/xe: Remove fence check from send_tlb_invalidation
      drm/xe: Fix possible UAF in guc_exec_queue_process_msg
      drm/xe: Assert G2H outstanding when releasing G2H

Michal Wajdeczko (22):
      drm/xe/guc: Demote GuC IDs usage message to debug
      drm/xe: Fix register definition order in xe_regs.h
      drm/xe: Kill regs/xe_sriov_regs.h
      drm/xe: Use VF_CAP_REG for device wmb
      drm/xe/kunit: Kill xe_cur_kunit()
      drm/xe/kunit: Drop XE_TEST_EXPORT
      drm/xe/kunit: Simplify xe_bo live tests code layout
      drm/xe/kunit: Simplify xe_dma_buf live tests code layout
      drm/xe/kunit: Simplify xe_migrate live tests code layout
      drm/xe/kunit: Simplify xe_mocs live tests code layout
      drm/xe/pf: Limit fair VF LMEM provisioning
      drm/xe/vf: Track writes to inaccessible registers from VF
      drm/xe/vf: Fix register value lookup
      drm/xe: Introduce const cast helper
      drm/xe/tests: Add helpers for use in live tests
      drm/xe/tests: Convert xe_bo live tests
      drm/xe/tests: Convert xe_dma_buf live tests
      drm/xe/tests: Convert xe_migrate live tests
      drm/xe/tests: Convert xe_mocs live tests
      drm/xe/tests: Skip xe_mocs live tests on VF device
      drm/xe: Normalize NEEDS_64K BO flag
      drm/xe: Add NEEDS_2M BO flag

Ngai-Mint Kwan (1):
      drm/xe/xe2lpm: Extend Wa_16021639441

Nirmoy Das (2):
      drm/xe/display/xe_hdcp_gsc: Free arbiter on driver removal
      drm/xe/pm: Add trace for pm functions

Ohad Sharabi (1):
      drm/xe/oa: Don't use hardcoded values

Sai Teja Pottumuttu (1):
      drm/xe/xe2hpg: Introduce performance tuning changes for Xe2_HPG

Tejas Upadhyay (2):
      drm/xe/xe2: Make subsequent L2 flush sequential
      drm/xe/xe2: Add Wa_15015404425

Thomas Hellström (1):
      drm/xe: Use write-back caching mode for system memory on DGFX

Uma Shankar (1):
      drm/xe/fbdev: Limit the usage of stolen for LNL+

Umesh Nerlige Ramappa (4):
      drm/xe: Move part of xe_file cleanup to a helper
      drm/xe: Add ref counting for xe_file
      drm/xe: Take a ref to xe file when user creates a VM
      drm/xe: Fix use after free when client stats are captured

 drivers/gpu/drm/i915/display/intel_display_wa.h | 8 +
 drivers/gpu/drm/i915/display/intel_fbc.c | 6 +
 drivers/gpu/drm/xe/Makefile | 25 +-
 drivers/gpu/drm/xe/display/intel_fbdev_fb.c | 6 +-
 drivers/gpu/drm/xe/display/xe_display_wa.c | 16 +
 drivers/gpu/drm/xe/display/xe_dsb_buffer.c | 8 +
 drivers/gpu/drm/xe/display/xe_fb_pin.c | 3 +
 drivers/gpu/drm/xe/display/xe_hdcp_gsc.c | 12 +-
 drivers/gpu/drm/xe/display/xe_plane_initial.c | 6 +
 drivers/gpu/drm/xe/regs/xe_gt_regs.h | 15 +
 drivers/gpu/drm/xe/regs/xe_regs.h | 12 +-
 drivers/gpu/drm/xe/regs/xe_sriov_regs.h | 23 -
 drivers/gpu/drm/xe/tests/Makefile | 6 +-
 drivers/gpu/drm/xe/tests/xe_bo.c | 45 +-
 drivers/gpu/drm/xe/tests/xe_bo_test.c | 21 -
 drivers/gpu/drm/xe/tests/xe_bo_test.h | 14 -
 drivers/gpu/drm/xe/tests/xe_dma_buf.c | 26 +-
 drivers/gpu/drm/xe/tests/xe_dma_buf_test.c | 20 -
 drivers/gpu/drm/xe/tests/xe_dma_buf_test.h | 13 -
 drivers/gpu/drm/xe/tests/xe_kunit_helpers.c | 39 +
 drivers/gpu/drm/xe/tests/xe_kunit_helpers.h | 2 +
 drivers/gpu/drm/xe/tests/xe_live_test_mod.c | 11 +
 drivers/gpu/drm/xe/tests/xe_migrate.c | 424 ++++++-
 drivers/gpu/drm/xe/tests/xe_migrate_test.c | 20 -
 drivers/gpu/drm/xe/tests/xe_migrate_test.h | 13 -
 drivers/gpu/drm/xe/tests/xe_mocs.c | 44 +-
 drivers/gpu/drm/xe/tests/xe_mocs_test.c | 21 -
 drivers/gpu/drm/xe/tests/xe_mocs_test.h | 14 -
 drivers/gpu/drm/xe/tests/xe_pci.c | 30 +
 drivers/gpu/drm/xe/tests/xe_pci_test.c | 4 +-
 drivers/gpu/drm/xe/tests/xe_pci_test.h | 2 +
 drivers/gpu/drm/xe/tests/xe_rtp_test.c | 219 +++-
 drivers/gpu/drm/xe/tests/xe_test.h | 10 +-
 drivers/gpu/drm/xe/tests/xe_wa_test.c | 1 +
 drivers/gpu/drm/xe/xe_bo.c | 58 +-
 drivers/gpu/drm/xe/xe_bo.h | 5 +-
 drivers/gpu/drm/xe/xe_bo_types.h | 5 +-
 drivers/gpu/drm/xe/xe_devcoredump.c | 10 +-
 drivers/gpu/drm/xe/xe_device.c | 135 ++-
 drivers/gpu/drm/xe/xe_device.h | 9 +
 drivers/gpu/drm/xe/xe_device_types.h | 32 +-
 drivers/gpu/drm/xe/xe_drm_client.c | 5 +-
 drivers/gpu/drm/xe/xe_exec.c | 14 +-
 drivers/gpu/drm/xe/xe_exec_queue.c | 33 +-
 drivers/gpu/drm/xe/xe_exec_queue.h | 2 +
 drivers/gpu/drm/xe/xe_exec_queue_types.h | 13 +-
 drivers/gpu/drm/xe/xe_execlist.c | 3 +-
 drivers/gpu/drm/xe/xe_gen_wa_oob.c | 16 +-
 drivers/gpu/drm/xe/xe_gt.c | 69 ++
 drivers/gpu/drm/xe/xe_gt.h | 1 +
 drivers/gpu/drm/xe/xe_gt_sriov_pf.c | 2 +-
 drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c | 2 +
 drivers/gpu/drm/xe/xe_gt_sriov_vf.c | 28 +-
 drivers/gpu/drm/xe/xe_gt_sriov_vf.h | 1 +
 drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c | 205 ++--
 drivers/gpu/drm/xe/xe_gt_tlb_invalidation.h | 12 +-
 drivers/gpu/drm/xe/xe_gt_tlb_invalidation_types.h | 4 +
 drivers/gpu/drm/xe/xe_gt_topology.c | 27 +-
 drivers/gpu/drm/xe/xe_gt_types.h | 27 +-
 drivers/gpu/drm/xe/xe_guc.c | 16 +
 drivers/gpu/drm/xe/xe_guc.h | 1 +
 drivers/gpu/drm/xe/xe_guc_ct.c | 11 +-
 drivers/gpu/drm/xe/xe_guc_id_mgr.c | 4 +-
 drivers/gpu/drm/xe/xe_guc_submit.c | 94 +-
 drivers/gpu/drm/xe/xe_guc_submit.h | 1 +
 drivers/gpu/drm/xe/xe_heci_gsc.c | 28 +-
 drivers/gpu/drm/xe/xe_heci_gsc.h | 10 +-
 drivers/gpu/drm/xe/xe_hw_fence.c | 1 -
 drivers/gpu/drm/xe/xe_irq.c | 2 +
 drivers/gpu/drm/xe/xe_lmtt.c | 4 +-
 drivers/gpu/drm/xe/xe_migrate.c | 528 +++++----
 drivers/gpu/drm/xe/xe_migrate.h | 34 +-
 drivers/gpu/drm/xe/xe_mmio.c | 231 ++--
 drivers/gpu/drm/xe/xe_mmio.h | 1 -
 drivers/gpu/drm/xe/xe_module.c | 6 +-
 drivers/gpu/drm/xe/xe_oa.c | 36 +-
 drivers/gpu/drm/xe/xe_observation.c | 93 ++
 drivers/gpu/drm/xe/xe_observation.h | 20 +
 drivers/gpu/drm/xe/xe_pat.c | 11 +-
 drivers/gpu/drm/xe/xe_pci.c | 7 +-
 drivers/gpu/drm/xe/xe_perf.c | 92 --
 drivers/gpu/drm/xe/xe_perf.h | 20 -
 drivers/gpu/drm/xe/xe_pm.c | 8 +
 drivers/gpu/drm/xe/xe_preempt_fence.c | 12 +-
 drivers/gpu/drm/xe/xe_pt.c | 1310 +++++++++++++--------
 drivers/gpu/drm/xe/xe_pt.h | 14 +-
 drivers/gpu/drm/xe/xe_pt_types.h | 48 +
 drivers/gpu/drm/xe/xe_query.c | 4 +-
 drivers/gpu/drm/xe/xe_rtp.c | 42 +-
 drivers/gpu/drm/xe/xe_rtp.h | 4 +-
 drivers/gpu/drm/xe/xe_rtp_helpers.h | 6 +
 drivers/gpu/drm/xe/xe_sa.c | 7 +
 drivers/gpu/drm/xe/xe_sriov.c | 2 +-
 drivers/gpu/drm/xe/xe_sync.c | 20 +-
 drivers/gpu/drm/xe/xe_sync.h | 1 -
 drivers/gpu/drm/xe/xe_trace.h | 57 +-
 drivers/gpu/drm/xe/xe_trace_bo.h | 10 +-
 drivers/gpu/drm/xe/xe_tuning.c | 8 +
 drivers/gpu/drm/xe/xe_uc.c | 14 +
 drivers/gpu/drm/xe/xe_uc.h | 1 +
 drivers/gpu/drm/xe/xe_uc_fw.c | 3 +
 drivers/gpu/drm/xe/xe_vm.c | 699 +++++------
 drivers/gpu/drm/xe/xe_vm.h | 2 +
 drivers/gpu/drm/xe/xe_vm_types.h | 55 +-
 drivers/gpu/drm/xe/xe_wa.c | 15 +
 drivers/gpu/drm/xe/xe_wa.h | 7 +-
 drivers/gpu/drm/xe/xe_wa_oob.rules | 2 +
 include/uapi/drm/xe_drm.h | 128 +-
 108 files changed, 3610 insertions(+), 1977 deletions(-)
 create mode 100644 drivers/gpu/drm/xe/display/xe_display_wa.c
 delete mode 100644 drivers/gpu/drm/xe/regs/xe_sriov_regs.h
 delete mode 100644 drivers/gpu/drm/xe/tests/xe_bo_test.c
 delete mode 100644 drivers/gpu/drm/xe/tests/xe_bo_test.h
 delete mode 100644 drivers/gpu/drm/xe/tests/xe_dma_buf_test.c
 delete mode 100644 drivers/gpu/drm/xe/tests/xe_dma_buf_test.h
 delete mode 100644 drivers/gpu/drm/xe/tests/xe_migrate_test.c
 delete mode 100644 drivers/gpu/drm/xe/tests/xe_migrate_test.h
 delete mode 100644 drivers/gpu/drm/xe/tests/xe_mocs_test.c
 delete mode 100644 drivers/gpu/drm/xe/tests/xe_mocs_test.h
 create mode 100644 drivers/gpu/drm/xe/xe_observation.c
 create mode 100644 drivers/gpu/drm/xe/xe_observation.h
 delete mode 100644 drivers/gpu/drm/xe/xe_perf.c
 delete mode 100644 drivers/gpu/drm/xe/xe_perf.h
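
For the vm_bind error-code change listed under UAPI Changes, one plausible userspace reaction to ENOBUFS is to retry an array of binds as a series of single binds. The sketch below only illustrates that idea and is not taken from the series: `struct bind_op`, `submit_fn`, `bind_with_fallback` and the `fake_*` test double are hypothetical names, and the submit callback stands in for whatever wrapper a given userspace driver has around the vm_bind ioctl.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for one bind operation; not the real UAPI struct. */
struct bind_op {
	unsigned long long addr;
	unsigned long long range;
};

/*
 * Hypothetical submit callback: returns 0 on success or a negative errno,
 * as a thin wrapper around the vm_bind ioctl might.
 */
typedef int (*submit_fn)(const struct bind_op *ops, size_t count, void *ctx);

/*
 * Submit all binds at once. If that fails with -ENOBUFS, fall back to
 * submitting each bind individually. Any other error is returned unchanged.
 */
static int bind_with_fallback(const struct bind_op *ops, size_t count,
			      submit_fn submit, void *ctx)
{
	size_t i;
	int ret;

	assert(submit != NULL);

	ret = submit(ops, count, ctx);
	if (ret != -ENOBUFS || count <= 1)
		return ret;

	for (i = 0; i < count; i++) {
		ret = submit(&ops[i], 1, ctx);
		if (ret)
			return ret;
	}
	return 0;
}

/* Test double: rejects any array larger than max_batch with -ENOBUFS. */
struct fake_kernel {
	size_t max_batch;
	size_t calls;
	size_t bound;
};

static int fake_submit(const struct bind_op *ops, size_t count, void *ctx)
{
	struct fake_kernel *k = ctx;

	(void)ops;
	k->calls++;
	if (count > k->max_batch)
		return -ENOBUFS;
	k->bound += count;
	return 0;
}
```

Keeping ENOMEM and ENOBUFS on separate paths lets the caller reserve the split-and-retry behavior for the array case only, which is the distinction the new return code makes possible.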