This patch series addresses a critical issue in the VGIC ITS tables' save/restore mechanism, accompanied by a comprehensive selftest for bug reproduction and verification. The identified bug manifests as a failure in VM suspend/resume operations. The root cause lies in the repeated suspend attempts often required for successful VM suspension, coupled with concurrent device interrupt registration and freeing. This concurrency leads to inconsistencies in ITS mappings before the save operation, potentially leaving orphaned Device Translation Entries (DTEs) and Interrupt Translation Entries (ITEs) in the respective tables. During the subsequent restore operation, encountering these orphaned entries can result in two error scenarios: * EINVAL Error: If an orphaned entry lacks a corresponding collection ID, the restore operation fails with an EINVAL error. * Mapping Corruption: If an orphaned entry possesses a valid collection ID, the restore operation may succeed but with incorrect or lost mappings, compromising system integrity. The provided selftest facilitates the reproduction of both error scenarios: * EINVAL Reproduction: Execute ./vgic_its_tables without any options. * Mapping Corruption Reproduction: Execute ./vgic_its_tables -s The -s option enforces identical collection IDs for all mappings. * A workaround within the selftest involves clearing the tables before the save operation using the command ./vgic_its_tables -c. With this, we can run the the selftest successfully on host w/o the fix. The core issue stems from the static linked list implementation of DTEs/ITEs, requiring a full table scan to locate the list head during restoration. This scan increases the likelihood of encountering orphaned entries. To rectify this, the patch series introduces a dummy head to the list, enabling immediate access to the list head and bypassing the scan. This optimization not only resolves the bug but also significantly enhances restore performance, particularly in edge cases where valid entries reside at the end of the table. Result from the test demonstrates a remarkable 1000x performance improvement in such edge cases. For instance, with a single L2 device table (64KB) and 8192 mappings (one event per device at the table's end), the restore time is reduced from 6 seconds to 6 milliseconds. Importantly, these modifications maintain compatibility with the existing ITS TABLE ABI REV0. The table entry was a valid DTE/ITE, or an orphaned DTE/ITE, or an entry of 0. The dummy entry added in this patch series presents a fourth kind, which is an invalid entry w/ an offset field pointing to the first valid entry in the table. The dummy head entry is always the first entry in the table if it exists. An alternative solution, proposed in patch series [1], involves clearing DTEs/ITEs during MAPD/DISCARD commands. While this approach requires fewer code changes, it lacks the performance benefits offered by the dummy head solution presented in this patch series. --- * v1: - Based on v6.12-rc6 [1] https://lore.kernel.org/linux-arm-kernel/20240704142319.728-1-jiangkunkun@xxxxxxxxxx --- Jing Zhang (4): KVM: selftests: aarch64: Test VGIC ITS tables save/restore KVM: arm64: vgic-its: Add a dummy DTE/ITE if necessary in ITS tables save operation KVM: arm64: vgic-its: Return device/event id instead of offset in ITS tables restore KVM: arm64: vgic-its: Utilize the dummy entry in ITS tables restoring arch/arm64/kvm/vgic/vgic-its.c | 154 +++-- arch/arm64/kvm/vgic/vgic.h | 6 + tools/testing/selftests/kvm/Makefile | 1 + .../selftests/kvm/aarch64/vgic_its_tables.c | 562 ++++++++++++++++++ .../kvm/include/aarch64/gic_v3_its.h | 3 +- .../testing/selftests/kvm/include/kvm_util.h | 4 +- .../selftests/kvm/lib/aarch64/gic_v3_its.c | 24 +- 7 files changed, 713 insertions(+), 41 deletions(-) create mode 100644 tools/testing/selftests/kvm/aarch64/vgic_its_tables.c base-commit: 59b723cd2adbac2a34fc8e12c74ae26ae45bf230 -- 2.47.0.199.ga7371fff76-goog