Hi,

On 10/15/19 11:11 PM, Yuri Volchkov wrote:
For health monitoring, it can be useful to know if iommu is behaving as
expected. DMAR faults can be an indicator that a device:
 - has been misconfigured, or
 - has experienced a hardware hiccup and replacement should
   be considered, or
 - has been issuing faults due to malicious activity

Currently the only way to check if there were any DMAR faults on the
host is to scan the dmesg output. However this approach is not very
elegant. The information we are looking for can be wrapped out of the
buffer, or masked (since it is a rate-limited print) by another
device.

The series adds counters for DMAR faults and exposes them via sysfs.

We now have an IOMMU API named iommu_register_fault_handler() for
registering callbacks for DMAR faults. How about monitoring DMAR
faults through this API, so that your code can be generic and
vendor-agnostic?

Best regards,
Baolu

Yuri Volchkov (2):
  iommu/dmar: collect fault statistics
  iommu/dmar: catch early fault occurrences

 drivers/iommu/dmar.c        | 182 ++++++++++++++++++++++++++++++++----
 drivers/iommu/intel-iommu.c |   1 +
 drivers/pci/pci-sysfs.c     |  20 ++++
 include/linux/intel-iommu.h |   4 +
 include/linux/pci.h         |  11 +++
 5 files changed, 201 insertions(+), 17 deletions(-)
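
For context, the dmesg-scanning approach that the cover letter calls
inelegant can be sketched roughly as below. This is only an
illustration, not part of the series: the message layout in FAULT_RE
is an assumption about what DMAR fault lines look like (it differs
between kernel versions), the SAMPLE lines are made up, and
count_dmar_faults() is a hypothetical helper name.

```python
import re
from collections import Counter

# ASSUMPTION: approximate shape of a DMAR fault line in the kernel log.
# Real output varies between kernel versions; adjust the pattern as needed.
FAULT_RE = re.compile(
    r"DMAR: \[(?P<kind>DMA Read|DMA Write)\] Request device "
    r"\[(?P<dev>[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7])\] "
    r"fault addr (?P<addr>[0-9a-fA-F]+)"
)


def count_dmar_faults(lines):
    """Count fault lines per device (bus:dev.fn) in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = FAULT_RE.search(line)
        if match:
            counts[match.group("dev")] += 1
    return counts


# Made-up sample lines, for illustration only.
SAMPLE = [
    "[   12.345678] DMAR: [DMA Write] Request device [03:00.0] "
    "fault addr ffbc0000 [fault reason 05] PTE Write access is not set",
    "[   12.345900] DMAR: [DMA Read] Request device [03:00.0] "
    "fault addr ffbc1000 [fault reason 06] PTE Read access is not set",
    "[   13.000001] DMAR: [DMA Read] Request device [00:02.0] "
    "fault addr 1000 [fault reason 06] PTE Read access is not set",
    "[   13.500000] usb 1-1: new high-speed USB device number 2",
]

counts = count_dmar_faults(SAMPLE)
for dev, n in sorted(counts.items()):
    print(dev, n)
```

On a live host the input would come from the kernel log instead of
SAMPLE. Such a scan can only count the lines that are still in the
buffer and were not suppressed by rate limiting, which is the
limitation the cover letter describes and the reason for proposing
counters.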