Did that patch cause any issue, or is it just not needed on your system?
It fixes a hypothetical problem with the way ATS is implemented.
Maybe I actually observed it on an old software model, I don't
remember. Either way it's unlikely to go upstream but I'd like to know
if I should drop it from my tree.
I had to revert the same patch, "mm: notify remote TLBs when dirtying a PTE",
to avoid the crash below [1]. I am not sure about the cause yet.
I noticed this issue earlier with the patch pointed to here, and root-caused it as below.
It happens after a vfio_mmap request from QEMU for the PCIe device, when the VA is accessed and
the PTE access flags are updated.
kvm_mmu_notifier_change_pte() --> kvm_set_spte_hve() --> kvm_set_spte_hva() --> clean_dcache_guest_page()
The validation model doesn't support the FWB capability, so the dcache clean is not skipped.
__clean_dcache_guest_page() attempts to flush the dcache for a PCIe BAR address (for which pfn_valid() is false)
through page_address(). That address has no page table mapping, which leads to the exception.
I worked around the issue by filtering out the request in __clean_dcache_guest_page() if the pfn is not valid.
As the patch wasn't posted to the community, I reverted it as well.
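
For reference, the shape of the workaround can be sketched as a small user-space model. This is not the actual kernel change: every model_* name, the RAM pfn range, and the flush counter are hypothetical stand-ins for pfn_valid(), the FWB capability check, and the real cache maintenance, used only to show where the early return sits.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's kvm_pfn_t. */
typedef uint64_t kvm_pfn_t;

/* Assumed RAM range for this model; pfns outside it have no struct page. */
#define MODEL_RAM_START_PFN 0x80000ULL
#define MODEL_RAM_END_PFN   0x100000ULL

/* Counts how many flushes the model performed. */
static unsigned long model_flush_count;

/* Models pfn_valid(): true only for pfns backed by RAM. */
static bool model_pfn_valid(kvm_pfn_t pfn)
{
	return pfn >= MODEL_RAM_START_PFN && pfn < MODEL_RAM_END_PFN;
}

/* Models the FWB check; the validation model has no FWB. */
static bool model_has_fwb(void)
{
	return false;
}

/* Models the dcache clean to PoC; here it only counts calls. */
static void model_flush_dcache_to_poc(kvm_pfn_t pfn, size_t size)
{
	(void)pfn;
	(void)size;
	model_flush_count++;
}

/*
 * Models __clean_dcache_guest_page() with the workaround applied.
 * Returns true if a flush was issued.
 */
static bool model_clean_dcache_guest_page(kvm_pfn_t pfn, size_t size)
{
	/* Workaround: a PCIe BAR pfn has no linear mapping, so skip it. */
	if (!model_pfn_valid(pfn))
		return false;

	/* Without FWB the page must be cleaned to PoC. */
	if (!model_has_fwb()) {
		model_flush_dcache_to_poc(pfn, size);
		return true;
	}

	return false;
}
```

In this model a RAM pfn is still flushed, while a pfn outside the assumed RAM range (such as a BAR address) returns before anything is dereferenced.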
Thank you, Krishna, for sharing the analysis.
Best Regards,
Sumit Gupta