On Sun, Dec 24, 2023 at 12:06:57AM -0500, Ethan Zhao wrote: > --- a/drivers/iommu/intel/dmar.c > +++ b/drivers/iommu/intel/dmar.c > @@ -1423,6 +1423,13 @@ int qi_submit_sync(struct intel_iommu *iommu, struct qi_desc *desc, > writel(qi->free_head << shift, iommu->reg + DMAR_IQT_REG); > > while (qi->desc_status[wait_index] != QI_DONE) { > + /* > + * if the devTLB invalidation target device is gone, don't wait > + * anymore, it might take up to 1min+50%, causes system hang. > + */ > + if (type == QI_DIOTLB_TYPE && iommu->flush_target_dev) > + if (!pci_device_is_present(to_pci_dev(iommu->flush_target_dev))) > + break; As a general approach, this is much better now. Please combine the nested if-clauses into one. Please amend the code comment with a spec reference, i.e. "(see Implementation Note in PCIe r6.1 sec 10.3.1)" so that readers of the code know where the magic number "1min+50%" is coming from. Is flush_target_dev guaranteed to always be a pci_dev? I'll let iommu maintainers comment on whether storing a flush_target_dev pointer is the right approach. (May store a back pointer from struct intel_iommu to struct device_domain_info?) Maybe move the "to_pci_dev(iommu->flush_target_dev)" lookup outside the loop to avoid doing this over and over again? I think we still have a problem here if the device is not removed but simply takes a long time to respond to Invalidate Requests (as it is permitted to do per the Implementation Note). We'll busy-wait for the completion and potentially run into the watchdog's time limit again. So I think you or someone else in your org should add OKRs to refactor the code so that it sleeps in-between polling for Invalidate Completions (instead of busy-waiting with interrupts disabled). Thanks, Lukas