On 2014/1/8 14:14, Kai Huang wrote:
> On Wed, Jan 8, 2014 at 2:01 PM, Jiang Liu <jiang.liu@xxxxxxxxxxxxxxx> wrote:
>>
>>
>> On 2014/1/8 13:07, Kai Huang wrote:
>>> On Tue, Jan 7, 2014 at 5:00 PM, Jiang Liu <jiang.liu@xxxxxxxxxxxxxxx> wrote:
>>>> If static identity domain is created, IOMMU driver needs to update
>>>> si_domain page table when memory hotplug event happens. Otherwise
>>>> PCI device DMA operations can't access the hot-added memory regions.
>>>>
>>>> Signed-off-by: Jiang Liu <jiang.liu@xxxxxxxxxxxxxxx>
>>>> ---
>>>>  drivers/iommu/intel-iommu.c |   52 ++++++++++++++++++++++++++++++++++++++++++-
>>>>  1 file changed, 51 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
>>>> index 83e3ed4..35a987d 100644
>>>> --- a/drivers/iommu/intel-iommu.c
>>>> +++ b/drivers/iommu/intel-iommu.c
>>>> @@ -33,6 +33,7 @@
>>>>  #include <linux/dmar.h>
>>>>  #include <linux/dma-mapping.h>
>>>>  #include <linux/mempool.h>
>>>> +#include <linux/memory.h>
>>>>  #include <linux/timer.h>
>>>>  #include <linux/iova.h>
>>>>  #include <linux/iommu.h>
>>>> @@ -3689,6 +3690,54 @@ static struct notifier_block device_nb = {
>>>>  	.notifier_call = device_notifier,
>>>>  };
>>>>
>>>> +static int intel_iommu_memory_notifier(struct notifier_block *nb,
>>>> +				       unsigned long val, void *v)
>>>> +{
>>>> +	struct memory_notify *mhp = v;
>>>> +	unsigned long long start, end;
>>>> +	struct iova *iova;
>>>> +
>>>> +	switch (val) {
>>>> +	case MEM_GOING_ONLINE:
>>>> +		start = mhp->start_pfn << PAGE_SHIFT;
>>>> +		end = ((mhp->start_pfn + mhp->nr_pages) << PAGE_SHIFT) - 1;
>>>> +		if (iommu_domain_identity_map(si_domain, start, end)) {
>>>> +			pr_warn("dmar: failed to build identity map for [%llx-%llx]\n",
>>>> +				start, end);
>>>> +			return NOTIFY_BAD;
>>>> +		}
>>>
>>> Better to use iommu_prepare_identity_map? For si_domain, if
>>> hw_pass_through is used, there's no page table.
>> Hi Kai,
>>         Good catch!
>> It seems iommu_prepare_identity_map() is designed to handle
>> RMRRs. So how about not registering the memory hotplug notifier
>> if hw_pass_through is true?
>
> I think that's also fine :)
>
> Btw, I have a question related to memory hotplug but not to the
> Intel IOMMU specifically. For devices that use DMA remapping, suppose
> the device is already using the memory that we are trying to remove.
> In this case, it looks like we need to change the existing iova <-> pa
> mappings for any pa in the memory range about to be removed, and
> reset each mapping to a different pa (the iova remains the same). Does
> the existing code have this covered? Is there a generic IOMMU-layer
> memory hotplug notifier to handle memory removal?
That's a big issue about how to reclaim memory in use. The current rule
is that memory used by DMA won't be removed until it is released.

>
> -Kai
>>
>> Thanks!
>> Gerry
>>
>>>
>>>> +		break;
>>>> +	case MEM_OFFLINE:
>>>> +	case MEM_CANCEL_ONLINE:
>>>> +		/* TODO: enhance RB-tree and IOVA code to support of splitting iova */
>>>> +		iova = find_iova(&si_domain->iovad, mhp->start_pfn);
>>>> +		if (iova) {
>>>> +			unsigned long start_pfn, last_pfn;
>>>> +			struct dmar_drhd_unit *drhd;
>>>> +			struct intel_iommu *iommu;
>>>> +
>>>> +			start_pfn = mm_to_dma_pfn(iova->pfn_lo);
>>>> +			last_pfn = mm_to_dma_pfn(iova->pfn_hi + 1) - 1;
>>>> +			dma_pte_clear_range(si_domain, start_pfn, last_pfn);
>>>> +			dma_pte_free_pagetable(si_domain, start_pfn, last_pfn);
>>>> +			rcu_read_lock();
>>>> +			for_each_active_iommu(iommu, drhd)
>>>> +				iommu_flush_iotlb_psi(iommu, si_domain->id,
>>>> +					start_pfn, last_pfn - start_pfn + 1, 0);
>>>> +			rcu_read_unlock();
>>>> +			__free_iova(&si_domain->iovad, iova);
>>>> +		}
>>>
>>> The same as above. It looks like we need to consider hw_pass_through
>>> for the si_domain.
>>>
>>> -Kai
>>>
>>>> +		break;
>>>> +	}
>>>> +
>>>> +	return NOTIFY_OK;
>>>> +}
>>>> +
>>>> +static struct notifier_block intel_iommu_memory_nb = {
>>>> +	.notifier_call = intel_iommu_memory_notifier,
>>>> +	.priority = 0
>>>> +};
>>>> +
>>>>  int __init intel_iommu_init(void)
>>>>  {
>>>>  	int ret = -ENODEV;
>>>> @@ -3761,8 +3810,9 @@ int __init intel_iommu_init(void)
>>>>  	init_iommu_pm_ops();
>>>>
>>>>  	bus_set_iommu(&pci_bus_type, &intel_iommu_ops);
>>>> -
>>>>  	bus_register_notifier(&pci_bus_type, &device_nb);
>>>> +	if (si_domain)
>>>> +		register_memory_notifier(&intel_iommu_memory_nb);
>>>>
>>>>  	intel_iommu_enabled = 1;
>>>>
>>>> --
>>>> 1.7.10.4
>>>>
>>>> _______________________________________________
>>>> iommu mailing list
>>>> iommu@xxxxxxxxxxxxxxxxxxxxxxxxxx
>>>> https://lists.linuxfoundation.org/mailman/listinfo/iommu
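
For reference, the two points discussed in this thread can be sketched in plain userspace C: the byte range that the MEM_GOING_ONLINE case derives from start_pfn and nr_pages, and the proposed rule of skipping the memory hotplug notifier when hw_pass_through is true. This is only a minimal sketch, not kernel code. The 4 KiB page shift, the struct, and every sketch_* name are assumptions made for illustration and do not exist in intel-iommu.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, as on x86. */
#define SKETCH_PAGE_SHIFT 12

/* Hypothetical stand-in for the start_pfn/nr_pages fields of struct memory_notify. */
struct sketch_mem_range {
	uint64_t start_pfn;
	uint64_t nr_pages;
};

/* First byte address covered by the hot-added range. */
static inline uint64_t sketch_range_start(const struct sketch_mem_range *r)
{
	return r->start_pfn << SKETCH_PAGE_SHIFT;
}

/* Last byte address covered by the range (inclusive), like "end" in the patch. */
static inline uint64_t sketch_range_end(const struct sketch_mem_range *r)
{
	assert(r->nr_pages > 0);	/* an empty range has no last byte */
	return ((r->start_pfn + r->nr_pages) << SKETCH_PAGE_SHIFT) - 1;
}

/*
 * Proposed rule from the discussion: the notifier is only useful when a
 * static identity domain exists and it is backed by page tables, which is
 * not the case when hardware pass-through is in use.
 */
static inline bool sketch_need_memory_notifier(bool have_si_domain,
					       bool hw_pass_through)
{
	return have_si_domain && !hw_pass_through;
}
```

With these assumptions, a range starting at pfn 0x100 and spanning 0x10 pages maps to [0x100000-0x10ffff], and sketch_need_memory_notifier(true, true) returns false, so no notifier would be registered in the pass-through case.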