On 24/12/2019 10:32, Alexey Kardashevskiy wrote:
>
>
> On 23/12/2019 22:18, Michael Ellerman wrote:
>> Alexey Kardashevskiy <aik@xxxxxxxxx> writes:
>>
>>> The last jump to free_exit in mm_iommu_do_alloc() happens after page
>>> pointers in struct mm_iommu_table_group_mem_t have already been
>>> converted to physical addresses. Thus calling put_page() on these
>>> physical addresses will likely crash.
>>>
>>> This moves the loop which calculates the pageshift and converts page
>>> struct pointers to physical addresses to a point after which we
>>> cannot fail, thus eliminating the need to convert the pointers back.
>>>
>>> Fixes: eb9d7a62c386 ("powerpc/mm_iommu: Fix potential deadlock")
>>> Reported-by: Jan Kara <jack@xxxxxxx>
>>> Signed-off-by: Alexey Kardashevskiy <aik@xxxxxxxxx>
>>> ---
>>> Changes:
>>> v3:
>>> * move pointer conversion after the last possible failure point
>>> ---
>>>  arch/powerpc/mm/book3s64/iommu_api.c | 39 +++++++++++++++-------------
>>>  1 file changed, 21 insertions(+), 18 deletions(-)
>>>
>>> diff --git a/arch/powerpc/mm/book3s64/iommu_api.c b/arch/powerpc/mm/book3s64/iommu_api.c
>>> index 56cc84520577..ef164851738b 100644
>>> --- a/arch/powerpc/mm/book3s64/iommu_api.c
>>> +++ b/arch/powerpc/mm/book3s64/iommu_api.c
>>> @@ -121,24 +121,6 @@ static long mm_iommu_do_alloc(struct mm_struct *mm, unsigned long ua,
>>>  		goto free_exit;
>>>  	}
>>>
>>> -	pageshift = PAGE_SHIFT;
>>> -	for (i = 0; i < entries; ++i) {
>>> -		struct page *page = mem->hpages[i];
>>> -
>>> -		/*
>>> -		 * Allow to use larger than 64k IOMMU pages. Only do that
>>> -		 * if we are backed by hugetlb.
>>> -		 */
>>> -		if ((mem->pageshift > PAGE_SHIFT) && PageHuge(page))
>>> -			pageshift = page_shift(compound_head(page));
>>> -		mem->pageshift = min(mem->pageshift, pageshift);
>>> -		/*
>>> -		 * We don't need struct page reference any more, switch
>>> -		 * to physical address.
>>> -		 */
>>> -		mem->hpas[i] = page_to_pfn(page) << PAGE_SHIFT;
>>> -	}
>>> -
>>>  good_exit:
>>>  	atomic64_set(&mem->mapped, 1);
>>>  	mem->used = 1;
>>> @@ -158,6 +140,27 @@ static long mm_iommu_do_alloc(struct mm_struct *mm, unsigned long ua,
>>>  		}
>>>  	}
>>>
>>> +	if (mem->dev_hpa == MM_IOMMU_TABLE_INVALID_HPA) {
>>
>> Couldn't you avoid testing this again ...
>>
>>> +		/*
>>> +		 * Allow to use larger than 64k IOMMU pages. Only do that
>>> +		 * if we are backed by hugetlb. Skip device memory as it is not
>>> +		 * backed with page structs.
>>> +		 */
>>> +		pageshift = PAGE_SHIFT;
>>> +		for (i = 0; i < entries; ++i) {
>>
>> ... by making this loop up to `pinned`.
>>
>> `pinned` is only incremented in the loop that does the GUP, and there's
>> a check that pinned == entries after that loop.
>>
>> So when we get here we know pinned == entries, and if pinned is zero
>> it's because we took the (dev_hpa != MM_IOMMU_TABLE_INVALID_HPA) case at
>> the start of the function to get here.
>>
>> Or do you think that's too subtle to rely on?
>
>
> I had 4 choices:
>
> 1. for (;i < pinned;)
>
> 2. if (dev_hpa == MM_IOMMU_TABLE_INVALID_HPA) (dev_hpa is a function
> parameter)
>
> 3. if (mem->dev_hpa == MM_IOMMU_TABLE_INVALID_HPA)
>
> 4. if (mem->hpages)
>
>
> The function is already ugly. 3) seemed the most obvious way of
> telling what is going on here: "we have just initialized @mem and it is
> not for device memory, let's finish the initialization".
>
> I could rearrange the code even more, but since there is no NVLink3
> ever coming, I'd avoid changing it more than necessary. Thanks,

Repost? Rework?
>
>
>>
>> cheers
>>
>>> +			struct page *page = mem->hpages[i];
>>> +
>>> +			if ((mem->pageshift > PAGE_SHIFT) && PageHuge(page))
>>> +				pageshift = page_shift(compound_head(page));
>>> +			mem->pageshift = min(mem->pageshift, pageshift);
>>> +			/*
>>> +			 * We don't need struct page reference any more, switch
>>> +			 * to physical address.
>>> +			 */
>>> +			mem->hpas[i] = page_to_pfn(page) << PAGE_SHIFT;
>>> +		}
>>> +	}
>>> +
>>>  	list_add_rcu(&mem->next, &mm->context.iommu_group_mem_list);
>>>
>>>  	mutex_unlock(&mem_list_mutex);
>>> --
>>> 2.17.1
>

--
Alexey
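
For reference, option 1 from the list above would look roughly like the
sketch below. This is a hypothetical variant, not the committed fix: it
reuses the loop body from the quoted patch and leans on the invariant
Michael describes (pinned == entries after a successful GUP, and
pinned == 0 in the device-memory case), so the explicit dev_hpa test
disappears.

	/*
	 * Hypothetical alternative: bound the loop by "pinned" instead of
	 * re-testing mem->dev_hpa. The GUP loop is the only place pinned
	 * is incremented, and the function already fails out unless
	 * pinned == entries; for device memory pinned stays 0, so the
	 * loop body never runs.
	 */
	pageshift = PAGE_SHIFT;
	for (i = 0; i < pinned; ++i) {
		struct page *page = mem->hpages[i];

		/*
		 * Allow IOMMU pages larger than 64k only when backed by
		 * hugetlb.
		 */
		if ((mem->pageshift > PAGE_SHIFT) && PageHuge(page))
			pageshift = page_shift(compound_head(page));
		mem->pageshift = min(mem->pageshift, pageshift);

		/*
		 * The struct page reference is no longer needed; store the
		 * physical address instead.
		 */
		mem->hpas[i] = page_to_pfn(page) << PAGE_SHIFT;
	}

The trade-off is exactly the one discussed in the thread: this version
drops a redundant check but relies on an implicit invariant, whereas the
posted v3 keeps a self-documenting dev_hpa test.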