On 2/23/2021 4:10 PM, Alex Williamson wrote: > On Tue, 23 Feb 2021 15:37:31 -0500 > Steven Sistare <steven.sistare@xxxxxxxxxx> wrote: > >> On 2/23/2021 12:45 PM, Alex Williamson wrote: >>> On Tue, 23 Feb 2021 08:56:36 -0500 >>> Steven Sistare <steven.sistare@xxxxxxxxxx> wrote: >>> >>>> On 2/22/2021 6:17 PM, Alex Williamson wrote: >>>>> On Mon, 22 Feb 2021 15:51:45 -0700 >>>>> Alex Williamson <alex.williamson@xxxxxxxxxx> wrote: >>>>> >>>>>> On Mon, 22 Feb 2021 17:10:43 +0300 >>>>>> Dan Carpenter <dan.carpenter@xxxxxxxxxx> wrote: >>>>>> >>>>>>> tree: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master >>>>>>> head: 37dfbfbdca66834bc0f64ec9b35e09ac6c8898da >>>>>>> commit: 0f53afa12baec8c00f5d1d6afb49325ada105253 [6931/12022] vfio/type1: unmap cleanup >>>>>> >>>>>> It's always the patches that claim no functional change... ;) >>>>>> >>>>>>> config: i386-randconfig-m021-20210222 (attached as .config) >>>>>>> compiler: gcc-9 (Debian 9.3.0-15) 9.3.0 >>>>>>> >>>>>>> If you fix the issue, kindly add following tag as appropriate >>>>>>> Reported-by: kernel test robot <lkp@xxxxxxxxx> >>>>>>> Reported-by: Dan Carpenter <dan.carpenter@xxxxxxxxxx> >>>>>>> >>>>>>> New smatch warnings: >>>>>>> drivers/vfio/vfio_iommu_type1.c:1093 vfio_dma_do_unmap() warn: impossible condition '(size > (~0)) => (0-u32max > u32max)' >>>>>>> >>>>>>> vim +1093 drivers/vfio/vfio_iommu_type1.c >>>>>>> >>>>>>> 73fa0d10d077d9 Alex Williamson 2012-07-31 1071 static int vfio_dma_do_unmap(struct vfio_iommu *iommu, >>>>>>> 331e33d2960c82 Kirti Wankhede 2020-05-29 1072 struct vfio_iommu_type1_dma_unmap *unmap, >>>>>>> 331e33d2960c82 Kirti Wankhede 2020-05-29 1073 struct vfio_bitmap *bitmap) >>>>>>> 73fa0d10d077d9 Alex Williamson 2012-07-31 1074 { >>>>>>> c086de818dd81c Kirti Wankhede 2016-11-17 1075 struct vfio_dma *dma, *dma_last = NULL; >>>>>>> 331e33d2960c82 Kirti Wankhede 2020-05-29 1076 size_t unmapped = 0, pgsize; >>>>>>> 0f53afa12baec8 Steve Sistare 2021-01-29 1077 int ret = -EINVAL, retries = 0; >>>>>>> 331e33d2960c82 Kirti Wankhede 2020-05-29 1078 unsigned long pgshift; >>>>>>> 0f53afa12baec8 Steve Sistare 2021-01-29 1079 dma_addr_t iova = unmap->iova; >>>>>>> 0f53afa12baec8 Steve Sistare 2021-01-29 1080 unsigned long size = unmap->size; >>>>>>> ^^^^^^^^^^^^^^^^^^ >>>>>>> >>>>>>> 73fa0d10d077d9 Alex Williamson 2012-07-31 1081 >>>>>>> cade075f265b25 Kirti Wankhede 2020-05-29 1082 mutex_lock(&iommu->lock); >>>>>>> cade075f265b25 Kirti Wankhede 2020-05-29 1083 >>>>>>> 331e33d2960c82 Kirti Wankhede 2020-05-29 1084 pgshift = __ffs(iommu->pgsize_bitmap); >>>>>>> 331e33d2960c82 Kirti Wankhede 2020-05-29 1085 pgsize = (size_t)1 << pgshift; >>>>>>> cade075f265b25 Kirti Wankhede 2020-05-29 1086 >>>>>>> 0f53afa12baec8 Steve Sistare 2021-01-29 1087 if (iova & (pgsize - 1)) >>>>>>> cade075f265b25 Kirti Wankhede 2020-05-29 1088 goto unlock; >>>>>>> cade075f265b25 Kirti Wankhede 2020-05-29 1089 >>>>>>> 0f53afa12baec8 Steve Sistare 2021-01-29 1090 if (!size || size & (pgsize - 1)) >>>>>>> cade075f265b25 Kirti Wankhede 2020-05-29 1091 goto unlock; >>>>>>> 73fa0d10d077d9 Alex Williamson 2012-07-31 1092 >>>>>>> 0f53afa12baec8 Steve Sistare 2021-01-29 @1093 if (iova + size - 1 < iova || size > SIZE_MAX) >>>>>>> >>>>>>> size is unsigned long and SIZE_MAX is ULONG_MAX so "size > SIZE_MAX" >>>>>>> does not make sense. >>>>>> >>>>>> I think it made sense before the above commit, where unmap->size is a >>>>>> __u64 and a user could provide a value that exceeds SIZE_MAX on ILP32. >>>>>> Seems like the fix is probably to use a size_t for the local variable >>>>>> and restore this test to compare (unmap->size > SIZE_MAX). Steve? >>>>> >>>>> Actually it seems like VFIO_DMA_UNMAP_FLAG_ALL doesn't work when >>>>> PHYS_ADDR_MAX != SIZE_MAX (ex. x86 PAE - I think). >>>> >>>> It seems like PAE causes problems even before VFIO_DMA_UNMAP_FLAG_ALL. >>> >>> This wouldn't surprise me, I don't know of any actual non-64bit users >>> and pure 32bit support was only lightly validated ages ago. >>> >>>> In the previous vfio_dma_do_unmap code, the u64 unmap->size would be >>>> truncated when passed to vfio_find_dma. >>> >>> We would have failed with -EINVAL before we get there due to this >>> SIZE_MAX test. I think the existing (previous) PAE interface is at >>> least self consistent; I see the mapping path also attempts to check >>> that casting map->size as size_t still matches the original value. >> >> Good point, and it also checks for vaddr and iova overflow and wrap: >> >> vfio_dma_do_map() >> if (map->size != size || map->vaddr != vaddr || map->iova != iova) >> return -EINVAL; >> if (iova + size - 1 < iova || vaddr + size - 1 < vaddr) { >> ret = -EINVAL; >> >> With that, I don't see a problem with PAE, for unmap-all or otherwise. >> We just need "u64 size" in vfio_dma_do_unmap to avoid the smatch warning. > > I'm not convinced. My understanding is that on PAE phys_addr_t is > 64-bit while size_t is 32-bit. dma_addr_t (iova above) seems to follow > phys_addr_t. That suggests to me that our {un}map.iova lives in a > 64-bit address space, but each mapping is limited to 32-bits. The OK, the "map->iova != iova" test does not help because dma_addr_t is 64-bit. My bad. So, I re-propose my fix for unmap-all from previous email. I am not keen on proposing a fix for the potential legacy bugs, vfio_find_dma() and its callers, if no one is reporting bugs and no one uses it with vfio. It has the potential for regression with no upside. - Steve > unmap-all logic only looks for a first entry to unmap in the > [0..SIZE_MAX] range. If an entry happens to exist there, we'll unmap > everything, but the user would have no requirement to have a mapping > within that range, their first mapping could exist at iova = (SIZE_MAX > + 1). So unmap-all would effectively need a special case to use > rb_first(), which mostly negates the reason we added > vfio_find_dma_first_node(). Right? Thanks, > > Alex > > >>>> For unmap, these fixes should suffice, and I would rather do this than >>>> disable the unmap-all flag for a corner case: >>>> >>>> vfio_dma_do_unmap() >>>> size_t unmapped = 0; >>>> unsigned long size = unmap->size; >>>> ==> >>>> u64 unmapped = 0; >>>> u64 size = unmap->size; >>>> >>>> static struct rb_node *vfio_find_dma_first_node( >>>> struct vfio_iommu *iommu, dma_addr_t start, size_t size) >>>> ==> >>>> static struct rb_node *vfio_find_dma_first_node( >>>> struct vfio_iommu *iommu, dma_addr_t start, u64 size) >>>> >>>> And maybe use dma_addr_t instead of u64 in the above (which is 64 bits for >>>> CONFIG_X86_PAE). >>>> >>>> However, there are other places in the existing code that need tweaking >>>> to be safe for PAE, the vfio_find_dma() size arg for one. >>> >>> Yes, it looks like the IOMMU aperture checking using vfio_find_dma() >>> could have issues where dma_addr_t > size_t. Do you want to propose a >>> patch? Thanks, >>> >>> Alex >>> >>>>> I can't say I'm >>>>> really interested in adding complexity to make it work in such a case >>>>> either. Maybe we can just not expose it, ex: >>>>> >>>>> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c >>>>> index ed03f3fcb07e..6b69a74b3db0 100644 >>>>> --- a/drivers/vfio/vfio_iommu_type1.c >>>>> +++ b/drivers/vfio/vfio_iommu_type1.c >>>>> @@ -1207,7 +1207,7 @@ static int vfio_dma_do_unmap(struct vfio_iommu *iommu, >>>>> int ret = -EINVAL, retries = 0; >>>>> unsigned long pgshift; >>>>> dma_addr_t iova = unmap->iova; >>>>> - unsigned long size = unmap->size; >>>>> + size_t size = unmap->size; >>>>> bool unmap_all = unmap->flags & VFIO_DMA_UNMAP_FLAG_ALL; >>>>> bool invalidate_vaddr = unmap->flags & VFIO_DMA_UNMAP_FLAG_VADDR; >>>>> struct rb_node *n, *first_n; >>>>> @@ -1228,7 +1228,7 @@ static int vfio_dma_do_unmap(struct vfio_iommu *iommu, >>>>> goto unlock; >>>>> } >>>>> >>>>> - if (iova + size - 1 < iova || size > SIZE_MAX) >>>>> + if (iova + size - 1 < iova || unmap->size > SIZE_MAX) >>>>> goto unlock; >>>>> >>>>> /* When dirty tracking is enabled, allow only min supported pgsize */ >>>>> @@ -2657,9 +2657,10 @@ static int vfio_iommu_type1_check_extension(struct vfio_iommu *iommu, >>>>> case VFIO_TYPE1_IOMMU: >>>>> case VFIO_TYPE1v2_IOMMU: >>>>> case VFIO_TYPE1_NESTING_IOMMU: >>>>> - case VFIO_UNMAP_ALL: >>>>> case VFIO_UPDATE_VADDR: >>>>> return 1; >>>>> + case VFIO_UNMAP_ALL: >>>>> + return PHYS_ADDR_MAX == SIZE_MAX ? 1 : 0; >>>>> case VFIO_DMA_CC_IOMMU: >>>>> if (!iommu) >>>>> return 0; >>>>> @@ -2868,6 +2869,10 @@ static int vfio_iommu_type1_unmap_dma(struct vfio_iommu *iommu, >>>>> VFIO_DMA_UNMAP_FLAG_VADDR))) >>>>> return -EINVAL; >>>>> >>>>> + if ((PHYS_ADDR_MAX != SIZE_MAX) && >>>>> + (unmap.flags & VFIO_DMA_UNMAP_FLAG_ALL)) >>>>> + return -EINVAL; >>>>> + >>>>> if (unmap.flags & VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP) { >>>>> unsigned long pgshift; >>>>> >>>>> >>>>> >>>>> >>>>> >>>>>>> Is the " - 1" intentional on the other overflow check? As in it's okay >>>>>>> to wrap around to zero but not further than that? Sometimes this is >>>>>>> intentional but it requires more subsystem expertise than I possess. >>>>>> >>>>>> Yes, since we're dealing with a start + length we need to account for >>>>>> the -1 in the end value, otherwise the user could never unmap the last >>>>>> page of the address space. Thanks for the report! >>>>>> >>>>>> Alex >>>>>> >>>>>>> cade075f265b25 Kirti Wankhede 2020-05-29 1094 goto unlock; >>>>>>> 73fa0d10d077d9 Alex Williamson 2012-07-31 1095 >>>>>>> 331e33d2960c82 Kirti Wankhede 2020-05-29 1096 /* When dirty tracking is enabled, allow only min supported pgsize */ >>>>>>> 331e33d2960c82 Kirti Wankhede 2020-05-29 1097 if ((unmap->flags & VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP) && >>>>>>> 331e33d2960c82 Kirti Wankhede 2020-05-29 1098 (!iommu->dirty_page_tracking || (bitmap->pgsize != pgsize))) { >>>>>>> 331e33d2960c82 Kirti Wankhede 2020-05-29 1099 goto unlock; >>>>>>> 331e33d2960c82 Kirti Wankhede 2020-05-29 1100 } >>>>>>> 73fa0d10d077d9 Alex Williamson 2012-07-31 1101 >>>>>>> 331e33d2960c82 Kirti Wankhede 2020-05-29 1102 WARN_ON((pgsize - 1) & PAGE_MASK); >>>>>>> 331e33d2960c82 Kirti Wankhede 2020-05-29 1103 again: >>>>>>> 1ef3e2bc04223f Alex Williamson 2014-02-26 1104 /* >>>>>>> >>>>>>> --- >>>>>>> 0-DAY CI Kernel Test Service, Intel Corporation >>>>>>> https://lists.01.org/hyperkitty/list/kbuild-all@xxxxxxxxxxxx >>>>>> >>>>> >>>> >>> >> >