RE: [PATCH v4 09/17] iommufd: Data structure to provide IOVA to PFN mapping

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> From: Jason Gunthorpe <jgg@xxxxxxxxxx>
> Sent: Tuesday, November 15, 2022 2:44 AM
> 
> On Mon, Nov 14, 2022 at 07:28:47AM +0000, Tian, Kevin wrote:
> > > From: Jason Gunthorpe <jgg@xxxxxxxxxx>
> > > Sent: Tuesday, November 8, 2022 8:49 AM
> > >
> > > +
> > > +/*
> > > + * Automatically find a block of IOVA that is not being used and not
> reserved.
> > > + * Does not return a 0 IOVA even if it is valid.
> >
> > what is the problem with 0? should this be documented in uAPI?
> 
> 0 is commonly used as an errant value for uninitialized things. We
> don't automatically map it into a process mm because it can cause
> security problems if we don't trap a bogus 0/NULL pointer reference.
> 
> The same logic applies here too, the allocator should not return 0 to
> reserve it as an unmapped IOVA page to catch bugs.

CPU doesn't reference IOVA. Where do such bugs exist?

> 
> I don't think it needs to be documented

this again causes a subtle difference between automatic allocation
and fixed iova. If we really think address 0 is something related
to bug, then why is it allowed with fixed iova?

> 
> > > +		if (!__alloc_iova_check_used(&allowed_span, length,
> > > +					     iova_alignment, page_offset))
> > > +			continue;
> > > +
> > > +		interval_tree_for_each_span(&area_span, &iopt->area_itree,
> > > +					    allowed_span.start_used,
> > > +					    allowed_span.last_used) {
> > > +			if (!__alloc_iova_check_hole(&area_span, length,
> > > +						     iova_alignment,
> > > +						     page_offset))
> > > +				continue;
> > > +
> > > +			interval_tree_for_each_span(&reserved_span,
> > > +						    &iopt->reserved_itree,
> > > +						    area_span.start_used,
> > > +						    area_span.last_used) {
> > > +				if (!__alloc_iova_check_hole(
> > > +					    &reserved_span, length,
> > > +					    iova_alignment, page_offset))
> > > +					continue;
> >
> > this could be simplified by double span.
> 
> It is subtly not compatible, the double span looks for used areas.
> This is looking for a used area in the allowed_itree, a hole in the
> area_itree, and a hole in the reserved_itree.

the inner two loops can be replaced by double span, since both
are skipping used areas.

> 
> I don't think IOVA allocation should be a fast path so it is not worth
> alot of effort to micro-optimize this.

but I'm not insisting on changing them now. It's trivial.

> > > +	if (iopt->disable_large_pages)
> > > +		new_iova_alignment = PAGE_SIZE;
> > > +	else
> > > +		new_iova_alignment = 1;
> >
> > I didn't understand why we start searching alignment from a
> > smaller value when large pages is enabled. what is the
> > connection here?
> 
> 'disable_large_pages' is a tiny bit misnamed, what it really does is
> ensure that every iommu_map call is exactly PAGE_SIZE, not more (large
> pages) and not less (what this is protecting against).
> 
> So if a domain has less than PAGE_SIZE we upgrade to
> PAGE_SIZE. Otherwise we allow using the lowest possible alignment.
> 
> This allows userspace to always work in PAGE_SIZE units without fear
> of problems, eg with sub-page-size units becoming weird or something.

above are good stuff in a comment.

> 
> > > +	interval_tree_remove(&area->node, &iopt->area_itree);
> > > +	rc = iopt_insert_area(iopt, lhs, area->pages, start_iova,
> > > +			      iopt_area_start_byte(area, start_iova),
> > > +			      (new_start - 1) - start_iova + 1,
> > > +			      area->iommu_prot);
> > > +	if (WARN_ON(rc))
> > > +		goto err_insert;
> > > +
> > > +	rc = iopt_insert_area(iopt, rhs, area->pages, new_start,
> > > +			      iopt_area_start_byte(area, new_start),
> > > +			      last_iova - new_start + 1, area->iommu_prot);
> > > +	if (WARN_ON(rc))
> > > +		goto err_remove_lhs;
> > > +
> > > +	lhs->storage_domain = area->storage_domain;
> > > +	lhs->num_accesses = area->num_accesses;
> > > +	lhs->pages = area->pages;
> > > +	rhs->storage_domain = area->storage_domain;
> > > +	rhs->num_accesses = area->num_accesses;
> >
> > if an access only spans one side, is it correct to have both split sides
> > keep the access number?
> 
> Er, this is acatually completely broken, woops. A removal of an access
> will trigger a WARN_ON since the access_itree element is very likely
> no longer correct.
> 
> Ah.. So the only use case here is unmapping and you can't unmap
> something that has an access established, except in some pathalogical
> case where the access does not intersect with what is being mapped.
> 
> There is no way to tell which iopt_pages_access are connected to which
> areas, so without spending some memory this can't be fixed up. I think
> it is not a real issue as mdev plus this ancient VFIO interface is
> probably not something that exists in the real world..
> 
> +       /*
> +        * Splitting is not permitted if an access exists, we don't track enough
> +        * information to split existing accesses.
> +        */
> +       if (area->num_accesses) {
> +               rc = -EINVAL;
> +               goto err_unlock;
> +       }
> +
> @@ -1041,10 +1050,8 @@ static int iopt_area_split(struct iopt_area *area,
> unsigned long iova)
>                 goto err_remove_lhs;
> 
>         lhs->storage_domain = area->storage_domain;
> -       lhs->num_accesses = area->num_accesses;
>         lhs->pages = area->pages;
>         rhs->storage_domain = area->storage_domain;
> -       rhs->num_accesses = area->num_accesses;
>         rhs->pages = area->pages;
>         kref_get(&rhs->pages->kref);
>         kfree(area);
> 

this change makes sense to me




[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux