On 2025-02-05 4:10 pm, Jason Gunthorpe wrote:
On Wed, Feb 05, 2025 at 03:47:03PM +0000, Robin Murphy wrote:
On 2025-02-04 6:34 pm, Jason Gunthorpe wrote:
Convert most of the places calling get_order() as an argument to the
iommu-pages allocator into order_base_2() or the _sz flavour
instead. These places already have an exact size, there is no particular
reason to use order here.
Signed-off-by: Jason Gunthorpe <jgg@xxxxxxxxxx>
---
[...]
@@ -826,7 +825,7 @@ void *__init iommu_alloc_4k_pages(struct amd_iommu *iommu, gfp_t gfp,
size_t size)
{
int order = get_order(size);
- void *buf = iommu_alloc_pages(gfp, order);
+ void *buf = iommu_alloc_pages_lg2(gfp, order + PAGE_SHIFT);
This is a size, really - it's right there above.
I didn't want to make major surgery to this thing, but yes it
could be:
void *__init iommu_alloc_4k_pages(struct amd_iommu *iommu, gfp_t gfp,
size_t size)
{
void *buf;
size = PAGE_ALIGN(size);
buf = iommu_alloc_pages_sz(gfp, size);
if (buf &&
check_feature(FEATURE_SNP) &&
set_memory_4k((unsigned long)buf, size / PAGE_SIZE )) {
iommu_free_page(buf);
buf = NULL;
}
return buf;
}
(although alloc_cwwb_sem() passing 1 looks highly suspicious - judging by
other cmd_sem references that probably should be PAGE_SIZE...)
Indeed, amd folks?
if (buf &&
check_feature(FEATURE_SNP) &&
[...]
@@ -1702,8 +1701,10 @@ int dmar_enable_qi(struct intel_iommu *iommu)
* Need two pages to accommodate 256 descriptors of 256 bits each
* if the remapping hardware supports scalable mode translation.
*/
- order = ecap_smts(iommu->ecap) ? 1 : 0;
- desc = iommu_alloc_pages_node(iommu->node, GFP_ATOMIC, order);
+ desc = iommu_alloc_pages_node_lg2(iommu->node, GFP_ATOMIC,
+ ecap_smts(iommu->ecap) ?
+ order_base_2(SZ_8K) :
+ order_base_2(SZ_4K));
These are also clearly sizes.
I didn't make a size wrapper version of the _node_ variation because
there are only three callers.
I don't see any need to have the log2 stuff at all, I think we just
switch iommu_alloc_pages{_node}() to take a size and keep things
simple.
Ok it is easy to remove lg2 calls from the drivers, but I would keep
the internal function like this because most of the size callers have
constants and the order_base_2() will become a constexpr when
inlined. Only a few places are not like that.
But what's the benefit of having extra stuff capable of turning a
constant into a different constant that doesn't represent anything we
actually want? We still end up doing more runtime arithmetic on lg2sz
within the allocation function itself to turn it into the order we
ultimately still need, so that arithmetic could just as well be
get_order(size) and have nothing to inline at all (other than NUMA_NO_NODE).
Thanks,
Robin.