On 2021-03-19 17:26, John Garry wrote:
[...]
@@ -25,7 +25,8 @@ struct iova {
struct iova_magazine;
struct iova_cpu_rcache;
-#define IOVA_RANGE_CACHE_MAX_SIZE 6 /* log of max cached IOVA
range size (in pages) */
+#define IOVA_RANGE_CACHE_DEFAULT_SIZE 6
+#define IOVA_RANGE_CACHE_MAX_SIZE 10 /* log of max cached IOVA range
size (in pages) */
No.
And why? If we're going to allocate massive caches anyway, whatever is
the point of *not* using all of them?
I wanted to keep the same effective threshold for devices today, unless
set otherwise.
The reason is that I didn't know if a blanket increase would cause
regressions, and I was taking the super-safe road. Specifically some
systems may be very IOVA space limited, and just work today by not
caching large IOVAs.
alloc_iova_fast() will already clear out the caches if space is running
low, so the caching itself shouldn't be an issue.
And in the precursor thread you wrote "We can't arbitrarily *increase*
the scope of caching once a domain is active due to the size-rounding-up
requirement, which would be prohibitive to larger allocations if applied
universally" (sorry for quoting)
I took the last part to mean that we shouldn't apply this increase in
threshold globally.
I meant we can't increase the caching threshold as-is once the domain is
in use, because that could result in odd-sized IOVAs previously
allocated above the old threshold being later freed back into caches,
then causing havoc the next time they get allocated (because they're not
as big as the actual size being asked for). However, trying to address
that by just size-aligning everything even above the caching threshold
is liable to waste too much space on IOVA-constrained systems (e.g. a
single 4K video frame may be ~35MB - rounding that up to 64MB each time
would be hard to justify).
It follows from that that there's really no point in decoupling the
rounding-up threshold from the actual caching threshold - you get all
the wastage (both IOVA space and actual memory for the cache arrays) for
no obvious benefit.
It only makes sense for a configuration knob to affect the actual
rcache and depot allocations - that way, big high-throughput systems
with plenty of memory can spend it on better performance, while small
systems - that often need IOMMU scatter-gather precisely *because*
memory is tight and thus easily fragmented - don't have to pay the
(not insignificant) cost for caches they don't need.
So do you suggest to just make IOVA_RANGE_CACHE_MAX_SIZE a kconfig option?
Again, I'm less convinced by Kconfig since I imagine many people tuning
server-class systems for their own particular workloads will be running
standard enterprise distros, so I think end-user-accessible knobs will
be the most valuable. That's not to say that a Kconfig option to set the
default state of a command-line option (as we do elsewhere) won't be
useful for embedded users, cloud providers, etc., just that I'm not sure
it's worth it being the *only* option.
Robin.