Hi Robin,
On 2021-06-23 00:07, Robin Murphy wrote:
On 2021-06-22 15:27, Sai Prakash Ranjan wrote:
Hi Robin,
On 2021-06-22 17:41, Robin Murphy wrote:
On 2021-06-22 08:11, Sai Prakash Ranjan wrote:
Hi Robin,
On 2021-06-21 21:15, Robin Murphy wrote:
On 2021-06-18 03:51, Sai Prakash Ranjan wrote:
Add a quirk IO_PGTABLE_QUIRK_TLB_INV_ALL to invalidate entire
context
with tlb_flush_all() callback in partial walk flush to improve
unmap
performance on select few platforms where the cost of
over-invalidation
is less than the unmap latency.
I still think this doesn't belong anywhere near io-pgtable at all.
It's a driver-internal decision how exactly it implements a
non-leaf
invalidation, and that may be more complex than a predetermined
boolean decision. For example, I've just realised for SMMUv3 we
can't
invalidate multiple levels of table at once with a range command,
since if we assume the whole thing is mapped at worst-case page
granularity we may fail to invalidate any parts which are mapped as
intermediate-level blocks. If invalidating a 1GB region (with 4KB
granule) means having to fall back to 256K non-range commands, we
may
not want to invalidate by VA then, even though doing so for a 2MB
region is still optimal.
It's also quite feasible that drivers might want to do this for
leaf
invalidations too - if you don't like issuing 512 commands to
invalidate 2MB, do you like issuing 511 commands to invalidate
2044KB?
- and at that point the logic really has to be in the driver
anyway.
Ok I will move this to tlb_flush_walk() functions in the drivers. In
the previous
v1 thread, you suggested to make the choice in
iommu_get_dma_strict() test,
I assume you meant the test in iommu_dma_init_domain() with a flag
or was it
the leaf driver(ex:arm-smmu.c) test of iommu_get_dma_strict() in
init_domain?
Yes, I meant literally inside the same condition where we currently
set "pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;" in
arm_smmu_init_domain_context().
Ok got it, thanks.
I am still a bit confused on where this flag would be? Should this
be a part
of struct iommu_domain?
Well, if you were to rewrite the config with an alternative set of
flush_ops at that point it would be implicit. For a flag, probably
either in arm_smmu_domain or arm_smmu_impl. Maybe a flag would be
less
useful than generalising straight to a "maximum number of by-VA
invalidations it's worth sending individually" threshold value?
But then we would still need some flag to make this implementation
specific (qcom specific for now) and this threshold would just be
another condition although it would have been useful if this was
generic enough.
Well, for that approach I assume we could do something like
special-case 0, or if it's a mutable per-domain value maybe just
initialise it to SIZE_MAX or whatever such that it would never be
reached in practice. Whichever way, it was meant to be implied that
anything at the domain level would still be subject to final
adjustment by the init_context hook.
Ok that should work, so I went ahead with another set of flush_ops
and posted out v3.
Thanks,
Sai
It's clear to me what overall shape and separation of responsibility
is
most logical, but beyond that I don't have a particularly strong
opinion on the exact implementation; I've just been chucking ideas
around :)
Your ideas are very informative and useful :)
Thanks,
Sai
--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a
member
of Code Aurora Forum, hosted by The Linux Foundation