Hello Marc,
On 1/5/2023 4:08 PM, Marc Orr wrote:
On Tue, Dec 27, 2022 at 1:49 PM Kalra, Ashish <ashish.kalra@xxxxxxx> wrote:
Hello Boris,
On 12/19/2022 2:08 PM, Borislav Petkov wrote:
On Mon, Dec 19, 2022 at 09:00:26AM -0600, Michael Roth wrote:
We implemented this approach for v7, but it causes a fairly significant
performance regression, particularly for the case for npages > 1 which
this change was meant to optimize.
I still need to dig in a big but I'm guessing it's related to flushing
behavior.
Well, AFAICT, change_page_attr_set_clr() flushes once at the end.
Don't you need to flush when you modify the direct map?
Milan onward, there is H/W support for coherency between mappings of the
same physical page with different encryption keys, so AFAIK, there
should be no need to flush during page state transitions, where we
invoke these direct map interface functions for re-mapping/invalidating
pages.
I don't know if there is any other reason to flush after modifying
the direct map ?
Isn't the Milan coherence feature (SME_COHERENT?) about the caches --
not the TLBs? And isn't the flushing being discussed here about the
TLBs?
Actually, the flush does both cache and TLB flushing.
Both cpa_flush() and cpa_flush_all() will also do cache flushing if
cache argument is not NULL. As in this case, no page caching attributes
are being changed so there is no need to do cache flushing.
But TLB flushing (as PTE is updated) is still required and will be done.
Also, I thought that Mingwei Zhang <mizhang@xxxxxxxxxx> found that the
Milan SEV coherence feature was basically unusable in Linux because it
only works across CPUs. It does not extend to IO (e.g., CPU caches
need to be flushed prior to free'ing a SEV VM's private address and
reallocating that location to a device driver to be used for IO). My
understanding of this feature and its limitations may be too coarse.
But I think we should be very careful about relying on this feature as
it is implemented in Milan.
That being said, I guess I could see an argument to rely on the
feature here, since we're not deallocating the memory and reallocating
it to a device. But again, I thought the feature was about cache
coherence -- not TLB coherence.
Yes, this is just invalidating or re-mapping into the kernel direct map,
so we can rely on this feature for the use case here.
Thanks,
Ashish