Re: [RFC PATCH 3/8] kvm: pfncache: enlighten about gmem

Patrick Roy <roypat@xxxxxxxxxxxx> · Wed, 10 Jul 2024 10:49:56 +0100

On 7/9/24 15:36, David Woodhouse wrote:
> On Tue, 2024-07-09 at 14:20 +0100, Patrick Roy wrote:
>> KVM uses gfn_to_pfn_caches to cache translations from gfn all the way to
>> the pfn (for example, kvm-clock caches the page used for guest/host
>> communication this way). Unlike with the gfn_to_hva_cache, where no
>> equivalent caching semantics are possible for gmem-backed gfns (see also
>> 858e8068a750 ("kvm: pfncache: enlighten about gmem")), here it is
>> possible to simply cache the pfn returned by `kvm_gmem_get_pfn`.
>>
>> Additionally, gfn_to_pfn_caches now invalidate whenever a cached gfn's
>> attributes are flipped from shared to private (or vice-versa).
>>
>> Signed-off-by: Patrick Roy <roypat@xxxxxxxxxxxx>
> 
> I can't see how this is safe from race conditions.
> 
> When the GPC is invalidated from gfn_to_pfn_cache_invalidate_start()
> its *write* lock is taken and gpc->valid is set to false.
> 
> In parallel, any code using the GPC to access guest memory will take
> the *read* lock, call kvm_gpc_check(), and then go ahead and use the
> pointer to its heart's content until eventually dropping the read lock.
> 
> Since invalidation takes the write lock, it has to wait until the GPC
> is no longer in active use, and the pointer cannot be being
> dereferenced.
> 
> How does this work for the kvm_mem_is_private() check? You've added a
> check in kvm_gpc_check(), but what if the pfn is made private
> immediately *after* that check? Unless the code path which makes the
> pfn private also takes the write lock, how is it safe?

Ah, you're right - I did in fact overlook this. I do think it works out
though: kvm_vm_set_mem_attributes, which is used for flipping between
shared and private, registers the range whose attributes changed for
invalidation, and thus gfn_to_pfn_cache_invalidate_start should get
called for it (although I have to admit I do not immediately see what
the exact call stack for this looks like, so maybe I am misunderstanding
something about invalidation here?).