Re: [PATCH v2 09/12] KVM: arm64: Split huge pages when dirty logging is enabled

Gavin Shan <gshan@xxxxxxxxxx> · Fri, 10 Feb 2023 10:09:40 +1100

Hi Ricardo,

On 2/9/23 11:50 PM, Ricardo Koller wrote:
On Wed, Feb 8, 2023 at 10:26 PM Gavin Shan <gshan@xxxxxxxxxx> wrote:

[...]

+static int kvm_mmu_split_huge_pages(struct kvm *kvm, phys_addr_t addr,
+                                 phys_addr_t end)
+{
+     struct kvm_mmu_memory_cache *cache;
+     struct kvm_pgtable *pgt;
+     int ret;
+     u64 next;
+     u64 chunk_size = kvm->arch.mmu.split_page_chunk_size;
+     int cache_capacity = kvm_mmu_split_nr_page_tables(chunk_size);
+
+     if (chunk_size == 0)
+             return 0;
+
+     lockdep_assert_held_write(&kvm->mmu_lock);
+
+     cache = &kvm->arch.mmu.split_page_cache;
+
+     do {
+             if (need_topup_split_page_cache_or_resched(kvm,
+                                                        cache_capacity)) {
+                     write_unlock(&kvm->mmu_lock);
+                     cond_resched();
+                     /* Eager page splitting is best-effort. */
+                     ret = __kvm_mmu_topup_memory_cache(cache,
+                                                        cache_capacity,
+                                                        cache_capacity);
+                     write_lock(&kvm->mmu_lock);
+                     if (ret)
+                             break;
+             }
+
+             pgt = kvm->arch.mmu.pgt;
+             if (!pgt)
+                     return -EINVAL;

I don't think the check for whether @pgt exists is needed, because the VM can't
be created unless its page-table has been allocated and set in kvm_init_stage2_mmu().

Given that the lock is released/acquired every chunk, the intent was to check
that the page-table wasn't freed in between.

I don't understand how that is possible. @pgt is freed when the VM is released,
which happens when its reference count reaches zero. The main point of contention
is between close(vm-fd) and this ioctl(). The VM file's release function won't be
invoked until the file's reference count drops to zero, and the ioctl() already
holds a reference on the VM file, which prevents that.
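
To make the argument concrete, here is a minimal userspace sketch of the
reference counting I have in mind. It is a toy model, not kernel code: every
toy_* name is hypothetical, file_refs stands in for the VM file's reference
count, and pgt_present stands in for kvm->arch.mmu.pgt being non-NULL. It
assumes the ioctl path holds one file reference for the whole call.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a VM and its file; not the kernel's data structures. */
struct toy_vm {
	int  file_refs;    /* models the VM file's reference count */
	bool pgt_present;  /* models kvm->arch.mmu.pgt != NULL */
};

static void toy_vm_init(struct toy_vm *vm)
{
	vm->file_refs = 1;       /* reference owned by the open fd */
	vm->pgt_present = true;  /* page-table is set up at VM creation */
}

static void toy_file_get(struct toy_vm *vm)
{
	vm->file_refs++;
}

/* Drop one reference; the release path (freeing pgt) runs only at zero. */
static void toy_file_put(struct toy_vm *vm)
{
	assert(vm->file_refs > 0);
	if (--vm->file_refs == 0)
		vm->pgt_present = false;
}

/* Models close(vm-fd): drops the reference owned by the fd. */
static void toy_close(struct toy_vm *vm)
{
	toy_file_put(vm);
}

/*
 * Models the ioctl path. A reference is held across the whole call, so a
 * close() that arrives while the lock is dropped cannot trigger the release.
 * Returns whether the page-table was still present after "relocking".
 */
static bool toy_ioctl_split(struct toy_vm *vm, bool close_midway)
{
	bool pgt_seen;

	toy_file_get(vm);          /* reference taken on ioctl entry */

	/* ... lock dropped here for the cache top-up ... */
	if (close_midway)
		toy_close(vm);     /* racing close(vm-fd) */
	/* ... lock retaken here ... */

	pgt_seen = vm->pgt_present;

	toy_file_put(vm);          /* reference dropped on ioctl exit */
	return pgt_seen;
}
```

In this model, toy_ioctl_split() sees the page-table even when close_midway is
true, and the release only happens once the ioctl drops its own reference.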

There may be other cases I missed. If so, I think a comment is still needed to
help readers understand.

Thanks,
Gavin