Marcelo Tosatti <mtosatti@xxxxxxxxxx> wrote:

> > +	spin_lock(&kvm->mmu_lock);
>
> It is not clear why mmu_lock is needed. Dropping it across the xchg
> loop should be similar to srcu implementation, in that concurrent
> updates will be visible only on the next get_dirty call? Well, it is
> necessary anyway for write protecting the sptes.

My implementation does write protection inside the xchg loop and then,
after that loop, flushes the TLB.  mmu_lock must protect both of these
together.

If we do not mind scanning the bitmap twice, we can decouple the xchg
loop and write protection (see sketch (1) at the end of this mail), but
it will be a bit slower, and in any case we need to hold mmu_lock until
the TLB is flushed.

As can be seen from the unit-test result, the majority of the time is
spent on write protecting sptes, so decoupling the xchg loop alone will
not alleviate the problem much -- that is my guess.

> A cond_resched_lock() would alleviate the potentially long held
> times for mmu_lock (can you measure it with large memslots?)

How to move the TLB flush out of mmu_lock critical sections was
discussed before, and there seemed to be some proposals.  Is anyone
working on that?  After that we can do many things.

One idea is to shrink the extra bitmap buffer to one page or so and do
the xchg and write protection loop in chunks of that limited size (see
sketch (2) at the end of this mail).  Because we can drop mmu_lock
between chunks, it is possible to copy_to_user one part of the dirty
bitmap and then go on to the next part.  After everything is protected,
we can then do the TLB flush after dropping mmu_lock.

> Otherwise looks nice.

Thanks,
	Takuya

> > -	r = -ENOMEM;
> > -	slots = kmemdup(kvm->memslots, sizeof(*kvm->memslots), GFP_KERNEL);
> > -	if (!slots)
> > -		goto out;
> > +	for (i = 0; i < n / sizeof(long); i++) {
> > +		unsigned long mask;
> > +		gfn_t offset;
> >
> > -	memslot = id_to_memslot(slots, log->slot);
> > -	memslot->nr_dirty_pages = 0;
> > -	memslot->dirty_bitmap = dirty_bitmap_head;
> > -	update_memslots(slots, NULL);
> > +		if (!dirty_bitmap[i])
> > +			continue;
> >
> > -	old_slots = kvm->memslots;
> > -	rcu_assign_pointer(kvm->memslots, slots);
> > -	synchronize_srcu_expedited(&kvm->srcu);
> > -	kfree(old_slots);
> > +		is_dirty = true;
> >
> > -	write_protect_slot(kvm, memslot, dirty_bitmap, nr_dirty_pages);
> > +		mask = xchg(&dirty_bitmap[i], 0);
> > +		dirty_bitmap_buffer[i] = mask;
> >
> > -	r = -EFAULT;
> > -	if (copy_to_user(log->dirty_bitmap, dirty_bitmap, n))
> > -		goto out;
> > -	} else {
> > -		r = -EFAULT;
> > -		if (clear_user(log->dirty_bitmap, n))
> > -			goto out;
> > 	}
> > +	if (is_dirty)
> > +		kvm_flush_remote_tlbs(kvm);
> > +
> > +	spin_unlock(&kvm->mmu_lock);
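
P.S.  For concreteness, sketch (1) below shows the decoupled two-pass
variant mentioned above.  It is only an illustration of the idea, not
the actual patch: it reuses the names from the hunk above (dirty_bitmap,
dirty_bitmap_buffer, memslot, n, is_dirty) and is untested.

	/* (1) Pass 1: move dirty bits out of the live bitmap. */
	spin_lock(&kvm->mmu_lock);

	for (i = 0; i < n / sizeof(long); i++)
		dirty_bitmap_buffer[i] = xchg(&dirty_bitmap[i], 0);

	/* Pass 2: write protect the sptes for every dirty gfn. */
	for (i = 0; i < n / sizeof(long); i++) {
		unsigned long mask = dirty_bitmap_buffer[i];

		if (!mask)
			continue;

		is_dirty = true;
		kvm_mmu_write_protect_pt_masked(kvm, memslot,
						i * BITS_PER_LONG, mask);
	}

	/* The TLB flush must still happen before mmu_lock is dropped. */
	if (is_dirty)
		kvm_flush_remote_tlbs(kvm);

	spin_unlock(&kvm->mmu_lock);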
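
And sketch (2), the chunked variant: the buffer shrinks to one page,
and mmu_lock is dropped while each chunk is copied out to user space.
This assumes the TLB flush is safe to do outside mmu_lock, which is
exactly the prerequisite discussed above; CHUNK, buffer and the error
handling are made up for illustration and untested.

	/* longs of dirty bitmap covered by one page-sized buffer */
	#define CHUNK (PAGE_SIZE / sizeof(long))

	unsigned long *buffer;	/* one page, e.g. from get_zeroed_page() */
	unsigned long i, base, end;

	for (base = 0; base < n / sizeof(long); base += CHUNK) {
		end = min(base + CHUNK, n / sizeof(long));

		spin_lock(&kvm->mmu_lock);
		for (i = base; i < end; i++) {
			unsigned long mask = xchg(&dirty_bitmap[i], 0);

			buffer[i - base] = mask;
			if (!mask)
				continue;

			is_dirty = true;
			kvm_mmu_write_protect_pt_masked(kvm, memslot,
						i * BITS_PER_LONG, mask);
		}
		spin_unlock(&kvm->mmu_lock);

		/* mmu_lock is no longer held, so sleeping is fine here. */
		if (copy_to_user((u8 __user *)log->dirty_bitmap +
				 base * sizeof(long), buffer,
				 (end - base) * sizeof(long)))
			return -EFAULT;
	}

	/* Only possible once the flush may happen outside mmu_lock. */
	if (is_dirty)
		kvm_flush_remote_tlbs(kvm);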