On 04/13/2012 05:25 PM, Takuya Yoshikawa wrote:
> I forgot to say one important thing -- I might give you wrong impression.
>
> I am perfectly fine with your lock-less work. It is really nice!
>
> The reason I say much about O(1) is that O(1) and rmap based
> GET_DIRTY_LOG have fundamentally different characteristics.
>
> I am thinking really seriously how to make dirty page tracking work
> well with QEMU in the future.
>
> For example, I am thinking about multi-threaded and fine-grained
> GET_DIRTY_LOG.
>
> If we use rmap based GET_DIRTY_LOG, we can restrict write protection to
> only a selected area of one guest memory slot.
>
> So we may be able to make each thread process dirty pages independently
> from other threads by calling GET_DIRTY_LOG for its own area.
>
> But I know that O(1) has its own good point.
> So please wait a bit. I will write up what I am thinking or send patches.
>
> Anyway, I am looking forward to your lock-less work!
> It will improve the current GET_DIRTY_LOG performance.

Just to throw another idea into the mix - we can have write-protect-less
dirty logging, too. Instead of write protecting the spte, clear its dirty
bit, and check the bit again when reading the dirty log. It might look
like we're accessing the spte twice here, but it's actually just once -
when we check it to report for GET_DIRTY_LOG call N, we also prepare it
for call N+1.

This doesn't work for EPT, which lacks a dirty bit. But we can emulate
it: take a free bit and call it spte.NOTDIRTY. When we set it, we also
clear spte.WRITE, and we teach the mmu that if it sees spte.NOTDIRTY on a
write fault, it can just set spte.WRITE and clear spte.NOTDIRTY. Now that
looks exactly like Xiao's lockless write enabling.

Another note: O(1) write protection is not mutually exclusive with rmap
based write protection. In GET_DIRTY_LOG, you write protect everything,
and proceed to write enable on faults. When you reach the page table
level, you perform the rmap check to see if you should write protect or
not.
With role.direct=1 the check is very cheap (and sometimes you can drop
the entire page table and replace it with a large spte).

-- 
error compiling committee.c: too many arguments to function
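
For illustration, here is a rough userspace sketch of the emulated
NOTDIRTY scheme described above. It is not kernel code: the bit
positions, the spte_t type, and both function names are made up for this
example, and it assumes every spte it touches is logically writable. A
real implementation would also have to flush TLBs after clearing
spte.WRITE, which is left out here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit layout, chosen only for this sketch. */
#define SPTE_WRITE    (UINT64_C(1) << 1)
#define SPTE_NOTDIRTY (UINT64_C(1) << 62)   /* assumed free software bit */

typedef _Atomic uint64_t spte_t;

/*
 * Write-fault path: if the spte is read-only just because of dirty
 * tracking (NOTDIRTY set), set WRITE and clear NOTDIRTY with a
 * compare-and-swap, taking no lock. Returns true if the fault was fixed.
 */
bool spte_fix_write_fault(spte_t *sptep)
{
    uint64_t old = atomic_load(sptep);

    while (old & SPTE_NOTDIRTY) {
        uint64_t new;

        /* Invariant of the scheme: NOTDIRTY implies WRITE is clear. */
        assert(!(old & SPTE_WRITE));

        new = (old | SPTE_WRITE) & ~SPTE_NOTDIRTY;
        /* On failure, 'old' is reloaded and the loop re-checks it. */
        if (atomic_compare_exchange_weak(sptep, &old, new))
            return true;
    }
    return false;   /* not a dirty-tracking fault */
}

/*
 * GET_DIRTY_LOG path: one atomic update both reports the state for
 * call N and re-arms the spte for call N+1. Returns true if the page
 * was written since the previous call.
 */
bool spte_harvest_dirty(spte_t *sptep)
{
    uint64_t old = atomic_load(sptep);
    uint64_t new;

    do {
        new = (old & ~SPTE_WRITE) | SPTE_NOTDIRTY;
    } while (!atomic_compare_exchange_weak(sptep, &old, new));

    return !(old & SPTE_NOTDIRTY);
}
```

Starting from an spte that holds only SPTE_NOTDIRTY, a harvest reports
clean; after spte_fix_write_fault() succeeds, the next harvest reports
dirty and re-arms the spte, so the harvest after that reports clean
again.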