On Tue, 30 Nov 2010 16:27:13 +0200 Avi Kivity <avi@xxxxxxxxxx> wrote:

> On 11/30/2010 04:17 PM, Anthony Liguori wrote:
> >> What's the problem with burning that cpu?  Per guest page,
> >> compressing takes less than sending.  Is it just an issue of qemu
> >> mutex hold time?
> >
> > If you have a 512GB guest, then you have a 16MB dirty bitmap which
> > ends up being a 128MB dirty bitmap in QEMU because we represent dirty
> > bits with 8 bits.
>
> Was there not a patchset to split each bit into its own bitmap?  And
> then copy the kvm or qemu master bitmap into each client bitmap as it
> became needed?
>
> > Walking 16MB (or 128MB) of memory just to find a few pages to send
> > over the wire is a big waste of CPU time.  If kvm.ko used a
> > multi-level table to represent dirty info, we could walk the memory
> > mapping in 2MB chunks, allowing us to skip a large number of the
> > comparisons.
>
> There's no reason to assume dirty pages would be clustered.  If 0.2% of
> memory were dirty, but scattered uniformly, there would be no win from
> the two-level bitmap.  A loss, in fact: 2MB can be represented as 512
> bits or 64 bytes, just one cache line.  Any two-level thing will need more.
>
> We might want a more compact encoding for sparse bitmaps, like
> run-length encoding.
>

Is anyone profiling these dirty bitmap things?

 - Is a 512GB guest really the target?
 - How much CPU time can we use for these things?
 - How many dirty pages do we have to care about?

Since we are planning to do some profiling for these, taking into
account Kemari, can you please share this information?

> >>> In the short term, fixing (2) by accounting zero pages as full sized
> >>> pages should "fix" the problem.
> >>>
> >>> In the long term, we need a new dirty bit interface from kvm.ko that
> >>> uses a multi-level table.  That should dramatically improve scan
> >>> performance.
> >>
> >> Why would a multi-level table help?  (Or rather, please explain what
> >> you mean by a multi-level table.)
> >>
> >> Something we could do is divide memory into more slots, and poll
> >> each slot when we start to scan its page range.  That reduces the
> >> time between sampling a page's dirtiness and sending it off, and
> >> reduces the latency incurred by the sampling.  There are also

If we use the rmap approach with one more interface, we can specify
which range of the dirty bitmap to get.  This has the same effect as
splitting into more slots.

> >> non-interface-changing ways to reduce this latency, like O(1) write
> >> protection, or using dirty bits instead of write protection when
> >> available.

IIUC, O(1) write protection will lazily write-protect pages, beginning
from the top level?  Does this have any impact other than the timing of
get_dirty_log()?

Thanks,
  Takuya

> >
> > BTW, we should also refactor qemu to use the kvm dirty bitmap directly
> > instead of mapping it to the main dirty bitmap.
>
> That's what the patch set I was alluding to did.  Or maybe I imagined
> the whole thing.
>
> >>> We also need to implement live migration in a separate thread that
> >>> doesn't carry qemu_mutex while it runs.
> >>
> >> IMO that's the biggest hit currently.
> >
> > Yup.  That's the Correct solution to the problem.
>
> Then let's just Do it.
>
> -- 
> error compiling committee.c: too many arguments to function
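For reference, the sizes Anthony quotes follow directly from 4KB pages:
512GB of guest RAM is 128M pages, so one bit per page is 16MB of bitmap,
and QEMU's byte-per-page representation inflates that to 128MB.  A
trivial back-of-the-envelope program (the constants and names are mine,
not QEMU's):

#include <stdio.h>

#define GiB       (1ULL << 30)
#define PAGE_SIZE 4096ULL

int main(void)
{
	unsigned long long guest_ram   = 512 * GiB;
	unsigned long long pages       = guest_ram / PAGE_SIZE; /* 128M pages */
	unsigned long long kvm_bitmap  = pages / 8;  /* 1 bit per page  */
	unsigned long long qemu_bitmap = pages;      /* 1 byte per page */

	printf("pages:       %llu\n", pages);
	printf("kvm bitmap:  %llu MiB\n", kvm_bitmap >> 20);  /* 16  */
	printf("qemu bitmap: %llu MiB\n", qemu_bitmap >> 20); /* 128 */
	return 0;
}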
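A minimal sketch of the two-level walk being debated, assuming one
summary bit per 2MB chunk sitting above the page-granularity bitmap.
The interface and names are made up for illustration; nothing like this
exists in kvm.ko today:

#define PAGES_PER_CHUNK 512	/* 2MB / 4KB */
#define BITS_PER_LONG   (8 * sizeof(unsigned long))
#define LONGS_PER_CHUNK (PAGES_PER_CHUNK / BITS_PER_LONG)

/* Walk the dirty bitmap 2MB at a time, skipping chunks whose summary
 * bit is clear.  'summary' holds one bit per 2MB chunk. */
static void walk_dirty(const unsigned long *bitmap,
		       const unsigned long *summary,
		       unsigned long nr_chunks,
		       void (*send_page)(unsigned long pfn))
{
	unsigned long chunk, i;

	for (chunk = 0; chunk < nr_chunks; chunk++) {
		if (!(summary[chunk / BITS_PER_LONG] &
		      (1UL << (chunk % BITS_PER_LONG))))
			continue;	/* whole 2MB chunk is clean */

		for (i = 0; i < LONGS_PER_CHUNK; i++) {
			unsigned long word =
				bitmap[chunk * LONGS_PER_CHUNK + i];

			while (word) {	/* pick off each dirty page */
				int bit = __builtin_ctzl(word);

				send_page((chunk * LONGS_PER_CHUNK + i) *
					  BITS_PER_LONG + bit);
				word &= word - 1; /* clear lowest set bit */
			}
		}
	}
}

Note that Avi's objection applies directly to this sketch: if dirty
pages are scattered uniformly, nearly every summary bit is set, and the
extra level only adds cache-line traffic on top of the 64 bytes per
chunk the flat bitmap already touches.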
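The run-length encoding Avi mentions could look something like this toy
encoder over QEMU's byte-per-page representation.  It is illustrative
only, not a proposed ABI:

struct dirty_run {
	unsigned long first;	/* first dirty pfn in the run       */
	unsigned long count;	/* number of consecutive dirty pfns */
};

/* Encode a byte-per-page dirty map as (first, count) runs.
 * Returns the number of runs written, at most max_runs. */
static unsigned long rle_encode(const unsigned char *dirty,
				unsigned long nr_pages,
				struct dirty_run *runs,
				unsigned long max_runs)
{
	unsigned long pfn = 0, n = 0;

	while (pfn < nr_pages && n < max_runs) {
		while (pfn < nr_pages && !dirty[pfn])
			pfn++;		/* skip clean pages */
		if (pfn == nr_pages)
			break;
		runs[n].first = pfn;
		while (pfn < nr_pages && dirty[pfn])
			pfn++;		/* extend the run */
		runs[n].count = pfn - runs[n].first;
		n++;
	}
	return n;
}

For the 0.2%-scattered case Avi describes, each dirty page becomes its
own run, so the encoding only wins when dirtiness is sparse *and* at
least somewhat clustered.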
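And the range-limited retrieval Takuya describes might, purely
hypothetically, take the shape of a variant of KVM's existing struct
kvm_dirty_log with a page range added.  The struct and fields below do
not exist in KVM; they only illustrate the idea:

#include <linux/types.h>

/* Hypothetical, NOT a real KVM ioctl argument: a range-restricted
 * cousin of struct kvm_dirty_log, letting userspace fetch dirty bits
 * for [first_page, first_page + num_pages) within one slot instead of
 * the whole slot -- the same effect as splitting into more slots. */
struct kvm_dirty_log_range {
	__u32 slot;		/* memory slot to query                   */
	__u32 pad;
	__u64 first_page;	/* first page offset within the slot      */
	__u64 num_pages;	/* length of the requested range, pages   */
	__u64 dirty_bitmap;	/* userspace address: 1 bit/page in range */
};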
-- 
Takuya Yoshikawa <yoshikawa.takuya@xxxxxxxxxxxxx>