On 2011-08-01 11:45, Avi Kivity wrote:
> On 08/01/2011 12:05 PM, Jan Kiszka wrote:
>> On 2011-08-01 10:16, Avi Kivity wrote:
>>> On 08/01/2011 10:52 AM, Jan Kiszka wrote:
>>>> On 2011-08-01 09:34, Jan Kiszka wrote:
>>>>> On 2011-07-31 21:47, Avi Kivity wrote:
>>>>>> When a range is being unmapped, ask accelerators (e.g. kvm) to
>>>>>> synchronize the dirty bitmap to avoid losing information forever.
>>>>>>
>>>>>> Fixes grub2 screen update.
>>>>>
>>>>> It does.
>>>>>
>>>>> But something is still broken. As I reported before, the performance
>>>>> of grub2 startup is an order of magnitude slower than with the
>>>>> existing code. According to ftrace, we are getting tons of additional
>>>>> EPT_MISCONFIG exits over the 0xA0000 segment. But I haven't spotted
>>>>> the difference yet. The effective slot setup as communicated to kvm
>>>>> looks innocent.
>>>>
>>>> I take it back: We obviously once in a while resume the guest with the
>>>> vga segment unmapped. And that, of course, ends up doing mmio instead
>>>> of plain ram accesses.
>>>>
>>>
>>> qemu-kvm.git 6b5956c573 and its predecessor fix the issue (and I think
>>> they're even faster than upstream, but perhaps I'm not objective).
>>>
>>
>> Just updated to the latest memory-region branch - how did you test it?
>> It does not link here due to a forgotten rwhandler in Makefile.target.
>>
>> Anyway, that commit has no impact on the issue I'm seeing. I'm also
>> carrying transaction changes for cirrus here, but they have no
>> noticeable impact. That indicates that the new API is not actually
>> slow; it likely just has some bug.
>
> Here's the log of range changes while in grub2:
>
> adding a0000-affff offset 40000 ram 40040000
> dropping a0000-affff
> adding a0000-affff offset 30000 ram 40040000
> dropping a0000-affff
> adding a0000-affff offset 40000 ram 40040000
> dropping a0000-affff
> adding a0000-affff offset 30000 ram 40040000
> dropping a0000-affff
> adding a0000-affff offset 40000 ram 40040000
> dropping a0000-affff
> adding a0000-affff offset 30000 ram 40040000
> dropping a0000-affff
> adding a0000-affff offset 40000 ram 40040000
> dropping a0000-affff
> adding a0000-affff offset 30000 ram 40040000
> dropping a0000-affff
> adding a0000-affff offset 40000 ram 40040000
> dropping a0000-affff
> adding a0000-affff offset 30000 ram 40040000
> dropping a0000-affff
> adding a0000-affff offset 40000 ram 40040000
> dropping a0000-affff
> adding a0000-affff offset 30000 ram 40040000
> dropping a0000-affff
> adding a0000-affff offset 40000 ram 40040000
> dropping a0000-affff
> adding a0000-affff offset 30000 ram 40040000
> dropping a0000-affff
> adding a0000-affff offset 40000 ram 40040000
> dropping a0000-affff
> adding a0000-affff offset 30000 ram 40040000
> dropping a0000-affff
> adding a0000-affff offset 40000 ram 40040000
> dropping a0000-affff
> adding a0000-affff offset 30000 ram 40040000
> dropping a0000-affff
> adding a0000-affff offset 40000 ram 40040000
> dropping a0000-affff
> adding a0000-affff offset 30000 ram 40040000
> dropping a0000-affff
> adding a0000-affff offset 40000 ram 40040000
> dropping a0000-affff
> adding a0000-affff offset 20000 ram 40040000
> dropping a0000-affff
> adding a0000-affff offset 30000 ram 40040000

I saw this as well and thought it should be fine. But it does not tell
you what is currently active when the guest runs.

> Note that drop/add is always paired (i.e. the guest never sees an
> unmapped area), and we always map the full 64k even though cirrus code
> manages each 32k bank individually. It looks optimal...
> we're probably not testing the same thing (either qemu or guest code).

This is what my instrumentation revealed:

map_linear_vram_bank 0
map 0                              (actually perform the mapping)
map_linear_vram_bank 1
map 1
4 a0000 0 7fe863a62000 1           (KVM_SET_USER_MEMORY_REGION)
4 a0000 10000 7fe863a72000 1
run                                (enter guest)
map_linear_vram_bank 0
map 0
map_linear_vram_bank 1
map 1
4 a0000 0 7fe863a72000 1
4 a0000 10000 7fe863a62000 1
run
map_linear_vram_bank 0
map 0
map_linear_vram_bank 1
map 1
4 a0000 0 7fe863a62000 1
run
map_linear_vram_bank 0
map 0
map_linear_vram_bank 1
map 1
run

So we suddenly get out of sync and enter the guest with an unmapped vram
segment. It takes a long time (in number of map changes) until the
region becomes mapped again.

Jan

--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html