On 9 February 2012 22:23, Peter Maydell <peter.maydell@xxxxxxxxxx> wrote: > Ping re the VMState and variable sized arrays issue. I don't > see any consensus in this discussion for a different approach, > so should we just commit Mitsyanko's patchset? >From an IRC conversation I just had with Anthony and Juan: ===begin== 14:51 < pm215> quintela: do you have an opinion on the vmstate variable- length array stuff (needed for sd card) ? 14:51 < quintela> pm havent looked, email title? 14:52 < pm215> "KVM call agenda for tuesday 31" is the most recent discussion :-) 14:53 < pm215> http://patchwork.ozlabs.org/patch/137732/ and http://patchwork.ozlabs.org/patch/137733/ are the relevant patches 14:54 < quintela> pm215: found it, that it is a difficult thing to do (TM) 14:54 < quintela> it should be on the "card" file, or whatever :-( 14:55 < quintela> notice the "should" part. 14:55 < pm215> I'm not sure what you mean, can you elaborate? 14:57 < quintela> pm215: sect is number of sectors, right? 14:58 < pm215> yes 14:59 < quintela> so, a 1GB card would have around 8MB array? 14:59 < quintela> took or left some byties here andthere. 14:59 < quintela> bytes indeed. 14:59 < quintela> I _think_ that we should put that in a save_live section, but that is just me (TM) 15:00 < quintela> I guess that at some point, people are going to need bigger SD cards (16GB are already on the wild, right) 15:00 < quintela> and that would make live migration just impossible 15:00 < quintela> or very slow, that is completely equivalent. 15:01 < quintela> it is my understanding that AHCI is using similar code, or did I missread some of the information? 15:03 < pm215> I think alex would like ahci to use a similar 'variable length array' thing, but in that case the array is much smaller 15:03 < aliguori> pm215, it's large but of bounded size, right? 15:03 < aliguori> for SD cards? 15:03 < quintela> aliguori: number of sectors * 4 bytes 15:03 < quintela> aliguori: so hugeeeeeeeeeeeeeeeeee 15:04 < quintela> 8MB array for each 1GB of disk. 15:04 < quintela> but or take some bytes. 15:04 < aliguori> quintela, you cannot save that much data via savevm 15:04 < quintela> this is more than all other devices combined in a normal instalaltion. 15:04 < aliguori> that's too much 15:04 < quintela> aliguori: see my answer, we need a save_live section, really. 15:04 < aliguori> it will screw up the live migration downtime algorithm 15:04 < aliguori> pm215, ^ 15:04 < aliguori> or just treat it as a ram section 15:05 < aliguori> qemu_ram_alloc() it, and call it a day 15:05 < pm215> the spec isn't very clear, but I think technically this info should go in the sd card image, except there's no way to tack additional info into a raw file 15:05 < aliguori> pm215, yeah, qemu_ram_alloc() is the way to go I think, that makes it effectively volatile on-card RAM 15:05 < quintela> pm215: I fully agree that it should go into the card image, but ..... no space for it :-( 15:05 < aliguori> which i think makes sense 15:05 < quintela> pm215: another thing, forgetting about migration at all. 15:06 < quintela> how does this work if you stop marchine and restart it again? 15:06 * quintela guess that it is stored somewhere? 15:06 < quintela> s/marchine/machine/ 15:06 < pm215> no, we just assume that any fresh sd card image has no write-protect set up 15:07 < quintela> pm215: what is stored on that image? /me would have assumed that wearing information 15:07 < quintela> but that is without reading the whole code. 15:09 < quintela> humm, it looks like only 1 bit is used for each sector, why are we storing 32 bits if we only use 1 bit? 15:09 < pm215> it's write-protect : you can set parts of the sd card to not be writeable (with a granularity of a "write-protect group size") 15:09 < pm215> we probably don't implement fantastically efficiently 15:10 < quintela> pm215: ok, only 1 bit is needed. 15:10 < quintela> we can move to 1bit/sector (8 times smaller) 15:10 < quintela> but I still think that doing the qemu_ram_alloc() trick that aliguori pointed is the easiest way to fix it 15:11 < quintela> you can create a ram_save_live section, but it is going to be more complex for almost no gain 15:11 < pm215> ah, so we qemu_ram_alloc() it and then the contents get transferred in the same way as main memory ? 15:12 < pm215> ...that is in exec-obsolete.h and marked as "to be removed soon"... 15:13 < aliguori> pm215, yeah, so you'll need to create it using whatever the new fancy memory api interface is 15:13 < aliguori> pm215, note that whenever you touch that memory, you have to set the dirty bits appropriately 15:13 < aliguori> or else live migration won't work 15:14 < quintela> aliguori: if they have to touch the dirty bits, it is equivalent to do their own save_live section. 15:14 < quintela> but I agree that this is the only "easy" solution. 15:17 < pm215> doesn't sound too hard... 15:18 < quintela> as usual with vmstate, problem is testing (althought shouldn't be very difficult, either) ===endit=== Short summary: * switch wp groups to bitfield rather than int array * convert sd.c to use memory_region_init_ram() to allocate the wp groups (being careful to use memory_region_set_dirty() when we touch them) * we don't need variable-length fields for sd.c any more * rest of the vmstate conversion is straightforward -- PMM -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html