On 02/21/2012 07:33 PM, Peter Maydell wrote:
On 9 February 2012 22:23, Peter Maydell<peter.maydell@xxxxxxxxxx> wrote:
Ping re the VMState and variable sized arrays issue. I don't
see any consensus in this discussion for a different approach,
so should we just commit Mitsyanko's patchset?
From an IRC conversation I just had with Anthony and Juan:
===begin==
14:51< pm215> quintela: do you have an opinion on the vmstate variable-
length array stuff (needed for sd card) ?
14:51< quintela> pm havent looked, email title?
14:52< pm215> "KVM call agenda for tuesday 31" is the most recent
discussion :-)
14:53< pm215> http://patchwork.ozlabs.org/patch/137732/ and
http://patchwork.ozlabs.org/patch/137733/ are the relevant patches
14:54< quintela> pm215: found it, that it is a difficult thing to do (TM)
14:54< quintela> it should be on the "card" file, or whatever :-(
14:55< quintela> notice the "should" part.
14:55< pm215> I'm not sure what you mean, can you elaborate?
14:57< quintela> pm215: sect is number of sectors, right?
14:58< pm215> yes
14:59< quintela> so, a 1GB card would have around 8MB array?
14:59< quintela> took or left some byties here andthere.
14:59< quintela> bytes indeed.
14:59< quintela> I _think_ that we should put that in a save_live
section, but that is just me (TM)
15:00< quintela> I guess that at some point, people are going to need
bigger SD cards (16GB are already on the wild, right)
15:00< quintela> and that would make live migration just impossible
15:00< quintela> or very slow, that is completely equivalent.
15:01< quintela> it is my understanding that AHCI is using similar code,
or did I missread some of the information?
15:03< pm215> I think alex would like ahci to use a similar 'variable
length array' thing, but in that case the array is much smaller
15:03< aliguori> pm215, it's large but of bounded size, right?
15:03< aliguori> for SD cards?
15:03< quintela> aliguori: number of sectors * 4 bytes
15:03< quintela> aliguori: so hugeeeeeeeeeeeeeeeeee
15:04< quintela> 8MB array for each 1GB of disk.
15:04< quintela> but or take some bytes.
15:04< aliguori> quintela, you cannot save that much data via savevm
15:04< quintela> this is more than all other devices combined in a
normal instalaltion.
15:04< aliguori> that's too much
15:04< quintela> aliguori: see my answer, we need a save_live section,
really.
15:04< aliguori> it will screw up the live migration downtime algorithm
15:04< aliguori> pm215, ^
15:04< aliguori> or just treat it as a ram section
15:05< aliguori> qemu_ram_alloc() it, and call it a day
15:05< pm215> the spec isn't very clear, but I think technically this
info should go in the sd card image, except there's no way to tack
additional info into a raw file
15:05< aliguori> pm215, yeah, qemu_ram_alloc() is the way to go I think,
that makes it effectively volatile on-card RAM
15:05< quintela> pm215: I fully agree that it should go into the card
image, but ..... no space for it :-(
15:05< aliguori> which i think makes sense
15:05< quintela> pm215: another thing, forgetting about migration at all.
15:06< quintela> how does this work if you stop marchine and restart
it again?
15:06 * quintela guess that it is stored somewhere?
15:06< quintela> s/marchine/machine/
15:06< pm215> no, we just assume that any fresh sd card image has no
write-protect set up
15:07< quintela> pm215: what is stored on that image? /me would have
assumed that wearing information
15:07< quintela> but that is without reading the whole code.
15:09< quintela> humm, it looks like only 1 bit is used for each sector,
why are we storing 32 bits if we only use 1 bit?
15:09< pm215> it's write-protect : you can set parts of the sd card to
not be writeable (with a granularity of a "write-protect group size")
15:09< pm215> we probably don't implement fantastically efficiently
15:10< quintela> pm215: ok, only 1 bit is needed.
15:10< quintela> we can move to 1bit/sector (8 times smaller)
15:10< quintela> but I still think that doing the qemu_ram_alloc()
trick that aliguori pointed is the easiest way to fix it
15:11< quintela> you can create a ram_save_live section, but it is
going to be more complex for almost no gain
15:11< pm215> ah, so we qemu_ram_alloc() it and then the contents get
transferred in the same way as main memory ?
15:12< pm215> ...that is in exec-obsolete.h and marked as "to be
removed soon"...
15:13< aliguori> pm215, yeah, so you'll need to create it using
whatever the new fancy memory api interface is
15:13< aliguori> pm215, note that whenever you touch that memory, you
have to set the dirty bits appropriately
15:13< aliguori> or else live migration won't work
15:14< quintela> aliguori: if they have to touch the dirty bits, it is
equivalent to do their own save_live section.
15:14< quintela> but I agree that this is the only "easy" solution.
15:17< pm215> doesn't sound too hard...
15:18< quintela> as usual with vmstate, problem is testing (althought
shouldn't be very difficult, either)
===endit===
Short summary:
* switch wp groups to bitfield rather than int array
* convert sd.c to use memory_region_init_ram() to allocate the wp groups
(being careful to use memory_region_set_dirty() when we touch them)
* we don't need variable-length fields for sd.c any more
* rest of the vmstate conversion is straightforward
-- PMM
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
OK, it turned out to be not so simple, we can't use memory API in sd.c
because TARGET_PHYS_ADDR_BITS value (and, consequently,
target_phys_addr_t) is not defined for common objects.
--
Mitsyanko Igor
ASWG, Moscow R&D center, Samsung Electronics
email: i.mitsyanko@xxxxxxxxxxx
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html