On Wed, May 02, 2018 at 03:57:13PM +0100, Dr. David Alan Gilbert wrote:
> * Peter Xu (peterx@xxxxxxxxxx) wrote:
> > On Fri, Apr 27, 2018 at 06:40:09PM +0800, Xiao Guangrong wrote:
> > >
> > >
> > > On 04/27/2018 05:31 PM, Peter Xu wrote:
> > > > On Fri, Apr 27, 2018 at 11:15:37AM +0800, Xiao Guangrong wrote:
> > > > >
> > > > >
> > > > > On 04/26/2018 10:01 PM, Eric Blake wrote:
> > > > > > On 04/26/2018 04:15 AM, guangrong.xiao@xxxxxxxxx wrote:
> > > > > > > From: Xiao Guangrong <xiaoguangrong@xxxxxxxxxxx>
> > > > > > >
> > > > > > > QEMU 2.13 enables strict check for compression & decompression to
> > > > > > > make the migration more robuster, that depends on the source to fix
> > > > > >
> > > > > > s/robuster/robust/
> > > > > >
> > > > >
> > > > > Will fix, thank you for pointing it out.
> > > > >
> > > > > > > the internal design which triggers the unexpected error conditions
> > > > > >
> > > > > > 2.13 hasn't been released yet. Why do we need a knob to explicitly turn
> > > > > > off strict checking? Can we not instead make 2.13 automatically smart
> > > > > > enough to tell if the incoming stream is coming from an older qemu
> > > > > > (which might fail if the strict checks are enabled) vs. a newer qemu
> > > > > > (the sender gave us what we need to ensure the strict checks are
> > > > > > worthwhile)?
> > > > > >
> > > > >
> > > > > Really smart.
> > > > >
> > > > > How about introduce a new command, MIG_CMD_DECOMPRESS_ERR_CHECK,
> > > > > the destination will do strict check if got this command (i.e, new
> > > > > QEMU is running on the source), otherwise, turn the check off.
> > > >
> > > > Why not we just introduce a compat bit for that? I mean something
> > > > like: 15c3850325 ("migration: move skip_section_footers",
> > > > 2017-06-28). Then we turn that check bit off for <=2.12.
> > > >
> > > > Would that work?
> > >
> > > I am afraid it can not. :(
> > >
> > > The compat bit only impacts local behavior, however, in this case, we
> > > need the source QEMU to tell the destination if it supports strict
> > > error check.
> >
> > My understanding is that the new compat bit will only take effect when
> > at destination.
> >
> > I'm not sure I'm thinking that correctly. I'll give some examples.
> >
> > When we migrate from <2.12 to 2.13, on 2.13 QEMU we'll possibly with
> > (using q35 as example, always) "-M pc-q35-2.12" to make the migration
> > work, so this will let the destination QEMU stop checking
> > decompressing errors. IMHO that's what we want so it's fine (forward
> > migration).
> >
> > When we migrate from 2.13 to <2.12, on 2.12 it'll always skip checking
> > decompression errors, so it's fine too even if we don't send some
> > compress-errored pages.
> >
> > Then, would this mean that the compat bit could work too just like
> > this patch? AFAIU the compat bit idea is very similar to current
> > patch, however we don't really need a new parameter to make things
> > complicated, we just let old QEMUs behave differently and
> > automatically, then user won't need to worry about manually specify
> > that parameter.
> I think you're saying just to wire it to the machine type for receive;
> that would work and would be fairly simple, although wouldn't provide
> the protection when going from new->new using an old machine type.

Yes. But actually we can still leverage the protection even with
new->new and old machine types - we just need to explicitly override
that parameter on both sides (instead of explicitly disabling it on the
old ones):

  -M pc-q35-2.12 -global migration.x-error-decompress-check=true

After all, the user already specified "-M pc-q35-2.12" explicitly rather
than using the default 2.13 one, so I would consider him/her an advanced
user. Then IMHO it would be acceptable to make this explicit too when
the user really wants that.
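
To make the precedence described above concrete, here is a minimal toy
model in Python. It is only a sketch under stated assumptions, not QEMU
code: the tables and the function resolve_decompress_check() are
hypothetical, and it assumes (as the command line above implies) that an
explicit -global setting wins over the machine-type compat default. Only
the property name and the machine type names come from this thread.

```python
from typing import Dict, Optional

# Hypothetical built-in default for a new QEMU: strict checking enabled.
DEFAULT_PROPS: Dict[str, bool] = {"x-error-decompress-check": True}

# Hypothetical per-machine-type compat table: old machine types switch
# the check off, the newest machine type keeps the default.
COMPAT_PROPS: Dict[str, Dict[str, bool]] = {
    "pc-q35-2.12": {"x-error-decompress-check": False},
    "pc-q35-2.13": {},
}


def resolve_decompress_check(
    machine_type: str,
    global_overrides: Optional[Dict[str, bool]] = None,
) -> bool:
    """Return whether the destination would check decompression errors.

    Layers are applied in order: built-in default, then machine-type
    compat, then explicit user overrides (assumed to take precedence).
    """
    props = dict(DEFAULT_PROPS)
    props.update(COMPAT_PROPS.get(machine_type, {}))
    props.update(global_overrides or {})
    return props["x-error-decompress-check"]
```

In this model, "pc-q35-2.12" alone resolves to no checking, while adding
the explicit override turns the check back on, which is the new->new
case with an old machine type.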
(Will that happen a lot when people still use old machine types even if
they are creating new VMs?)

--
Peter Xu