On 12/5/19 4:53 PM, Eric Blake wrote: > On 12/5/19 2:16 PM, John Snow wrote: > >>>> Last minute edit: hmm, actually, transaction action introduced in >>>> 4.2, so crash is not a regression, only broken >>>> block-dirty-bitmap-remove >>>> command is a regression... Maybe it's OK for stable. >>> >>> Libvirt REALLY wants to use transaction bitmap management (and require >>> qemu 4.2) for its incremental backup patches that Peter is almost ready >>> to merge in. I'm trying to ascertain: >>> >>> When exactly can this bug hit? Can you give a sequence of QMP commands >>> that will trigger it for easy reproduction? Are there workarounds (such >>> as checking that a bitmap exists prior to attempting to remove it)? If >>> it does NOT get fixed in rc5, how will libvirt be able to probe whether >>> the fix is in place? >>> >> >> It looks like: >> >> - You need to have at least one bitmap >> - You need to use transactional remove >> - you need to misspell the bitmap name >> - The remove will fail (return -EINVAL) but doesn't set errp >> - The caller chokes on incomplete information, state->bitmap is NULL > > So in libvirt's case, as long as libvirt manages bitmaps completely, > it's a bug in libvirt to request deletion of a bitmap that doesn't > exist. Or, someone messes with a qcow2 image of an offline guest behind > libvirt's back without updating libvirt's metadata of what bitmaps > should exist, and then if libvirt fails to check that a bitmap actually > exists, a user may be able to coerce libvirt into requesting a bitmap > deletion that will cause a qemu crash, but that's the user's fault for > going behind libvirt's back. Or, libvirt could add code that instead of > trying to blindly delete a bitmap, it first makes a QMP call to ensure > the bitmap still exists (annoying, but harmless even when the bug is > fixed), instead of blaming the bug on the user operating behind > libvirt's back. > > The bug is nasty, but feels to be enough of a corner case that I think > deferring to 5.0 with CC: stable (and then downstreams can backport it > at will) is the right approach; no need to hold up 4.2 if this is the > only flaw. But I'm also not opposed to it going in 4.2 if we have > anything else serious. > Further, the NASTIEST problem is with transactional remove, which is new to 4.2. Normal remove is also broken, but won't choke because it doesn't hold undo information. Vladimir, do you agree with this assessment? Do we have it correct? --js -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list