On 12/6/19 9:36 AM, Eric Blake wrote: > [adding in Peter Maydell, as there is now potential talk of other > 4.2-worthy patches] > > On 12/6/19 4:18 AM, Vladimir Sementsov-Ogievskiy wrote: >> 05.12.2019 23:16, John Snow wrote: >>> >>> >>> On 12/5/19 3:09 PM, Eric Blake wrote: >>>> On 12/5/19 1:30 PM, Vladimir Sementsov-Ogievskiy wrote: >>>>> Here is double bug: >>>>> >>>>> First, return error but not set errp. This may lead to: >>>>> qmp block-dirty-bitmap-remove may report success when actually failed >>>>> >>>>> block-dirty-bitmap-remove used in a transaction will crash, as >>>>> qmp_transaction will think that it returned success and will cal >>>> >>>> call >>>> >>>>> block_dirty_bitmap_remove_commit which will crash, as state->bitmap is >>>>> NULL >>>>> >>>>> Second (like in anecdote), this case is not an error at all. As it is >>>>> documented in the comment above bdrv_co_remove_persistent_dirty_bitmap >>>>> definition, absence of bitmap is not an error, and similar case >>>>> handled >>>>> at start of qcow2_co_remove_persistent_dirty_bitmap, it returns 0 when >>>>> there is no bitmaps at all.. >>>> >>>> double . >>>> >>>>> >>>>> But when there are some bitmaps, but not the requested one, it return >>>>> error with errp unset. >>>>> >>>>> Fix that. >>>>> >>>>> Fixes: b56a1e31759b750 >>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@xxxxxxxxxxxxx> >>>>> --- >>>>> >>>>> Hi all! >>>>> >>>>> Ohm, suddenly we faced this bug. It's a regression in 4.2. I'm very >>>>> sorry for introducing it, and it sad that it's found so late.. >>>>> >>>>> Personally, I think that this one worth rc5, as it makes new bitmap >>>>> interfaces unusable. But the decision is yours. >>>>> >>>>> Last minute edit: hmm, actually, transaction action introduced in >>>>> 4.2, so crash is not a regression, only broken >>>>> block-dirty-bitmap-remove >>>>> command is a regression... Maybe it's OK for stable. >>>> >>>> Libvirt REALLY wants to use transaction bitmap management (and require >>>> qemu 4.2) for its incremental backup patches that Peter is almost ready >>>> to merge in. I'm trying to ascertain: >>>> >>>> When exactly can this bug hit? Can you give a sequence of QMP commands >>>> that will trigger it for easy reproduction? Are there workarounds >>>> (such >>>> as checking that a bitmap exists prior to attempting to remove it)? If >>>> it does NOT get fixed in rc5, how will libvirt be able to probe whether >>>> the fix is in place? >>>> >>> >>> It looks like: >>> >>> - You need to have at least one bitmap >>> - You need to use transactional remove >>> - you need to misspell the bitmap name >>> - The remove will fail (return -EINVAL) but doesn't set errp >>> - The caller chokes on incomplete information, state->bitmap is NULL >> >> >> No, that would be too simple, the thing is worse. Absolutele correct >> removing is broken, without any misspelling >> >> Bug triggers when we are removing persistent bitmap that is not stored >> yet in the image AND at least one (another) bitmap already stored in >> the image. So, something like: >> >> 1. create persistent bitmap A >> 2. shutdown vm (bitmap A is synced) >> 3. start vm >> 4. create persistent bitmap B >> 5. remove bitmap B - it fails (and crashes if in transaction) >> >> ==== >> >> Hmm, workaround.. >> >> I'm afraid that the only thing is not remove persistent bitmaps, which >> were never synced to the image. So, instead the sequence above, we need >> >> >> 1. create persistent bitmap A >> 2. shutdown vm >> 3. start vm >> 4. create persistent bitmap B >> 5. remember, that we want to remove bitmap B after vm shutdown >> ... >> some other operations >> ... >> 6. vm shutdown >> 7. start vm in stopped mode, and remove all bitmaps marked for removing >> 8. stop vm >> >> But, I think that in real circumstances, vm shutdown is rare thing... > > This is sounding a bit more serious. As I said earlier, it shouldn't > delay 4.2 on its own, but if the fix is obvious (and other than > comments, it is a single change from 'ret = -EINVAL' to 'ret = 0' which > fixes a definite reproducible crash), I think it rises to the level of > acceptable. > > I've been so worried about the question of which release, that I don't > know if I've previously offered: > Reviewed-by: Eric Blake <eblake@xxxxxxxxxx> > Oh, that is quite a bit more serious than I thought too. Yeah, I want this in 4.2 if at all possible. Reviewed-by: John Snow <jsnow@xxxxxxxxxx> -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list