On 10/16/2015 04:50 AM, Ilya Dryomov wrote: > On Thu, Oct 15, 2015 at 11:10 PM, Alex Elder <elder@xxxxxxxx> wrote: >> On 10/11/2015 01:03 PM, Ilya Dryomov wrote: >>> Currently we leak parent_spec and trigger a "parent reference >>> underflow" warning if rbd_dev_create() in rbd_dev_probe_parent() fails. >>> The problem is we take the !parent out_err branch and that only drops >>> refcounts; parent_spec that would've been freed had we called >>> rbd_dev_unparent() remains and triggers rbd_warn() in >>> rbd_dev_parent_put() - at that point we have parent_spec != NULL and >>> parent_ref == 0, so counter ends up being -1 after the decrement. >> >> OK, now that I understand the context... >> >> You now get extra references for the spec and client >> for the parent only after creating the parent device. >> I think the reason they logically belonged before >> the call to rbd_device_create() for the parent is >> because the client and spec pointers passed to that >> function carry with them references that are passed >> to the resulting rbd_device if successful. > > Let me stress that those two get()s that I moved is not the point of > the patch at all. Whether we do it before or after rbd_dev_create() is > pretty much irrelevant, the only reason I moved those calls is to make > the error path simpler. Yes I understand that. >> If creating the parent device fails, you unparent the >> original device, which will still have a null parent >> pointer. The effect of unparenting in this case is >> dropping a reference to the parent's spec, and clearing >> the device's pointer to it. This is confusing, but >> let's run with it. >> >> If creating the parent device succeeds, references to >> the client and parent spec are taken (basically, these >> belong to the just-created parent device). The parent >> image is now probed. If this fails, you again >> unparent the device. We still have not set the >> device's parent pointer, so the effect is as before, >> dropping the parent spec reference and clearing >> the device's reference to it. The error handling >> now destroys the parent, which drops references to >> its client and the spec. Again, this seems >> confusing, as if we've dropped one more reference >> to the parent spec than desired. >> >> This logic now seems to work. But it's working >> around a flaw in the caller I think. Upon entry >> to rbd_dev_probe_parent(), a layered device will >> have have a non-null parent_spec pointer (and a >> reference to it), which will have been filled in >> by rbd_dev_v2_parent_info(). >> >> Really, it should not be rbd_dev_probe_parent() >> that drops that parent spec reference on error. >> Instead, rbd_dev_image_probe() (which got the >> reference to the parent spec) should handle >> cleaning up the device's parent spec if an >> error occurs after it has been assigned. >> >> I'll wait for your response, I'd like to know >> if what I'm saying makes sense. > > All of the above makes sense. I agree that it is asymmetrical and that > it would have been better to have rbd_dev_image_probe() drop the last > ref on ->parent_spec. But, that's not what all of the existing code is > doing. Agreed. > Consider rbd_dev_probe_parent() without my patch. There are two > out_err jumps. As it is, if rbd_dev_create() fails, we only revert > those two get()s and return with alive ->parent_spec. However, if > rbd_dev_image_probe() fails, we call rbd_dev_unparent(), which will put > ->parent_spec. Really, the issue is rbd_dev_unparent(), which not only > removes the parent rbd_dev, but also the parent spec. All I did with > this patch was align both out_err cases to do the same, namely undo the > effects of rbd_dev_v2_parent_info() - I didn't want to create yet > another error path. OK. Walking through the code I did concluded that what you did made things "line up" properly. But I just felt like it wasn't necessarily the *right* fix, but it was a correct fix. The only point in suggesting what I think would be a better fix is that it would make things easier to reason about and get correct, and in so doing, make it easier to maintain and adapt in the future. But I have no objection to your fix, so please accept this for your original patch: Reviewed-by: Alex Elder <elder@xxxxxxxxxx> > Thanks, > > Ilya > -- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html