The new request code arranges to get a callback for every osd request we submit (this was not the case previously). We register a lingering object watch request for the header object for each mapped rbd image. If a connection problem occurs, the osd client will re-submit lingering requests. And each time such a request is re-submitted, its callback function will get called again. We therefore need to ensure the object request associated with the lingering osd request stays valid, and the way to do that is to have an extra reference to the lingering osd request. So when a request to initiate a watch has completed, do not drop a reference as one normally would. Instead, hold off dropping that reference until the request to tear down that watch request is done. Also, only set the rbd device's watch_request pointer after the watch request has been completed successfully, and clear the pointer once it's been torn down. Signed-off-by: Alex Elder <elder@xxxxxxxxxxx> --- drivers/block/rbd.c | 31 ++++++++++++++++++++++--------- 1 file changed, 22 insertions(+), 9 deletions(-) diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c index 340773f..177ba0c 100644 --- a/drivers/block/rbd.c +++ b/drivers/block/rbd.c @@ -1716,6 +1716,7 @@ static int rbd_dev_header_watch_sync(struct rbd_device *rbd_dev, int start) &rbd_dev->watch_event); if (ret < 0) return ret; + rbd_assert(rbd_dev->watch_event != NULL); } ret = -ENOMEM; @@ -1735,32 +1736,44 @@ static int rbd_dev_header_watch_sync(struct rbd_device *rbd_dev, int start) if (!obj_request->osd_req) goto out_cancel; - if (start) { + if (start) ceph_osdc_set_request_linger(osdc, obj_request->osd_req); - rbd_dev->watch_request = obj_request; - } else { + else ceph_osdc_unregister_linger_request(osdc, rbd_dev->watch_request->osd_req); - rbd_dev->watch_request = NULL; - } ret = rbd_obj_request_submit(osdc, obj_request); if (ret) goto out_cancel; ret = rbd_obj_request_wait(obj_request); if (ret) goto out_cancel; - ret = obj_request->result; if (ret) goto out_cancel; - if (start) - goto done; /* Done if setting up the watch request */ + /* + * Since a watch request is set to linger the osd client + * will hang onto it in case it needs to be re-sent in the + * event of connection loss. If we're initiating the watch + * we therefore do *not* want to drop our reference to the + * object request now; we'll effectively transfer ownership + * of it to the osd client instead. Instead, we'll drop + * that reference when the watch request gets torn down. + */ + if (start) { + rbd_dev->watch_request = obj_request; + + return 0; + } + + /* We have successfully torn down the watch request */ + + rbd_obj_request_put(rbd_dev->watch_request); + rbd_dev->watch_request = NULL; out_cancel: /* Cancel the event if we're tearing down, or on error */ ceph_osdc_cancel_event(rbd_dev->watch_event); rbd_dev->watch_event = NULL; -done: if (obj_request) rbd_obj_request_put(obj_request); -- 1.7.9.5 -- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html