On Fri, Sep 23, 2016 at 11:39 PM, Alex Elder <elder@xxxxxxxxxx> wrote: > On 09/19/2016 12:03 PM, Ilya Dryomov wrote: >> If stat request fails with something other than -ENOENT (which just >> means that we need to copyup), the original object request is never >> marked as done and therefore never completed. Fix this by moving the >> mark done + complete snippet from rbd_img_obj_parent_read_full() into >> rbd_img_obj_exists_callback(). The former remains covered, as the >> latter is its only caller (through rbd_img_obj_request_submit()). >> >> Signed-off-by: Ilya Dryomov <idryomov@xxxxxxxxx> > > Did you ever see this happen? No, this series was all just staring at the surroundings of rbd_img_obj_exists_submit(), prompted by David's patches. > > This little block of code (setting an error and marking a > request done) is found in two other spots in the code; maybe > it could be encapsulated. (That wouldn't have helped in > this case, however.) I'll go ahead and pull it into a new rbd_obj_request_error(). > > It took me a while to refresh on all the moving parts here. > Let me explain what I see. > > No matter what, when an object request gets submitted, it > must eventually have its result code set, get marked done, > and be completed by a call to rbd_obj_request_complete(). > > An image object gets submitted by rbd_img_obj_request_submit(). > Initially, rbd_img_request_submit() calls this for every object > request in the image request. If any errors occur here, the > image request is dropped, and in the process of cleaning up > the image request, its reference to its objects are dropped as > well. (Here it's possible some requests are never submitted, > and therefore never marked done... But I think that's OK.) > > There is one other place rbd_img_obj_request_submit() gets > called. Before getting to that, let's look at what it does. > > Simple image object requests (reads, or non-layered writes, > or layered writes that are known to not overlap the parent) > require no special treatment. The object request is submitted, > and it will eventually be marked done and be completed. > > For a layered write request, we need to know whether an image > object exists, and if we don't know we submit a stat call for > that object so we can find out. When that call completes, > rbd_img_obj_exists_callback() records the result (whether the > object exists), and then re-submits the original image object > request. This is the second spot rbd_img_obj_request_submit() > is called. If an error occurs at this stage, we need to mark > the original request done and complete it. > --> This is the location of the bug this patch fixes--the > original request was not getting marked done in the event > of an error. Correct. > > The last case is a call to rbd_img_obj_request_submit() for > a non-simple request in which we know the target image object > doesn't exist. So in that case we issue a read of the data > in the parent object backing the full (target) image object, > in rbd_img_obj_parent_read_full(). This is necessary so that > data can be supplied to the target OSD in a copyup request. > Previously, if an error occurred calling that function, the > original image object request would be marked done and completed. > Otherwise, that parent image would, when complete, cause > rbd_img_obj_parent_read_full_callback() to be called. This > patch changes that (discussed below). > > If an error occurs in rbd_img_obj_parent_read_full_callback(), > that function marks the original image object request done > and completes it. Otherwise the original request is converted > into a copyup request, and that gets submitted to the original > image object. Eventually rbd_osd_copyup_callback() is called, > which marks the original request done, and as a result, > rbd_osd_req_callback() completes that request. > > This covers all the cases. > > Now about the change... > > In in rbd_img_obj_parent_read_full() you have eliminated > setting the object request result, marking it done, and > completing it. Now that function will simply return an > error if one occurs. The only caller of that function is > rbd_img_obj_request_submit(), and as laid out above, we > know an error occurring there leads to the original image > object request being marked done and completed. > > Or rather, it's done properly provided the fix for the > bug in rbd_img_obj_exists_callback() is in place. > > So I guess I'd say this looks good, and in summary: > > Reviewed-by: Alex Elder <elder@xxxxxxxxxx> > > I have one minor suggestion below. > >> --- >> drivers/block/rbd.c | 24 +++++++++++++----------- >> 1 file changed, 13 insertions(+), 11 deletions(-) >> >> diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c >> index 027e0817a118..b247200a0f28 100644 >> --- a/drivers/block/rbd.c >> +++ b/drivers/block/rbd.c >> @@ -2805,10 +2805,6 @@ out_err: >> ceph_release_page_vector(pages, page_count); >> if (parent_request) >> rbd_img_request_put(parent_request); >> - obj_request->result = result; >> - obj_request->xferred = 0; >> - obj_request_done_set(obj_request); >> - > > You should change the comment at the top of this function so > it indicates that it is no longer this function that takes > care of passing the error on to the original image object > request. Done. Thanks, Ilya -- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html