Re: [PATCH 4/8] rbd: move bumping img_request refcount into rbd_obj_request_submit()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Sep 28, 2016 at 2:22 AM, Alex Elder <elder@xxxxxxxx> wrote:
> On 09/26/2016 09:38 AM, Ilya Dryomov wrote:
>> On Sun, Sep 25, 2016 at 5:56 PM, Alex Elder <elder@xxxxxxxxxx> wrote:
>>> On 09/19/2016 12:03 PM, Ilya Dryomov wrote:
>>>> Commit 0f2d5be792b0 ("rbd: use reference counts for image requests")
>>>> added rbd_img_request_get(), which rbd_img_request_fill() calls for
>>>> each obj_request added to img_request.  It was an urgent band-aid for
>>>> the uglyness that is rbd_img_obj_callback() and none of the error paths
>>>> were updated.
>>>
>>> I'm not sure how to improve that function.  Object requests in
>>> an image request need to be completed in sequential order,
>>> because blk_update_request() requires that.  (Or maybe you
>>> were referring to something else you find ugly.)
>>
>> Quoting 0f2d5be792b0 commit message:
>>
>> "There is a race here, however.  The instant an object request is
>> marked "done" it can be provided (by a thread handling completion of
>> one of its predecessor operations) to rbd_img_obj_end_request(),
>> which (for the last request) can then lead to the image request
>> getting torn down.  And this can happen *before* that object has
>> itself entered rbd_img_obj_end_request().  As a result, once it
>> *does* enter that function, the image request (and even the object
>> request itself) may have been freed and become invalid."
>>
>> "... we don't want rbd_img_request_complete() to necessarily
>> result in the image request being destroyed, because it may
>> get called before we've finished processing on all of its
>> object requests."
>>
>> This "I'm called for obj_request1, but I'll also complete obj_request2
>> along with the entire img_request if I get a chance, before I'm called
>> for obj_request2" behaviour is *really* *really* sneaky.
>
> OK, I understand that.  (And I think you meant "...before I'm
> called for obj_request1".)
>
>> I think a simple atomic, decremented for each obj_request completion
>> would do.  When it hits 0, we say "OK, this img_request is finished"
>> and call rbd_img_obj_end_request() for each obj_request in order.  I'm
>> not sure we are getting anything from calling blk_update_request() in
>> this fashion, at the expense of hard to comprehend buggy code.
>
> OK, now you're talking about something different, which is to
> change how we handle image object request completions.

Yeah, how we handle image object request completions is the root of the
problem here.  It's the reason we have rbd_img_request_get/put() in the
first place.

> Practically speaking, I think you're right that there would
> be very little lost by handling an image request's object
> request completions all at once rather than as quickly as
> possible.
>
> I think the get of the img_request in the error paths looks
> a little funny.  It'll now be encapsulated in another function,
> so maybe you can add a small comment there to say why you need
> to do that--to match what the submit path does (or to match
> the reference dropped in the completion path).

I've just re-pushed with the helper.

>
> I have no argument with what you're doing, and I unfortunately
> feel less confident (maybe 10% less) in my walk-through of
> all the affected code than I normally am on your patches...
> But I'll call that good enough.  I only have small blocks of
> time right now and I don't want to delay things.
>
> Reviewed-by: Alex Elder <elder@xxxxxxxxxx>

Thanks for the review!

                Ilya
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [CEPH Users]     [Ceph Large]     [Information on CEPH]     [Linux BTRFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]
  Powered by Linux