Re: [PATCH 4/6] osd_client, rbd: update event interface for watch/notify2

On Wed, Jun 17, 2015 at 4:28 PM, Douglas Fuller <dfuller@xxxxxxxxxx> wrote:
>
>> On Jun 16, 2015, at 7:18 PM, Josh Durgin <jdurgin@xxxxxxxxxx> wrote:
>>
>> On 06/12/2015 08:56 AM, Douglas Fuller wrote:
>>>
>>> @@ -3132,6 +3132,26 @@ static void rbd_watch_cb(u64 ver, u64 notify_id, u8 opcode, s32 return_code,
>>>              rbd_warn(rbd_dev, "notify_ack ret %d", ret);
>>>  }
>>>
>>> +static void rbd_watch_error_cb(void *arg, u64 cookie, int err)
>>> +{
>>> +    struct rbd_device *rbd_dev = (struct rbd_device *)arg;
>>> +    int ret;
>>> +
>>> +    dout("%s: watch error %d on cookie %llu\n", rbd_dev->header_name,
>>> +            err, cookie);
>>> +    rbd_warn(rbd_dev, "%s: watch error %d on cookie %llu\n",
>>> +             rbd_dev->header_name, err, cookie);
>>> +
>>> +    /* reset watch */
>>> +    rbd_dev_refresh(rbd_dev);
>>> +    rbd_dev_header_unwatch_sync(rbd_dev);
>>> +    ret = rbd_dev_header_watch_sync(rbd_dev);
>>> +    BUG_ON(ret); /* XXX: was the image deleted? can we be more graceful? */
>>> +    rbd_dev_refresh(rbd_dev);
>>
>> Why refresh before and after unwatching? Only the second one seems
>> necessary.
>
> The first one isn’t strictly necessary; I can remove it if you want.
>
> If we get a watch error, we may very well have a situation in which we need to stop I/O to the device because the underlying image has been deleted or its features have changed. We don’t actually do that yet (we just print a warning message), but the extra refresh was to handle that case early, even before we bothered trying to re-establish the watch.

We should remove it, for consistency if nothing else.  Also, when you
are mapping an image, it's rbd_dev_header_watch_sync() that fails if the
header object doesn't exist, and so on.  So I'd rather have this fail in
the same place if an image gets deleted from under a client or something
else goes wrong, instead of having to keep in mind that it's the
get_size (or whatever) method that is called first on refresh and
expecting failures there.
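
To make the ordering concrete, here is a rough user-space sketch of the
reset path with the first refresh dropped.  It is not the kernel code:
struct rbd_device is cut down to a few made-up fields, the three helpers
are stubs that only share their names with the real ones, and
rbd_watch_error_reset() and record_call() are hypothetical names.  It
also returns the error where the patch has BUG_ON(ret), purely so that
it can run outside the kernel.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, cut-down stand-in for the kernel's struct rbd_device. */
struct rbd_device {
	const char *header_name;
	int header_exists;	/* 0 simulates an image deleted under the client */
	int watching;
	char trace[16];		/* order of helper calls: U, W, R */
};

/* Append a one-letter tag to the call trace, keeping it NUL-terminated. */
static void record_call(struct rbd_device *rbd_dev, const char *tag)
{
	strncat(rbd_dev->trace, tag,
		sizeof(rbd_dev->trace) - strlen(rbd_dev->trace) - 1);
}

/* Stub: tear down the watch. */
static void rbd_dev_header_unwatch_sync(struct rbd_device *rbd_dev)
{
	record_call(rbd_dev, "U");
	rbd_dev->watching = 0;
}

/* Stub: re-establish the watch; fails if the header object is gone. */
static int rbd_dev_header_watch_sync(struct rbd_device *rbd_dev)
{
	record_call(rbd_dev, "W");
	if (!rbd_dev->header_exists)
		return -ENOENT;
	rbd_dev->watching = 1;
	return 0;
}

/* Stub: re-read the image metadata. */
static int rbd_dev_refresh(struct rbd_device *rbd_dev)
{
	record_call(rbd_dev, "R");
	return rbd_dev->header_exists ? 0 : -ENOENT;
}

/*
 * Reset the watch after an error: unwatch, watch, then refresh once.
 * A missing header object is reported by rbd_dev_header_watch_sync(),
 * the same place it would fail when mapping an image.
 */
static int rbd_watch_error_reset(struct rbd_device *rbd_dev)
{
	int ret;

	assert(rbd_dev != NULL);

	rbd_dev_header_unwatch_sync(rbd_dev);
	ret = rbd_dev_header_watch_sync(rbd_dev);
	if (ret)
		return ret;	/* the patch has BUG_ON(ret) here */

	return rbd_dev_refresh(rbd_dev);
}
```

With header_exists set, the trace comes out as unwatch, watch, refresh.
With it cleared, the watch call returns -ENOENT and refresh is never
reached, so there is only one place to expect that failure.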

Thanks,

                Ilya