Re: Not timing out watcher

Hello Jason, thank you for your prompt reply.

My setup is very simple: one CentOS 7.4 VM acts as the storage node and runs the latest 12.2.2 Luminous, and a second VM, Ubuntu 16.04.3 at 192.168.80.235, runs a local Kubernetes cluster built from master.

On the client side I have ceph-common installed, and I copied the config and keyrings from the storage node to /etc/ceph.
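
For reference, what I copied over was roughly the following (the storage node hostname and the exact keyring name below are just placeholders, I am using the admin keyring purely as an illustration):

scp storage-node:/etc/ceph/ceph.conf /etc/ceph/ceph.conf
scp storage-node:/etc/ceph/ceph.client.admin.keyring /etc/ceph/ceph.client.admin.keyring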

While running my PR I noticed that rbd map was failing on a just-rebooted VM because rbdStatus was finding an active watcher. Even adding a 30-second wait did not help, as the watcher was not timing out at all, even with no image mapped.
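
In case it helps, this is roughly how I have been looking at the watcher from the client side; the header object name below assumes a format 1 image (<image>.rbd), for a format 2 image it would be rbd_header.<id>:

sudo rbd status raw-volume --pool kubernetes
sudo rados -p kubernetes listwatchers raw-volume.rbd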

Regarding your format 1 comment: I did try format v2, but it was failing to map due to differences in capabilities, so per rootfs's suggestion I switched back to v1. Once the watcher issue is resolved I can switch back to v2 to show the exact issue I hit.
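
When I re-test with v2 I will most likely create the image with just the layering feature so the kernel client can map it, something along these lines (flags from memory, so treat it as a sketch rather than exactly what I ran):

sudo rbd create kubernetes/raw-volume --size 10240 --image-format 2 --image-feature layering

or, if the image already exists with the default v2 features, disable the ones the kernel client does not support:

sudo rbd feature disable kubernetes/raw-volume exclusive-lock object-map fast-diff deep-flatten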

Please let me know if you need any additional info.

Thank you
Serguei

On 2017-12-20, 10:39 AM, "Jason Dillaman" <jdillama@xxxxxxxxxx> wrote:

    Can you please provide steps to repeat this scenario? What is/was the
    client running on the host at 192.168.80.235 and how did you shut down
    that client? In your PR [1], it showed a different client as a watcher
    ("192.168.80.235:0/34739158 client.64354 cookie=1"), so how did the
    previous entry get cleaned up?
    
    BTW -- unrelated, but k8s should be creating RBD image format 2 images
    [2]. Was that image created using an older version of k8s or did you
    override your settings to pick the deprecated v1 format?
    
    [1] https://github.com/kubernetes/kubernetes/pull/56651#issuecomment-352850884
    [2] https://github.com/kubernetes/kubernetes/pull/51574
    
    On Wed, Dec 20, 2017 at 10:24 AM, Serguei Bezverkhi (sbezverk)
    <sbezverk@xxxxxxxxx> wrote:
    > Hello,
    >
    > I hit an issue with the latest Luminous where a watcher does not time out even though the image is not mapped. Something similar seems to have been reported in 2016; here is the link:
    > http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-August/012140.html
    > Has it been fixed? Appreciate some help here.
    > Thank you
    > Serguei
    >
    > date; sudo rbd status raw-volume --pool kubernetes
    > Wed Dec 20 10:04:19 EST 2017
    > Watchers:
    >         watcher=192.168.80.235:0/3789045165 client.64439 cookie=1
    > date; sudo rbd status raw-volume --pool kubernetes
    > Wed Dec 20 10:04:51 EST 2017
    > Watchers:
    >         watcher=192.168.80.235:0/3789045165 client.64439 cookie=1
    > date; sudo rbd status raw-volume --pool kubernetes
    > Wed Dec 20 10:05:14 EST 2017
    > Watchers:
    >         watcher=192.168.80.235:0/3789045165 client.64439 cookie=1
    >
    > date; sudo rbd status raw-volume --pool kubernetes
    > Wed Dec 20 10:07:24 EST 2017
    > Watchers:
    >         watcher=192.168.80.235:0/3789045165 client.64439 cookie=1
    >
    > sudo ls /dev/rbd*
    > ls: cannot access '/dev/rbd*': No such file or directory
    >
    > sudo rbd info raw-volume --pool kubernetes
    > rbd image 'raw-volume':
    >         size 10240 MB in 2560 objects
    >         order 22 (4096 kB objects)
    >         block_name_prefix: rb.0.fafa.625558ec
    >         format: 1
    >
    >
    >
    > _______________________________________________
    > ceph-users mailing list
    > ceph-users@xxxxxxxxxxxxxx
    > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
    
    
    
    -- 
    Jason
    

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


