On Sat, Jan 21, 2017 at 1:18 PM, Maged Mokhtar <mmokhtar@xxxxxxxxxxx> wrote:
> Hi,
>
> If a host with a kernel-mapped rbd image dies, it still keeps a watch on
> the rbd image header for a timeout that seems to be determined by
> ms_tcp_read_timeout (default 15 minutes) rather than
> osd_client_watch_timeout, whereas according to the docs: "If the client
> loses its connection to the primary OSD for a watched object, the watch
> will be removed after a timeout configured with osd_client_watch_timeout."
>
> It is possible to force watch removal by blacklisting the failed host, but
> I was wondering if the above timeout is the correct behavior. This is
> using the latest 10.2.5.

Yeah, it can do that in some cases because kernels up to 4.6 use the
old watch-notify protocol. If you upgrade the kernel client to 4.7 or
higher, all watches should get removed after osd_client_watch_timeout.

Thanks,

                Ilya
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
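
For anyone who wants to check a client against the 4.7 cutoff mentioned in the reply, here is a minimal sketch. The helper names (parse_kernel_release, uses_new_watch_notify) are hypothetical and not part of any Ceph tooling. It also assumes the release string reflects the upstream kernel version, which may not hold for vendor kernels that carry backports.

```python
import platform
import re
from typing import Tuple

# Cutoff taken from the reply: kernels up to 4.6 use the old watch-notify
# protocol, 4.7 or higher the new one. (Assumption: upstream version numbers.)
NEW_WATCH_NOTIFY_MIN = (4, 7)


def parse_kernel_release(release: str) -> Tuple[int, int]:
    """Extract (major, minor) from a release string such as '4.4.0-59-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    return int(match.group(1)), int(match.group(2))


def uses_new_watch_notify(release: str) -> bool:
    """True if the kernel version is 4.7 or higher (hypothetical helper)."""
    return parse_kernel_release(release) >= NEW_WATCH_NOTIFY_MIN


if __name__ == "__main__":
    release = platform.release()  # kernel release of the local host
    if uses_new_watch_notify(release):
        print(release, "- watches should be removed after osd_client_watch_timeout")
    else:
        print(release, "- old watch-notify protocol; a dead client's watch may linger longer")
```

Run it on the client that maps the image, not on the OSD hosts, since the kernel client version is what matters here.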