Re: bug #10915 client: hangs on umount if it had an MDS session evicted

On 22 March 2018 at 17:40, John Spray <jspray@xxxxxxxxxx> wrote:
> The client only gets osdmap updates when it tries to communicate with
> an OSD, and the OSD tells it that its current map epoch is too old.
>
> In the case that the client isn't doing any data operations (i.e. no
> osd ops), then the client doesn't find out that it's blacklisted.  But
> that's okay, because the client's awareness of its own
> blacklisted-ness should only be needed in the case that there is some
> dirty data that needs to be thrown away in the special if(blacklisted)
> paths.
>
> So if it's not hanging on any OSD operations (those operations would
> have resulted in an updated osdmap), the question is what is it
> hanging on?  Is it trying to open a new session with the MDS?

It looks like the client still "thinks" that it has a session open (the
condition at [1] was false when I checked it myself), so it goes on to
wait for a reply at [2]. This is exactly where it hangs. I have written
a fix and raised a PR for it [3]. Basically, it replaces
caller_cond.Wait(client_lock) with
caller_cond.WaitInterval(client_lock, utime_t(10, 0)), so the wait times
out after 10 seconds instead of blocking indefinitely.
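
To illustrate the idea of a bounded wait (this is not the code from the
PR), here is a minimal standalone sketch. It assumes
std::condition_variable as a stand-in for Ceph's Cond, and the names
ReplyWaiter, got_reply, session_lost and wait_for_reply are hypothetical:

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical stand-in for the state a caller blocks on while waiting
// for an MDS reply. None of these names come from Client.cc.
struct ReplyWaiter {
  std::mutex lock;               // plays the role of client_lock
  std::condition_variable cond;  // plays the role of caller_cond
  bool got_reply = false;        // set when a reply arrives
  bool session_lost = false;     // set when the session is known to be gone

  // Wait in bounded slices instead of blocking forever.
  // Returns true if a reply arrived; false if the session was lost or
  // max_rounds intervals elapsed without a reply.
  bool wait_for_reply(std::chrono::milliseconds interval, int max_rounds) {
    assert(interval.count() > 0 && max_rounds > 0);
    std::unique_lock<std::mutex> l(lock);
    for (int i = 0; i < max_rounds; ++i) {
      // wait_for returns the predicate's value when it wakes or times out.
      if (cond.wait_for(l, interval,
                        [this] { return got_reply || session_lost; }))
        return got_reply;
      // Timed out: a real client could re-check its session state here.
    }
    return false;
  }

  // Called by whoever receives the reply.
  void deliver_reply() {
    {
      std::lock_guard<std::mutex> l(lock);
      got_reply = true;
    }
    cond.notify_all();
  }
};
```

The point of the timeout is only that the caller regains control
periodically; what it should do at that point (retry, re-check the
session, give up) is a separate decision.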

By the way, would it be correct for the client to realize that it is
blacklisted here [4]? When I checked, it did not: the client's copy of
the blacklist didn't contain the client's own address.

Also, I think it would be better if an evicted/blacklisted client that
makes a request (as at [2] in this bug's case) received some sort of
reply from the MDS conveying that it can no longer access CephFS. I
don't know whether that would be appropriate, but it would keep the
client from waiting forever. It might be risky, though, since the client
could then flood the MDS with requests. In that case, maybe the MDS
should send such a reply to a given client only once.
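
A rough sketch of the "reply only once" idea, purely as an illustration.
This is not existing MDS code; the class EvictedClientNotifier, the
Verdict enum and the use of a plain integer client id are all
assumptions made for the example:

```cpp
#include <cstdint>
#include <set>

// Hypothetical outcome of an incoming client request.
enum class Verdict { Process, RejectWithNotice, DropSilently };

// Hypothetical bookkeeping: remember which evicted clients have already
// been told that they can no longer access the file system.
class EvictedClientNotifier {
  std::set<uint64_t> evicted_;   // ids of clients whose session was evicted
  std::set<uint64_t> notified_;  // evicted clients that already got a notice

public:
  void evict(uint64_t client_id) { evicted_.insert(client_id); }

  Verdict on_request(uint64_t client_id) {
    if (evicted_.count(client_id) == 0)
      return Verdict::Process;
    // insert().second is true only the first time this id is inserted,
    // so the notice is sent at most once per evicted client.
    if (notified_.insert(client_id).second)
      return Verdict::RejectWithNotice;
    return Verdict::DropSilently;
  }
};
```

With this, a client that keeps retrying after eviction costs one set
lookup per request rather than one reply per request.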

[1] https://github.com/ceph/ceph/blob/master/src/client/Client.cc#L1689
[2] https://github.com/ceph/ceph/blob/master/src/client/Client.cc#L1719
[3] https://github.com/ceph/ceph/pull/21065
[4] https://github.com/ceph/ceph/blob/master/src/client/Client.cc#L1658
[5] https://github.com/ceph/ceph/blob/master/src/client/Client.cc#L2410