Re: bug #10915 client: hangs on umount if it had an MDS session evicted

John Spray <jspray@xxxxxxxxxx> · Tue, 27 Mar 2018 14:43:43 +0100

On Tue, Mar 27, 2018 at 2:24 PM, Rishabh Dave <ridave@xxxxxxxxxx> wrote:
> On 22 March 2018 at 17:40, John Spray <jspray@xxxxxxxxxx> wrote:
>> The client only gets osdmap updates when it tries to communicate with
>> an OSD, and the OSD tells it that its current map epoch is too old.
>>
>> In the case that the client isn't doing any data operations (i.e. no
>> osd ops), then the client doesn't find out that it's blacklisted.  But
>> that's okay, because the client's awareness of its own
>> blacklisted-ness should only be needed in the case that there is some
>> dirty data that needs to be thrown away in the special if(blacklisted)
>> paths.
>>
>> So if it's not hanging on any OSD operations (those operations would
>> have resulted in an updated osdmap), the question is what is it
>> hanging on?  Is it trying to open a new session with the MDS?
>
> Looks like client still "thinks" that it has a session open (since
> condition at [1] was false when I checked it myself) and then it waits
> for a reply [2]. This is exactly where it hangs. I have written a fix
> and raised a PR for it [3]. Basically, it replaces
> caller_cond.Wait(client_lock) by caller_cond.WaitInterval(client_lock,
> utime_t(10, 0)).

So that patch is now auto-unmounting a client if any request takes
longer than 10 seconds, which is not quite right. A request can take
that long for other reasons, such as an ongoing MDS failover, or just
a very slow/busy MDS, and we wouldn't want to start aborting requests
because of that.

Instead, you could do a check in Client::tick that looks at
mds_requests.begin() (doesn't matter which request we check), and if
it has been stuck for a long time then call into
objecter->get_latest_version to ensure we have the latest OSDMap (and
blacklist) even if there's no data IO going on.
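
Roughly what I have in mind, as a standalone sketch rather than real
Client code. Everything here is a hypothetical stand-in: ClientSketch,
PendingRequest, the 30 second threshold, and the fetch_latest_osdmap
callback (which stands where the objecter->get_latest_version call
would go; I'm not reproducing its real signature). The point is that
the tick only asks for a newer map and never aborts the request.

```cpp
#include <chrono>
#include <cstdint>
#include <functional>
#include <map>

// Hypothetical stand-ins for illustration; these are not Ceph's real types.
using Clock = std::chrono::steady_clock;

struct PendingRequest {
  Clock::time_point sent;  // when the request was sent to the MDS
};

struct ClientSketch {
  std::map<std::uint64_t, PendingRequest> mds_requests;  // keyed by tid
  std::chrono::seconds stuck_after{30};       // assumed threshold, not a Ceph option
  std::function<void()> fetch_latest_osdmap;  // stand-in for the objecter call
  bool fetch_in_flight = false;               // avoid issuing a fetch on every tick

  // Called periodically, in the spirit of Client::tick.
  // Returns true if a map refresh was requested on this tick.
  bool tick(Clock::time_point now) {
    if (mds_requests.empty() || fetch_in_flight)
      return false;
    // Any request will do, so look at the first one.
    const PendingRequest& req = mds_requests.begin()->second;
    if (now - req.sent < stuck_after)
      return false;  // slow, but not yet suspicious
    fetch_in_flight = true;
    fetch_latest_osdmap();  // the request itself is left waiting
    return true;
  }

  // Called once the map fetch completes; the blacklist check would go here.
  void on_osdmap_fetched() { fetch_in_flight = false; }
};
```

With something like this, a slow MDS or a failover costs at most one
extra map fetch per interval, and the existing blacklist handling can
run once the new map arrives.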

I don't think the client_unmount_on_blacklist behaviour is going to be
a good idea in practice -- if someone has a workload writing files out
to a mounted /mnt/cephfs, and it gets auto-unmounted, they'll start
writing data out to their root filesystem instead!  We need to leave
the actual unmounting to the administrator so that they can stop
whatever workloads were using the mount point as well.

John

>
> By the way, would it be correct for client to realize that it is
> blacklisted here [4]? When I checked, it wasn't so -- the (copy of)
> blacklist didn't have the client address.
>
> Also, I think it would be better if an evicted/blacklisted client
> receives some sort of reply on making a request (like here [2] in this
> bug's case) from MDS that would convey that it can't access CephFS
> anymore. I don't know whether this would be appropriate, but it would
> keep the client from waiting indefinitely. Though, this might be risky
> considering that client can, then, flood the MDS with requests. In
> that case, maybe MDS should send a reply to client only once to avoid
> that.

>
> [1] https://github.com/ceph/ceph/blob/master/src/client/Client.cc#L1689
> [2] https://github.com/ceph/ceph/blob/master/src/client/Client.cc#L1719
> [3] https://github.com/ceph/ceph/pull/21065
> [4] https://github.com/ceph/ceph/blob/master/src/client/Client.cc#L1658
> [5] https://github.com/ceph/ceph/blob/master/src/client/Client.cc#L2410