On 27 March 2018 at 19:13, John Spray <jspray@xxxxxxxxxx> wrote:
>
> So that patch is now auto-unmounting a client if any request takes
> longer than 10 seconds, which is not quite right. A request can take
> that long for other reasons, such as an ongoing MDS failover, or just
> a very slow/busy MDS, and we wouldn't want to start aborting requests
> because of that.

I was going to ask on the PR page about the very situation you mention here.

> Instead, you could do a check in Client::tick that looks at
> mds_requests.begin() (doesn't matter which request we check), and if
> it has been stuck for a long time then call into
> objecter->get_latest_version to ensure we have the latest OSDMap (and
> blacklist) even if there's no data IO going on.

Ok. I'll make the changes accordingly.

> I don't think the client_unmount_on_blacklist behaviour is going to be
> a good idea in practice -- if someone has a workload writing files out
> to a mounted /mnt/cephfs, and it gets auto-unmounted, they'll start
> writing data out to their root filesystem instead! We need to leave
> the actual unmounting to the administrator so that they can stop
> whatever workloads were using the mount point as well.

So, would it be preferable to just get rid of it, or to set it to false by default?
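
For reference, the Client::tick check suggested in the quoted text could look roughly like the sketch below. It is a self-contained illustration, not the actual Ceph code: ClientSketch, Request, stuck_threshold and fetch_latest_osdmap are hypothetical stand-ins (the last one for the call into objecter->get_latest_version), and the threshold value is an assumption.

```cpp
#include <chrono>
#include <cstdint>
#include <functional>
#include <map>

using Clock = std::chrono::steady_clock;

// Hypothetical stand-in for an in-flight MDS request.
struct Request {
    Clock::time_point sent;  // when the request was sent to the MDS
};

// Hypothetical stand-in for the client; NOT Ceph's real Client class.
struct ClientSketch {
    std::map<uint64_t, Request> mds_requests;   // in-flight requests, keyed by id
    std::chrono::seconds stuck_threshold{10};   // assumed value, for illustration only
    std::function<void()> fetch_latest_osdmap;  // stands in for objecter->get_latest_version

    // Called periodically. Returns true if an OSDMap refresh was requested.
    bool tick(Clock::time_point now) {
        if (mds_requests.empty())
            return false;                       // nothing in flight, nothing to check

        // Any request will do; take the first one in the map.
        const Request& r = mds_requests.begin()->second;
        if (now - r.sent < stuck_threshold)
            return false;                       // not stuck long enough yet

        // The request looks stuck: ask for the latest OSDMap (which carries
        // the blacklist) even though no data IO is going on. Nothing is
        // aborted or unmounted here.
        if (fetch_latest_osdmap)
            fetch_latest_osdmap();
        return true;
    }
};
```

The point of the sketch is that a slow request only triggers a map refresh. Whether the client is actually blacklisted is decided from the refreshed map, so an MDS failover or a busy MDS does not cause requests to be aborted.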