On 2020/5/27 20:19, Jeff Layton wrote:
On Wed, 2020-05-27 at 02:57 -0400, xiubli@xxxxxxxxxx wrote:
From: Xiubo Li <xiubli@xxxxxxxxxx>
It make no sense to check the caps when reconnecting to mds. And
for the async dirop caps, they will be put by its _cb() function,
so when releasing the requests, it will make no sense too.
URL: https://tracker.ceph.com/issues/45635
Signed-off-by: Xiubo Li <xiubli@xxxxxxxxxx>
---
Changed in V2:
- do not check the caps when reconnecting to mds
- switch ceph_async_check_caps() to ceph_async_put_cap_refs()
Changed in V3:
- fix putting the cap refs leak
Changed in V4:
- drop ceph_async_check_caps() stuff.
Sigh. I guess this will fix the original issue, but it's just more
evidence that the locking in this code are an absolute shitshow,
particularly when it comes to the session mutex.
Rather than working around it like this, we ought to be coming up with
ways to reduce the need for the session mutex altogether, particularly
in the cap handling code. Doing that means that we need to identify what
the session mutex actually protects of course, and so far I've not been
very successful in determining that.
Mostly, it looks like it's used to provide high-level serialization much
like the client mutex in libcephfs, but it's per-session, and ends up
having to be inverted wrt the mdsc mutex all over the place.
Maybe instead of this, we ought to look at getting rid of the session
mutex altogether -- fold the existing usage of it into the mdsc->mutex.
We'd end up serializing things even more that way, but the session mutex
is already so coarse-grained that it'd probably not affect performance
that much. It would certainly make the locking a _lot_ simpler.
Yeah, this sound good, but it will be a great amount of work IMO.
[...]
-void ceph_mdsc_release_dir_caps(struct ceph_mds_request *req)
+void ceph_mdsc_release_dir_caps(struct ceph_mds_request *req,
+ bool skip_checking_caps)
Can we add a ceph_mdsc_release_dir_caps_no_check() instead of a boolean
argument? That at least better communicates what the function does to
someone reading the code.
Those can just be wrappers around the function that does take a boolean
if need be (similar to ceph_put_cap_refs[_no_check_caps]).
Yeah, sure. Will fix it.
Thanks
BRs
Xiubo