On 4/18/22 6:25 PM, Jeff Layton wrote:
On Thu, 2022-04-14 at 13:45 +0800, Xiubo Li wrote:
Before waiting for a request's safe reply, we send an mdlog flush
request to the relevant MDS. This also flushes the mdlog for all
the other unsafe requests in the same session, so we can record the
last session and skip flushing the mdlog again in the next loop
iteration. There are still cases where the mdlog flush request may
be sent twice or more, but that should be infrequent.
URL: https://tracker.ceph.com/issues/55284
Signed-off-by: Xiubo Li <xiubli@xxxxxxxxxx>
---
V2:
- Fixed possible NULL pointer dereference for the req->r_session
fs/ceph/mds_client.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index 0da85c9ce73a..4aaa7b14136e 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -5098,6 +5098,7 @@ void ceph_mdsc_pre_umount(struct ceph_mds_client *mdsc)
 static void wait_unsafe_requests(struct ceph_mds_client *mdsc, u64 want_tid)
 {
 	struct ceph_mds_request *req = NULL, *nextreq;
+	struct ceph_mds_session *last_session = NULL, *s;
 	struct rb_node *n;
 	mutex_lock(&mdsc->mutex);
@@ -5117,6 +5118,15 @@ static void wait_unsafe_requests(struct ceph_mds_client *mdsc, u64 want_tid)
 			ceph_mdsc_get_request(req);
 			if (nextreq)
 				ceph_mdsc_get_request(nextreq);
+
+			/* send flush mdlog request to MDS */
+			s = req->r_session;
+			if (s && last_session != s) {
+				send_flush_mdlog(s);
+				ceph_put_mds_session(last_session);
+				last_session = ceph_get_mds_session(s);
+			}
+
 			mutex_unlock(&mdsc->mutex);
 			dout("wait_unsafe_requests wait on %llu (want %llu)\n",
 			     req->r_tid, want_tid);
@@ -5135,6 +5145,7 @@ static void wait_unsafe_requests(struct ceph_mds_client *mdsc, u64 want_tid)
 		req = nextreq;
 	}
 	mutex_unlock(&mdsc->mutex);
+	ceph_put_mds_session(last_session);
 	dout("wait_unsafe_requests done\n");
 }
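
For readers following the thread, the "remember the last session" logic
in the hunk above can be modelled outside the kernel. This is only a
minimal userspace sketch: struct session, struct request, flush_mdlog()
and flush_for_unsafe_requests() are hypothetical stand-ins, not the
ceph types or helpers, and the session refcounting done with
ceph_get_mds_session()/ceph_put_mds_session() is left out. It also
shows why a flush can still be sent more than once per session: only
the immediately preceding session is remembered.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real ceph types. */
struct session {
	int mds;      /* MDS rank this session talks to */
	int flushes;  /* number of mdlog flush requests sent on it */
};

struct request {
	unsigned long long tid;
	struct session *session;  /* may be NULL, as req->r_session can be */
};

/* Stand-in for send_flush_mdlog(): it only counts the call. */
static void flush_mdlog(struct session *s)
{
	s->flushes++;
}

/*
 * Walk the requests in tid order, up to want_tid, and flush a session's
 * mdlog only when the session differs from the one flushed most recently.
 * Returns the total number of flush requests sent.
 */
static int flush_for_unsafe_requests(struct request *reqs, size_t n,
				     unsigned long long want_tid)
{
	struct session *last_session = NULL;
	int sent = 0;
	size_t i;

	assert(reqs != NULL || n == 0);

	for (i = 0; i < n && reqs[i].tid <= want_tid; i++) {
		struct session *s = reqs[i].session;

		/* Skip requests without a session and repeats of the last one. */
		if (s && s != last_session) {
			flush_mdlog(s);
			last_session = s;
			sent++;
		}
	}
	return sent;
}
```

When requests for the same session are adjacent in tid order, each
session is flushed once. When sessions alternate (A, B, A), session A
is flushed twice, which matches the "twice or more" case mentioned in
the commit message.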
Looks reasonable. My only minor nit is that "wait_unsafe_requests" is
not really descriptive of this function anymore since you're not just
waiting on requests anymore, but also sending mdlog flush requests.
The sync handling in this code is a bit of a mess too. We have
unsafe_request_wait which is called from the fsync codepath, and then we
also have wait_unsafe_requests which is called from ceph_sync_fs. I
suspect they do enough of the same things that those could be combined.
I tried, and it was hard to combine them IMO.
fsync() first iterates "ci->i_unsafe_iops" and
"ci->i_unsafe_dirops" to collect all the possible sessions, and then
sends flush mdlog requests to all of them.
ceph_sync_fs(), by contrast, needs to iterate the global
"mdsc->request_tree" instead.
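
To illustrate the difference, the fsync-side approach described here
(collect the distinct sessions first, then flush each once) might look
roughly like the userspace sketch below. Everything in it is an
assumption made for illustration: struct session, struct request,
MAX_MDS, collect_sessions() and flush_inode_sessions() are hypothetical
names, the two arrays stand in for the per-inode unsafe lists, and
locking and refcounting are omitted.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_MDS 8  /* hypothetical upper bound on MDS ranks */

/* Hypothetical stand-ins; not the real ceph types. */
struct session {
	int mds;      /* MDS rank, used as the slot index below */
	int flushes;  /* number of mdlog flush requests sent on it */
};

struct request {
	struct session *session;  /* may be NULL */
};

/* Record each distinct session seen in reqs, indexed by its MDS rank. */
static void collect_sessions(const struct request *reqs, size_t n,
			     struct session *seen[MAX_MDS])
{
	size_t i;

	for (i = 0; i < n; i++) {
		struct session *s = reqs[i].session;

		if (!s)
			continue;
		assert(s->mds >= 0 && s->mds < MAX_MDS);
		seen[s->mds] = s;
	}
}

/*
 * Gather sessions from both per-inode lists, then flush each distinct
 * session exactly once. Returns the number of flush requests sent.
 */
static int flush_inode_sessions(const struct request *iops, size_t n_iops,
				const struct request *dirops, size_t n_dirops)
{
	struct session *seen[MAX_MDS] = { NULL };
	int sent = 0;
	int i;

	collect_sessions(iops, n_iops, seen);
	collect_sessions(dirops, n_dirops, seen);

	for (i = 0; i < MAX_MDS; i++) {
		if (seen[i]) {
			seen[i]->flushes++;  /* stand-in for send_flush_mdlog() */
			sent++;
		}
	}
	return sent;
}
```

Because the sessions are deduplicated up front, this shape never sends
a duplicate flush, whereas a single pass over a global tid-ordered tree
only remembers the previous session. That difference in iteration is
part of why the two paths do not merge cleanly.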
-- Xiubo
So, I'll give my ACK on this, but wouldn't mind seeing some other
cleanup in this area.
Acked-by: Jeff Layton <jlayton@xxxxxxxxxx>