On Wed, Oct 2, 2024 at 10:08 PM Patrick Donnelly <batrick@xxxxxxxxxxxx> wrote: > > From: Patrick Donnelly <pdonnell@xxxxxxxxxx> > > Log recovered from a user's cluster: > > <7>[ 5413.970692] ceph: get_cap_refs 00000000958c114b ret 1 got Fr > <7>[ 5413.970695] ceph: start_read 00000000958c114b, no cache cap Hi Patrick, Noting that start_read() was removed in kernel 5.13 in commit 49870056005c ("ceph: convert ceph_readpages to ceph_readahead"). > ... > <7>[ 5473.934609] ceph: my wanted = Fr, used = Fr, dirty - > <7>[ 5473.934616] ceph: revocation: pAsLsXsFr -> pAsLsXs (revoking Fr) > <7>[ 5473.934632] ceph: __ceph_caps_issued 00000000958c114b cap 00000000f7784259 issued pAsLsXs > <7>[ 5473.934638] ceph: check_caps 10000000e68.fffffffffffffffe file_want - used Fr dirty - flushing - issued pAsLsXs revoking Fr retain pAsLsXsFsr AUTHONLY NOINVAL FLUSH_FORCE > > The MDS subsequently complains that the kernel client is late releasing caps. > > Closes: https://tracker.ceph.com/issues/67008 > Fixes: a5c9dc4451394b2854493944dcc0ff71af9705a3 ("ceph: Make ceph_init_request() check caps on readahead") I think it's worth going into a bit more detail here because superficially this commit just replaced ret = ceph_try_get_caps(inode, CEPH_CAP_FILE_RD, want, true, &got); if (ret < 0) dout("start_read %p, error getting cap\n", inode); else if (!(got & want)) dout("start_read %p, no cache cap\n", inode); if (ret <= 0) return; in ceph_readahead() with ret = ceph_try_get_caps(inode, CEPH_CAP_FILE_RD, want, true, &got); if (ret < 0) { dout("start_read %p, error getting cap\n", inode); return ret; } if (!(got & want)) { dout("start_read %p, no cache cap\n", inode); return -EACCES; } if (ret == 0) return -EACCES; in ceph_init_request(). Neither called ceph_put_cap_refs() before bailing. It was commit 49870056005c ("ceph: convert ceph_readpages to ceph_readahead") that turned a direct call to ceph_put_cap_refs() in start_read() to one in ceph_readahead_cleanup() (later renamed to ceph_netfs_free_request()). The actual problem is that netfs_alloc_request() just frees rreq if init_request() callout fails and ceph_netfs_free_request() is never called, right? If so, I'd mention that explicitly and possibly also reference commit 2de160417315 ("netfs: Change ->init_request() to return an error code") which introduced that. > Signed-off-by: Patrick Donnelly <pdonnell@xxxxxxxxxx> > Cc: stable@xxxxxxxxxxxxxxx > --- > fs/ceph/addr.c | 5 ++++- > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c > index 53fef258c2bc..702c6a730b70 100644 > --- a/fs/ceph/addr.c > +++ b/fs/ceph/addr.c > @@ -489,8 +489,11 @@ static int ceph_init_request(struct netfs_io_request *rreq, struct file *file) > rreq->io_streams[0].sreq_max_len = fsc->mount_options->rsize; > > out: > - if (ret < 0) > + if (ret < 0) { > + if (got) > + ceph_put_cap_refs(ceph_inode(inode), got); > kfree(priv); > + } > > return ret; > } The change itself looks fine. Great catch! Thanks, Ilya