From: Patrick Donnelly <pdonnell@xxxxxxxxxx> commit ccda9910d8490f4fb067131598e4b2e986faa5a0 upstream. Log recovered from a user's cluster: <7>[ 5413.970692] ceph: get_cap_refs 00000000958c114b ret 1 got Fr <7>[ 5413.970695] ceph: start_read 00000000958c114b, no cache cap ... <7>[ 5473.934609] ceph: my wanted = Fr, used = Fr, dirty - <7>[ 5473.934616] ceph: revocation: pAsLsXsFr -> pAsLsXs (revoking Fr) <7>[ 5473.934632] ceph: __ceph_caps_issued 00000000958c114b cap 00000000f7784259 issued pAsLsXs <7>[ 5473.934638] ceph: check_caps 10000000e68.fffffffffffffffe file_want - used Fr dirty - flushing - issued pAsLsXs revoking Fr retain pAsLsXsFsr AUTHONLY NOINVAL FLUSH_FORCE The MDS subsequently complains that the kernel client is late releasing caps. Approximately, a series of changes to this code by commits 49870056005c ("ceph: convert ceph_readpages to ceph_readahead"), 2de160417315 ("netfs: Change ->init_request() to return an error code") and a5c9dc445139 ("ceph: Make ceph_init_request() check caps on readahead") resulted in subtle resource cleanup to be missed. The main culprit is the change in error handling in 2de160417315 which meant that a failure in init_request() would no longer cause cleanup to be called. That would prevent the ceph_put_cap_refs() call which would cleanup the leaked cap ref. Cc: stable@xxxxxxxxxxxxxxx Fixes: a5c9dc445139 ("ceph: Make ceph_init_request() check caps on readahead") Link: https://tracker.ceph.com/issues/67008 Signed-off-by: Patrick Donnelly <pdonnell@xxxxxxxxxx> Reviewed-by: Ilya Dryomov <idryomov@xxxxxxxxx> Signed-off-by: Ilya Dryomov <idryomov@xxxxxxxxx> Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> --- fs/ceph/addr.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) --- a/fs/ceph/addr.c +++ b/fs/ceph/addr.c @@ -435,8 +435,11 @@ static int ceph_init_request(struct netf rreq->netfs_priv = priv; out: - if (ret < 0) + if (ret < 0) { + if (got) + ceph_put_cap_refs(ceph_inode(inode), got); kfree(priv); + } return ret; } Patches currently in stable-queue which might be from pdonnell@xxxxxxxxxx are queue-6.1/ceph-fix-cap-ref-leak-via-netfs-init_request.patch queue-6.1/ceph-remove-the-incorrect-fw-reference-check-when-di.patch