On 8/18/21 7:18 PM, Ilya Dryomov wrote:
On Wed, Aug 18, 2021 at 3:25 AM <xiubli@xxxxxxxxxx> wrote:
From: Xiubo Li <xiubli@xxxxxxxxxx>
A force umount tries to remove all the session caps. If any capsnap
is on the flushing list, the remove session caps callback will try to
release the capsnap->flush_cap memory to the "ceph_cap_flush_cachep"
slab cache, even though it was allocated from the kmalloc-256 slab
cache.
At the same time, switch to list_del_init(): if the force umount has
already removed the entry from the lists and
handle_cap_flushsnap_ack() arrives afterwards, the second
list_del_init() won't crash the kernel.
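
The double-removal argument above can be sketched in userspace. This is
a minimal stand-in for the kernel's struct list_head, not the kernel
code itself; double_del_is_safe() is a hypothetical helper added only
for illustration. The kernel's list_del() leaves the removed entry's
pointers poisoned, so a second removal would dereference them, whereas
list_del_init() leaves the entry pointing at itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void INIT_LIST_HEAD(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

static bool list_empty(const struct list_head *h)
{
	return h->next == h;
}

/* Unlink the entry, then make it an empty list pointing at itself. */
static void list_del_init(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	INIT_LIST_HEAD(entry);
}

/* Hypothetical check: remove the same entry twice, verify the list. */
static bool double_del_is_safe(void)
{
	struct list_head head, a, b;

	INIT_LIST_HEAD(&head);
	list_add_tail(&a, &head);
	list_add_tail(&b, &head);

	list_del_init(&a);	/* e.g. the force umount path */
	list_del_init(&a);	/* e.g. a late ack: only rewrites a's own pointers */

	return list_empty(&a) &&
	       head.next == &b && head.prev == &b &&
	       b.next == &head && b.prev == &head;
}
```

Because the entry is self-linked after the first call, the second call
only touches the entry itself and leaves the rest of the list intact.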
URL: https://tracker.ceph.com/issues/52283
Signed-off-by: Xiubo Li <xiubli@xxxxxxxxxx>
---
V3:
- rebase to the upstream
fs/ceph/caps.c | 18 ++++++++++++++----
fs/ceph/mds_client.c | 7 ++++---
2 files changed, 18 insertions(+), 7 deletions(-)
diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
index 1b9ca437da92..e239f06babbc 100644
--- a/fs/ceph/caps.c
+++ b/fs/ceph/caps.c
@@ -1712,7 +1712,16 @@ int __ceph_mark_dirty_caps(struct ceph_inode_info *ci, int mask,
struct ceph_cap_flush *ceph_alloc_cap_flush(void)
{
- return kmem_cache_alloc(ceph_cap_flush_cachep, GFP_KERNEL);
+ struct ceph_cap_flush *cf;
+
+ cf = kmem_cache_alloc(ceph_cap_flush_cachep, GFP_KERNEL);
+ /*
+ * caps == 0 always means the capsnap
+ * caps > 0 means dirty caps being flushed
+ * caps == -1 means preallocated, not used yet
+ */
Hi Xiubo,
This comment should be in super.h, on struct ceph_cap_flush
definition.
But more importantly, are you sure that overloading cf->caps this way
is safe? For example, __kick_flushing_caps() tests for cf->caps != 0
and cf->caps == -1 would be interpreted as a cue to call __prep_cap().
Hi Ilya,
Yeah, I think it's safe, because once the cf is added to
ci->i_cap_flush_list in __mark_caps_flushing(), cf->caps is guaranteed
to be set to some dirty caps, which must be > 0, or a BUG_ON() will
trigger.
In remove_session_caps_cb() below, this patch makes the to_remove list
pick up cf not only from ci->i_cap_flush_list but also from
ci->i_prealloc_cap_flush, which has not yet been initialized or added
to ci->i_cap_flush_list.
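
To make the free-path decision concrete, here is a rough userspace
sketch of the idea, assuming the convention from the comment in the
patch (caps == 0 for a flush embedded in a capsnap, -1 for a
preallocated one). All names here (cap_flush, cap_snap,
CAPS_PREALLOC, release_cap_flush(), demo_release()) are hypothetical
and malloc()/free() stand in for the slab cache; it is not the
fs/ceph code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define CAPS_PREALLOC (-1)	/* allocated, not used yet */

struct cap_flush {
	int caps;	/* 0: embedded in a capsnap, > 0: dirty caps */
};

/* The capsnap embeds its flush entry, so it is not a separate allocation. */
struct cap_snap {
	int other_state;
	struct cap_flush cap_flush;
};

static struct cap_flush *alloc_cap_flush(void)
{
	struct cap_flush *cf = malloc(sizeof(*cf));

	if (cf)
		cf->caps = CAPS_PREALLOC;
	return cf;
}

/* Only standalone entries (caps != 0) may go back to the allocator. */
static bool owns_allocation(const struct cap_flush *cf)
{
	return cf->caps != 0;
}

/* Returns 1 if the entry was freed, 0 if it was left alone. */
static int release_cap_flush(struct cap_flush *cf)
{
	if (!owns_allocation(cf))
		return 0;
	free(cf);
	return 1;
}

/* Hypothetical driver: one embedded entry, one preallocated entry. */
static int demo_release(void)
{
	struct cap_snap *capsnap = calloc(1, sizeof(*capsnap));
	struct cap_flush *prealloc = alloc_cap_flush();
	int freed = 0;

	if (!capsnap || !prealloc) {
		free(capsnap);
		free(prealloc);
		return -1;
	}
	freed += release_cap_flush(&capsnap->cap_flush);	/* skipped */
	freed += release_cap_flush(prealloc);			/* freed */
	free(capsnap);	/* the embedded entry goes away with its owner */
	return freed;
}
```

Note that a plain "caps != 0" test treats the -1 sentinel the same as
real dirty caps, which is fine on the free path but is exactly why
other "cf->caps != 0" checks need a second look.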
Thanks
BRs
Thanks,
Ilya
+ cf->caps = -1;
+ return cf;
}
void ceph_free_cap_flush(struct ceph_cap_flush *cf)
@@ -1747,7 +1756,7 @@ static bool __detach_cap_flush_from_mdsc(struct ceph_mds_client *mdsc,
prev->wake = true;
wake = false;
}
- list_del(&cf->g_list);
+ list_del_init(&cf->g_list);
return wake;
}
@@ -1762,7 +1771,7 @@ static bool __detach_cap_flush_from_ci(struct ceph_inode_info *ci,
prev->wake = true;
wake = false;
}
- list_del(&cf->i_list);
+ list_del_init(&cf->i_list);
return wake;
}
@@ -3642,7 +3651,8 @@ static void handle_cap_flush_ack(struct inode *inode, u64 flush_tid,
cf = list_first_entry(&to_remove,
struct ceph_cap_flush, i_list);
list_del(&cf->i_list);
- ceph_free_cap_flush(cf);
+ if (cf->caps)
+ ceph_free_cap_flush(cf);
}
if (wake_ci)
diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index 1e013fb09d73..a44adbd1841b 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -1636,7 +1636,7 @@ static int remove_session_caps_cb(struct inode *inode, struct ceph_cap *cap,
spin_lock(&mdsc->cap_dirty_lock);
list_for_each_entry(cf, &to_remove, i_list)
- list_del(&cf->g_list);
+ list_del_init(&cf->g_list);
if (!list_empty(&ci->i_dirty_item)) {
pr_warn_ratelimited(
@@ -1688,8 +1688,9 @@ static int remove_session_caps_cb(struct inode *inode, struct ceph_cap *cap,
struct ceph_cap_flush *cf;
cf = list_first_entry(&to_remove,
struct ceph_cap_flush, i_list);
- list_del(&cf->i_list);
- ceph_free_cap_flush(cf);
+ list_del_init(&cf->i_list);
+ if (cf->caps)
+ ceph_free_cap_flush(cf);
}
wake_up_all(&ci->i_cap_wq);
--
2.27.0