Re: [PATCH v3] ceph: correctly release memory from capsnap

On Wed, 2021-08-18 at 13:18 +0200, Ilya Dryomov wrote:
> On Wed, Aug 18, 2021 at 3:25 AM <xiubli@xxxxxxxxxx> wrote:
> > 
> > From: Xiubo Li <xiubli@xxxxxxxxxx>
> > 
> > When force umounting, the client will try to remove all the session
> > caps. If any capsnap is still on the flushing list, the remove
> > session caps callback will try to release the capsnap->flush_cap
> > memory back to the "ceph_cap_flush_cachep" slab cache, even though
> > it was actually allocated from the kmalloc-256 slab cache.
> > 
> > At the same time, switch to list_del_init(): in case the force
> > umount has already removed the entry from the lists and
> > handle_cap_flushsnap_ack() comes in afterwards, the second
> > list_del_init() won't crash the kernel.
> > 
> > URL: https://tracker.ceph.com/issues/52283
> > Signed-off-by: Xiubo Li <xiubli@xxxxxxxxxx>
> > ---
> > 
> > V3:
> > - rebased onto upstream
> > 
> > 
> >  fs/ceph/caps.c       | 18 ++++++++++++++----
> >  fs/ceph/mds_client.c |  7 ++++---
> >  2 files changed, 18 insertions(+), 7 deletions(-)
> > 
> > diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
> > index 1b9ca437da92..e239f06babbc 100644
> > --- a/fs/ceph/caps.c
> > +++ b/fs/ceph/caps.c
> > @@ -1712,7 +1712,16 @@ int __ceph_mark_dirty_caps(struct ceph_inode_info *ci, int mask,
> > 
> >  struct ceph_cap_flush *ceph_alloc_cap_flush(void)
> >  {
> > -       return kmem_cache_alloc(ceph_cap_flush_cachep, GFP_KERNEL);
> > +       struct ceph_cap_flush *cf;
> > +
> > +       cf = kmem_cache_alloc(ceph_cap_flush_cachep, GFP_KERNEL);
> > +       /*
> > +        * caps == 0 always means the flush is for a capsnap
> > +        * caps > 0 means dirty caps are being flushed
> > +        * caps == -1 means preallocated, not used yet
> > +        */
> 
> Hi Xiubo,
> 
> This comment should be in super.h, on the struct ceph_cap_flush
> definition.
> 
> But more importantly, are you sure that overloading cf->caps this way
> is safe?  For example, __kick_flushing_caps() tests for cf->caps != 0,
> so cf->caps == -1 would be interpreted as a cue to call __prep_cap().
> 
> Thanks,
> 
>                 Ilya
> 

The cf->caps field should get set to a sane value when it goes onto the
i_cap_flush_list, and I don't see how we'd get into testing against the
->caps field before that point. That said, this mechanism does seem a
bit fragile and subject to later breakage, and the caps code is anything
but clear and easy to follow.
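
For reference, the discriminator Ilya mentions looks roughly like this
(an abridged sketch of the __kick_flushing_caps() loop in fs/ceph/caps.c,
not the verbatim kernel code):

	/* Abridged sketch: __kick_flushing_caps() walks the per-inode
	 * flush list and keys off cf->caps. */
	list_for_each_entry(cf, &ci->i_cap_flush_list, i_list) {
		if (cf->caps) {
			/* nonzero (which would include -1) is treated as
			 * dirty caps being flushed -> __prep_cap() */
		} else {
			/* caps == 0: the entry belongs to a capsnap */
		}
	}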

pahole says that there is a 3-byte hole just after the "wake" field in
ceph_cap_flush on x86_64, and that's probably true on other arches as
well. Rather than overloading the caps field with this info, you could
add a new bool there to indicate whether the cap_flush is embedded or
not.
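
Something like this, roughly (a sketch against the current definition
in fs/ceph/super.h; "is_capsnap" is just a placeholder name):

	struct ceph_cap_flush {
		u64 tid;
		int caps;		/* 0 means capsnap */
		bool wake;		/* wake up flush waiters when finish? */
		bool is_capsnap;	/* fits in the padding after "wake" */
		struct list_head g_list;	/* global */
		struct list_head i_list;	/* per inode */
	};

Then the cf->caps checks this patch adds in handle_cap_flush_ack() and
remove_session_caps_cb() could test that flag instead.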


> > +       cf->caps = -1;
> > +       return cf;
> >  }
> > 
> >  void ceph_free_cap_flush(struct ceph_cap_flush *cf)
> > @@ -1747,7 +1756,7 @@ static bool __detach_cap_flush_from_mdsc(struct ceph_mds_client *mdsc,
> >                 prev->wake = true;
> >                 wake = false;
> >         }
> > -       list_del(&cf->g_list);
> > +       list_del_init(&cf->g_list);
> >         return wake;
> >  }
> > 
> > @@ -1762,7 +1771,7 @@ static bool __detach_cap_flush_from_ci(struct ceph_inode_info *ci,
> >                 prev->wake = true;
> >                 wake = false;
> >         }
> > -       list_del(&cf->i_list);
> > +       list_del_init(&cf->i_list);
> >         return wake;
> >  }
> > 
> > @@ -3642,7 +3651,8 @@ static void handle_cap_flush_ack(struct inode *inode, u64 flush_tid,
> >                 cf = list_first_entry(&to_remove,
> >                                       struct ceph_cap_flush, i_list);
> >                 list_del(&cf->i_list);
> > -               ceph_free_cap_flush(cf);
> > +               if (cf->caps)
> > +                       ceph_free_cap_flush(cf);
> >         }
> > 
> >         if (wake_ci)
> > diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
> > index 1e013fb09d73..a44adbd1841b 100644
> > --- a/fs/ceph/mds_client.c
> > +++ b/fs/ceph/mds_client.c
> > @@ -1636,7 +1636,7 @@ static int remove_session_caps_cb(struct inode *inode, struct ceph_cap *cap,
> >                 spin_lock(&mdsc->cap_dirty_lock);
> > 
> >                 list_for_each_entry(cf, &to_remove, i_list)
> > -                       list_del(&cf->g_list);
> > +                       list_del_init(&cf->g_list);
> > 
> >                 if (!list_empty(&ci->i_dirty_item)) {
> >                         pr_warn_ratelimited(
> > @@ -1688,8 +1688,9 @@ static int remove_session_caps_cb(struct inode *inode, struct ceph_cap *cap,
> >                 struct ceph_cap_flush *cf;
> >                 cf = list_first_entry(&to_remove,
> >                                       struct ceph_cap_flush, i_list);
> > -               list_del(&cf->i_list);
> > -               ceph_free_cap_flush(cf);
> > +               list_del_init(&cf->i_list);
> > +               if (cf->caps)
> > +                       ceph_free_cap_flush(cf);
> >         }
> > 
> >         wake_up_all(&ci->i_cap_wq);
> > --
> > 2.27.0
> > 
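
As an aside on the list_del_init() conversion: the reason a second
removal is safe is that list_del_init() leaves the entry pointing at
itself, while list_del() poisons the pointers. Here is a minimal
userspace sketch that mocks the behavior of the kernel's <linux/list.h>
helpers (simplified poison values; this is not the kernel code itself):

	#include <stdio.h>

	/* Minimal mock of the kernel's list helpers, just enough to show
	 * why a double list_del() crashes while a double list_del_init()
	 * is harmless. */
	struct list_head {
		struct list_head *next, *prev;
	};

	#define LIST_POISON1 ((struct list_head *)0x100)
	#define LIST_POISON2 ((struct list_head *)0x122)

	static void INIT_LIST_HEAD(struct list_head *h)
	{
		h->next = h;
		h->prev = h;
	}

	static void list_add(struct list_head *new, struct list_head *head)
	{
		new->next = head->next;
		new->prev = head;
		head->next->prev = new;
		head->next = new;
	}

	static void __list_del_entry(struct list_head *entry)
	{
		/* faults if entry->next was poisoned by an earlier del */
		entry->next->prev = entry->prev;
		entry->prev->next = entry->next;
	}

	static void list_del(struct list_head *entry)
	{
		__list_del_entry(entry);
		entry->next = LIST_POISON1;
		entry->prev = LIST_POISON2;
	}

	static void list_del_init(struct list_head *entry)
	{
		__list_del_entry(entry);
		INIT_LIST_HEAD(entry);	/* entry points at itself again */
	}

	int main(void)
	{
		struct list_head head, item;

		INIT_LIST_HEAD(&head);
		list_add(&item, &head);
		list_del_init(&item);	/* e.g. remove_session_caps_cb() */
		list_del_init(&item);	/* e.g. a late flushsnap ack: no-op */
		printf("double list_del_init: ok\n");

		list_add(&item, &head);
		list_del(&item);	/* poisons item.next/item.prev */
		list_del(&item);	/* dereferences LIST_POISON1: crash */
		printf("never reached\n");
		return 0;
	}

The first pair of removals completes quietly; the second pair crashes
on the poisoned pointer, which is essentially the race the changelog
describes between a forced umount and handle_cap_flushsnap_ack().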

-- 
Jeff Layton <jlayton@xxxxxxxxxx>



