Re: [RFC PATCH] ceph: guard against __ceph_remove_cap races

On Wed, 2020-11-11 at 11:08 +0000, Luis Henriques wrote:
> Jeff Layton <jlayton@xxxxxxxxxx> writes:
> 
> > On Sat, 2019-12-14 at 10:46 +0800, Yan, Zheng wrote:
> > > On Fri, Dec 13, 2019 at 1:32 AM Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> > > > I believe it's possible that we could end up with racing calls to
> > > > __ceph_remove_cap for the same cap. If that happens, the cap->ci
> > > > pointer will be zeroed out and we can hit a NULL pointer dereference.
> > > > 
> > > > Once we acquire the s_cap_lock, check for the ci pointer being NULL,
> > > > and just return without doing anything if it is.
> > > > 
> > > > URL: https://tracker.ceph.com/issues/43272
> > > > Signed-off-by: Jeff Layton <jlayton@xxxxxxxxxx>
> > > > ---
> > > >  fs/ceph/caps.c | 21 ++++++++++++++++-----
> > > >  1 file changed, 16 insertions(+), 5 deletions(-)
> > > > 
> > > > This is the only scenario that made sense to me in light of Ilya's
> > > > analysis on the tracker above. I could be off here though -- the locking
> > > > around this code is horrifically complex, and I could be missing what
> > > > should guard against this scenario.
> > > > 
> > > 
> > > I think the simpler fix is, in trim_caps_cb, to check that cap->ci is
> > > non-NULL before calling __ceph_remove_cap(). This should work because
> > > __ceph_remove_cap() is always called inside i_ceph_lock.
> > > 
> > 
> > Is that sufficient though? The stack trace in the bug shows it being
> > called by ceph_trim_caps, but I think we could hit the same problem with
> > other __ceph_remove_cap callers, if they happen to race in at the right
> > time.
> 
> Sorry for resurrecting this old thread, but we just got a report with this
> issue on a kernel that includes commit d6e47819721a ("ceph: hold
> i_ceph_lock when removing caps for freeing inode").
> 
> Looking at the code, I believe Zheng's suggestion should work as I don't
> see any __ceph_remove_cap callers that don't hold the i_ceph_lock.  So,
> would something like the diff below be acceptable?
> 
> Cheers,

I'm still not convinced that's the correct fix.

Why would trim_caps_cb be subject to this race when other
__ceph_remove_cap callers are not? Maybe the right fix is to test for a
NULL cap->ci in __ceph_remove_cap and just return early if it is?
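
To make the idea concrete, here is a rough userspace sketch of the guard I
have in mind. It is not the kernel code: toy_inode, toy_cap and
toy_remove_cap are made-up stand-ins, the struct layouts are hypothetical,
and locking is left out entirely. In the real function the check would have
to be made under whichever lock protects cap->ci.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real layout. */
struct toy_inode {
	int nr_caps;		/* number of caps attached to this inode */
};

struct toy_cap {
	struct toy_inode *ci;	/* cleared to NULL once the cap is removed */
};

/*
 * Model of the early-return guard. Returns true if this call performed
 * the removal, false if cap->ci had already been cleared by an earlier
 * caller. This single-threaded model omits the locking that the kernel
 * would need around the check.
 */
static bool toy_remove_cap(struct toy_cap *cap)
{
	struct toy_inode *ci;

	assert(cap != NULL);

	ci = cap->ci;
	if (ci == NULL)
		return false;	/* already removed: nothing left to do */

	ci->nr_caps--;		/* detach the cap from its inode */
	cap->ci = NULL;		/* mark the cap as removed */
	return true;
}
```

With this shape, a second call on the same cap becomes a no-op instead of
dereferencing a NULL ci, and it does so for every caller rather than only
for trim_caps_cb.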

-- 
Jeff Layton <jlayton@xxxxxxxxxx>



