Re: [PATCH] ceph: ensure we flush delayed caps when unmounting

On Thu, Jun 03, 2021 at 12:57:22PM -0400, Jeff Layton wrote:
> On Thu, 2021-06-03 at 09:48 -0400, Jeff Layton wrote:
> > I've seen some warnings when testing recently that indicate that there
> > are caps still delayed on the delayed list even after we've started
> > unmounting.
> > 
> > When checking delayed caps, process the whole list if we're unmounting,
> > and check for delayed caps after setting the stopping var and flushing
> > dirty caps.
> > 
> > Signed-off-by: Jeff Layton <jlayton@xxxxxxxxxx>
> > ---
> >  fs/ceph/caps.c       | 3 ++-
> >  fs/ceph/mds_client.c | 1 +
> >  2 files changed, 3 insertions(+), 1 deletion(-)
> > 
> > diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
> > index a5e93b185515..68b4c6dfe4db 100644
> > --- a/fs/ceph/caps.c
> > +++ b/fs/ceph/caps.c
> > @@ -4236,7 +4236,8 @@ void ceph_check_delayed_caps(struct ceph_mds_client *mdsc)
> >  		ci = list_first_entry(&mdsc->cap_delay_list,
> >  				      struct ceph_inode_info,
> >  				      i_cap_delay_list);
> > -		if ((ci->i_ceph_flags & CEPH_I_FLUSH) == 0 &&
> > +		if (!mdsc->stopping &&
> > +		    (ci->i_ceph_flags & CEPH_I_FLUSH) == 0 &&
> >  		    time_before(jiffies, ci->i_hold_caps_max))
> >  			break;
> >  		list_del_init(&ci->i_cap_delay_list);
> > diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
> > index e5af591d3bd4..916af5497829 100644
> > --- a/fs/ceph/mds_client.c
> > +++ b/fs/ceph/mds_client.c
> > @@ -4691,6 +4691,7 @@ void ceph_mdsc_pre_umount(struct ceph_mds_client *mdsc)
> >  
> >  	lock_unlock_sessions(mdsc);
> >  	ceph_flush_dirty_caps(mdsc);
> > +	ceph_check_delayed_caps(mdsc);
> >  	wait_requests(mdsc);
> >  
> >  	/*
> 
> I'm going to self-NAK this patch for now. Initially this looked good in
> testing, but I think it's just papering over the real problem, which is
> that ceph_async_iput can queue a job to a workqueue after the point
> where we've flushed that workqueue on umount.

Ah, yeah.  I think I saw this a few times with generic/014 (and I believe
we chatted about it on IRC).  I've been trying on and off to figure out
how to fix it, but it's really tricky.

Cheers,
--
Luís


> I think the right approach is to look at how to ensure that calling iput
> doesn't end up taking these coarse-grained locks so we don't need to
> queue it in so many codepaths.
> -- 
> Jeff Layton <jlayton@xxxxxxxxxx>
> 
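
For readers following the thread, the loop-exit change in the quoted
caps.c hunk can be modelled in userspace. This is only a rough sketch
under stated assumptions: struct delayed_entry and count_processed() are
hypothetical names, not kernel code, and a plain integer comparison
stands in for time_before(), so jiffies wraparound is ignored. It models
only the break condition, not the dequeue and cap check that follow it
in ceph_check_delayed_caps().

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for struct ceph_inode_info, reduced to the two
 * pieces of state that the break condition in the patch reads.
 */
struct delayed_entry {
	bool flush_requested;     /* models (i_ceph_flags & CEPH_I_FLUSH) != 0 */
	unsigned long hold_until; /* models i_hold_caps_max */
};

/*
 * Count how many entries, starting from the head of the list, the loop
 * would process before hitting the break.
 *
 * Without "stopping", the walk ends at the first entry that has no
 * flush requested and whose hold time has not yet expired. With
 * "stopping" set (the patched behaviour), the whole list is processed.
 */
static size_t count_processed(const struct delayed_entry *list, size_t n,
			      unsigned long now, bool stopping)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (!stopping &&
		    !list[i].flush_requested &&
		    now < list[i].hold_until)
			break;
	}
	return i;
}
```

With a four-entry list where the third entry is still inside its hold
window, the unpatched condition stops after two entries, while
stopping == true processes all four. As the self-NAK points out, this
only drains what is already on the list; it does nothing about work
queued after the workqueue has been flushed.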