On Tue, Jul 9, 2019 at 6:13 PM Jeff Layton <jlayton@xxxxxxxxxx> wrote: > > On Tue, 2019-07-09 at 07:52 +0800, Yan, Zheng wrote: > > On Tue, Jul 9, 2019 at 3:23 AM Jeff Layton <jlayton@xxxxxxxxxx> wrote: > > > I've been working on a patchset to add inline write support to kcephfs, > > > and have run across a potential race in fsync. I could use someone to > > > sanity check me though since I don't have a great grasp of the MDS > > > session handling: > > > > > > ceph_fsync() calls try_flush_caps() to flush the dirty metadata back to > > > the MDS when Fw caps are flushed back. try_flush_caps does this, > > > however: > > > > > > if (cap->session->s_state < CEPH_MDS_SESSION_OPEN) { > > > spin_unlock(&ci->i_ceph_lock); > > > goto out; > > > } > > > > > > > enum { > > CEPH_MDS_SESSION_NEW = 1, > > CEPH_MDS_SESSION_OPENING = 2, > > CEPH_MDS_SESSION_OPEN = 3, > > CEPH_MDS_SESSION_HUNG = 4, > > CEPH_MDS_SESSION_RESTARTING = 5, > > CEPH_MDS_SESSION_RECONNECTING = 6, > > CEPH_MDS_SESSION_CLOSING = 7, > > CEPH_MDS_SESSION_REJECTED = 8, > > }; > > > > the value of reconnect state is larger than 2 > > > > Right, I get that. The big question is whether you can ever move from a > higher state to something less than CEPH_MDS_SESSION_OPEN. > I guess it does not happen because closing session happens only when umounting. But there is a corner case in handle_cap_export(). It set inode's auth cap to a place holder cap. the placeholder cap's session can be in opening state. > __do_request can do this: > > if (session->s_state == CEPH_MDS_SESSION_NEW || > session->s_state == CEPH_MDS_SESSION_CLOSING) > __open_session(mdsc, session); > > ...and __open_session does this: > > session->s_state = CEPH_MDS_SESSION_OPENING; > > ...so it sort of looks like you can go from CLOSING(7) to OPENING(2). > That said, I don't have a great feel for the session state transitions, > and don't know whether this is a real possibility. > > > > > > ...at that point, try_flush_caps will return 0, and set *ptid to 0 on > > > the way out. ceph_fsync won't see that Fw is still dirty at that point > > > and won't wait, returning without flushing metadata. > > > > > > Am I missing something that prevents this? I can open a tracker bug for > > > this if it is a problem, but I wanted to be sure it was a bug before I > > > did so. > > > > > > Thanks, > > > -- > > > Jeff Layton <jlayton@xxxxxxxxxx> > > > > > -- > Jeff Layton <jlayton@xxxxxxxxxx> >