On Tue, Jul 9, 2019 at 3:23 AM Jeff Layton <jlayton@xxxxxxxxxx> wrote:
>
> I've been working on a patchset to add inline write support to kcephfs,
> and have run across a potential race in fsync. I could use someone to
> sanity check me though since I don't have a great grasp of the MDS
> session handling:
>
> ceph_fsync() calls try_flush_caps() to flush the dirty metadata back to
> the MDS when Fw caps are flushed back. try_flush_caps does this,
> however:
>
>         if (cap->session->s_state < CEPH_MDS_SESSION_OPEN) {
>                 spin_unlock(&ci->i_ceph_lock);
>                 goto out;
>         }
>

enum {
        CEPH_MDS_SESSION_NEW = 1,
        CEPH_MDS_SESSION_OPENING = 2,
        CEPH_MDS_SESSION_OPEN = 3,
        CEPH_MDS_SESSION_HUNG = 4,
        CEPH_MDS_SESSION_RESTARTING = 5,
        CEPH_MDS_SESSION_RECONNECTING = 6,
        CEPH_MDS_SESSION_CLOSING = 7,
        CEPH_MDS_SESSION_REJECTED = 8,
};

The value of the reconnect state (CEPH_MDS_SESSION_RECONNECTING = 6) is
larger than 2, so this check is not triggered while a session is
reconnecting.

> ...at that point, try_flush_caps will return 0, and set *ptid to 0 on
> the way out. ceph_fsync won't see that Fw is still dirty at that point
> and won't wait, returning without flushing metadata.
>
> Am I missing something that prevents this? I can open a tracker bug for
> this if it is a problem, but I wanted to be sure it was a bug before I
> did so.
>
> Thanks,
> --
> Jeff Layton <jlayton@xxxxxxxxxx>
>
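
To make the point above concrete, here is a minimal standalone sketch (not
the kcephfs code itself; the loop and printout are just for illustration)
that evaluates the try_flush_caps() early-return condition against each of
the session state values quoted above:

    /* Standalone illustration only -- not kernel code. It reuses the
     * session state values from above and checks which of them would
     * take the early return in try_flush_caps(). */
    #include <stdio.h>

    enum {
            CEPH_MDS_SESSION_NEW = 1,
            CEPH_MDS_SESSION_OPENING = 2,
            CEPH_MDS_SESSION_OPEN = 3,
            CEPH_MDS_SESSION_HUNG = 4,
            CEPH_MDS_SESSION_RESTARTING = 5,
            CEPH_MDS_SESSION_RECONNECTING = 6,
            CEPH_MDS_SESSION_CLOSING = 7,
            CEPH_MDS_SESSION_REJECTED = 8,
    };

    int main(void)
    {
            int s_state;

            for (s_state = CEPH_MDS_SESSION_NEW;
                 s_state <= CEPH_MDS_SESSION_REJECTED; s_state++)
                    printf("s_state=%d: early return %s\n", s_state,
                           s_state < CEPH_MDS_SESSION_OPEN ?
                           "taken" : "not taken");
            return 0;
    }

Only NEW (1) and OPENING (2) take the early return; RESTARTING (5) and
RECONNECTING (6) fall through, so try_flush_caps() does not bail out on
that branch while a session is reconnecting.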