On Tue, Jul 9, 2019 at 3:23 AM Jeff Layton <jlayton@xxxxxxxxxx> wrote:
>
> I've been working on a patchset to add inline write support to kcephfs,
> and have run across a potential race in fsync. I could use someone to
> sanity check me though since I don't have a great grasp of the MDS
> session handling:
>
> ceph_fsync() calls try_flush_caps() to flush the dirty metadata back to
> the MDS when Fw caps are flushed back. try_flush_caps does this,
> however:
>
>         if (cap->session->s_state < CEPH_MDS_SESSION_OPEN) {
>                 spin_unlock(&ci->i_ceph_lock);
>                 goto out;
>         }
>

enum {
        CEPH_MDS_SESSION_NEW = 1,
        CEPH_MDS_SESSION_OPENING = 2,
        CEPH_MDS_SESSION_OPEN = 3,
        CEPH_MDS_SESSION_HUNG = 4,
        CEPH_MDS_SESSION_RESTARTING = 5,
        CEPH_MDS_SESSION_RECONNECTING = 6,
        CEPH_MDS_SESSION_CLOSING = 7,
        CEPH_MDS_SESSION_REJECTED = 8,
};

The value of the reconnect state (CEPH_MDS_SESSION_RECONNECTING = 6) is
larger than 2, so this check is not triggered while a session is
reconnecting.

> ...at that point, try_flush_caps will return 0, and set *ptid to 0 on
> the way out. ceph_fsync won't see that Fw is still dirty at that point
> and won't wait, returning without flushing metadata.
>
> Am I missing something that prevents this? I can open a tracker bug for
> this if it is a problem, but I wanted to be sure it was a bug before I
> did so.
>
> Thanks,
> --
> Jeff Layton <jlayton@xxxxxxxxxx>
>
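
To make the point above concrete, here is a minimal standalone sketch (not
the kcephfs code itself; the loop and printout are just for illustration)
that evaluates the try_flush_caps() early-return condition against each of
the session state values quoted above:

    /* Standalone illustration only -- not kernel code. It reuses the
     * session state values from above and checks which of them would
     * take the early return in try_flush_caps(). */
    #include <stdio.h>

    enum {
            CEPH_MDS_SESSION_NEW = 1,
            CEPH_MDS_SESSION_OPENING = 2,
            CEPH_MDS_SESSION_OPEN = 3,
            CEPH_MDS_SESSION_HUNG = 4,
            CEPH_MDS_SESSION_RESTARTING = 5,
            CEPH_MDS_SESSION_RECONNECTING = 6,
            CEPH_MDS_SESSION_CLOSING = 7,
            CEPH_MDS_SESSION_REJECTED = 8,
    };

    int main(void)
    {
            int s_state;

            for (s_state = CEPH_MDS_SESSION_NEW;
                 s_state <= CEPH_MDS_SESSION_REJECTED; s_state++)
                    printf("s_state=%d: early return %s\n", s_state,
                           s_state < CEPH_MDS_SESSION_OPEN ?
                           "taken" : "not taken");
            return 0;
    }

Only NEW (1) and OPENING (2) take the early return; RESTARTING (5) and
RECONNECTING (6) fall through, so try_flush_caps() does not bail out on
that branch while a session is reconnecting.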