Re: [RFC PATCH 0/4] ceph: fix spurious recover_session=clean errors

On Tue, Sep 29, 2020 at 12:44 PM Yan, Zheng <ukernel@xxxxxxxxx> wrote:
>
> On Tue, Sep 29, 2020 at 4:55 PM Ilya Dryomov <idryomov@xxxxxxxxx> wrote:
> >
> > On Tue, Sep 29, 2020 at 10:28 AM Yan, Zheng <ukernel@xxxxxxxxx> wrote:
> > >
> > > On Fri, Sep 25, 2020 at 10:08 PM Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> > > >
> > > > Ilya noticed that he would get spurious EACCES errors on calls done just
> > > > after blocklisting the client on mounts with recover_session=clean. The
> > > > session would get marked as REJECTED and that caused in-flight calls to
> > > > die with EACCES. This patchset seems to smooth over the problem, but I'm
> > > > not fully convinced it's the right approach.
> > > >
> > >
> > > The root cause is that the client does not recover the session instantly
> > > after getting rejected by the MDS. Until the session is recovered, the
> > > client continues to return errors.
> >
> > Hi Zheng,
> >
> > I don't think it's about whether that happens instantly or not.
> > In the example from [1], the first "ls" would fail even if issued
> > minutes after the session reject message and the reconnect.  From
> > the user's POV it is well after the automatic recovery promised by
> > recover_session=clean.
> >
> > [1] https://tracker.ceph.com/issues/47385
>
> Reconnect should close all old sessions. It's likely because the
> client didn't detect that it's blacklisted.

Sorry, I should have pasted dmesg there as well.  It _does_ detect
blacklisting -- notice that I wrote "after the session reject message
and the reconnect".

Thanks,

                Ilya


