On Tue, Sep 29, 2020 at 4:55 PM Ilya Dryomov <idryomov@xxxxxxxxx> wrote:
>
> On Tue, Sep 29, 2020 at 10:28 AM Yan, Zheng <ukernel@xxxxxxxxx> wrote:
> >
> > On Fri, Sep 25, 2020 at 10:08 PM Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> > >
> > > Ilya noticed that he would get spurious EACCES errors on calls done just
> > > after blocklisting the client on mounts with recover_session=clean. The
> > > session would get marked as REJECTED and that caused in-flight calls to
> > > die with EACCES. This patchset seems to smooth over the problem, but I'm
> > > not fully convinced it's the right approach.
> > >
> >
> > the root cause is that client does not recover session instantly
> > after getting rejected by mds. Before session gets recovered, client
> > continues to return error.
>
> Hi Zheng,
>
> I don't think it's about whether that happens instantly or not.
> In the example from [1], the first "ls" would fail even if issued
> minutes after the session reject message and the reconnect.  From
> the user's POV it is well after the automatic recovery promised by
> recover_session=clean.
>
> [1] https://tracker.ceph.com/issues/47385

Reconnect should close all old sessions. It's likely because the client
didn't detect that it's blacklisted.

>
> Thanks,
>
>                 Ilya
>
> >
> > >
> > > The potential issue I see is that the client could take cap references to
> > > do a call on a session that has been blocklisted. We then queue the
> > > message and reestablish the session, but we may not have been granted
> > > the same caps by the MDS at that point.
> > >
> > > If this is a problem, then we probably need to rework it so that we
> > > return a distinct error code in this situation and have the upper layers
> > > issue a completely new mds request (with new cap refs, etc.)
> > >
> > > Obviously, that's a much more invasive approach though, so it would be
> > > nice to avoid that if this would suffice.
> > >
> > > Jeff Layton (4):
> > >   ceph: don't WARN when removing caps due to blocklisting
> > >   ceph: don't mark mount as SHUTDOWN when recovering session
> > >   ceph: remove timeout on allowing reconnect after blocklisting
> > >   ceph: queue request when CLEANRECOVER is set
> > >
> > >  fs/ceph/caps.c       |  2 +-
> > >  fs/ceph/mds_client.c | 10 ++++------
> > >  fs/ceph/super.c      | 13 +++++++++----
> > >  fs/ceph/super.h      |  1 -
> > >  4 files changed, 14 insertions(+), 12 deletions(-)
> > >
> > > --
> > > 2.26.2
> > >