> On Mar 11, 2021, at 10:16 AM, Olga Kornievskaia <olga.kornievskaia@xxxxxxxxx> wrote: > > On Thu, Mar 11, 2021 at 10:10 AM Chuck Lever III <chuck.lever@xxxxxxxxxx> wrote: >> >> >> >>> On Mar 10, 2021, at 5:09 PM, Olga Kornievskaia <olga.kornievskaia@xxxxxxxxx> wrote: >>> >>> On Wed, Mar 10, 2021 at 9:47 AM J. Bruce Fields <bfields@xxxxxxxxxx> wrote: >>>> >>>> On Tue, Mar 09, 2021 at 08:59:51PM +0000, Trond Myklebust wrote: >>>>> On Tue, 2021-03-09 at 15:41 -0500, Olga Kornievskaia wrote: >>>>>> On Tue, Mar 9, 2021 at 3:22 PM Trond Myklebust < >>>>>> trondmy@xxxxxxxxxxxxxxx> wrote: >>>>>>> >>>>>>> On Tue, 2021-03-09 at 10:37 -0500, J. Bruce Fields wrote: >>>>>>>> On Tue, Mar 09, 2021 at 09:41:27AM -0500, Olga Kornievskaia >>>>>>>> wrote: >>>>>>>>> From: Olga Kornievskaia <kolga@xxxxxxxxxx> >>>>>>>>> >>>>>>>>> When the server tries to do a callback and a client fails it >>>>>>>>> due to >>>>>>>>> authentication problems, we need the server to set callback >>>>>>>>> down >>>>>>>>> flag in RENEW so that client can recover. >>>>>>>> >>>>>>>> I was looking at this. It looks to me like this should really be >>>>>>>> just: >>>>>>>> >>>>>>>> case 1: >>>>>>>> if (task->tk_status) >>>>>>>> nfsd4_mark_cb_down(clp, task->tk_status); >>>>>>>> >>>>>>>> If tk_status showed an error, and the ->done method doesn't >>>>>>>> return 0 >>>>>>>> to >>>>>>>> tell us it something worth retrying, then the callback failed >>>>>>>> permanently, so we should mark the callback path down, regardless >>>>>>>> of >>>>>>>> the >>>>>>>> exact error. >>>>>>> >>>>>>> I disagree. task->tk_status could be an unhandled NFSv4 error (see >>>>>>> nfsd4_cb_recall_done()). The client might, for instance, be in the >>>>>>> process of returning the delegation being recalled. Why should that >>>>>>> result in the callback channel being marked as down? >>>>>>> >>>>>> >>>>>> Are you talking about say the connection going down and server should >>>>>> just reconnect instead of recovering the callback channel. I assumed >>>>>> that connection break is something that's not recoverable by the >>>>>> callback but perhaps I'm wrong. >>>>> >>>>> No. I'm saying that nfsd4_cb_recall_done() will return a value of '1' >>>>> for both task->tk_status == -EBADHANDLE and -NFS4ERR_BAD_STATEID. I'm >>>>> not seeing why either of those errors should be handled by marking the >>>>> callback channel as being down. >>>>> >>>>> Looking further, it seems that the same function will also return '1' >>>>> without checking the value of task->tk_status if the delegation has >>>>> been revoked or returned. So that would mean that even NFS4ERR_DELAY >>>>> could trigger the call to nfsd4_mark_cb_down() with the above change. >>>> >>>> Yeah, OK, that's wrong, apologies. >>>> >>>> I'm just a little worried about the attempt to enumerate transport level >>>> errors in nfsd4_cb_done(). Are we sure that EIO, ETIMEDOUT, EACCESS is >>>> the right list? >>> >>> Looking at call_transmit_status error handling, I don't think >>> connection errors are returned. Instead the code tries to fix the >>> connection by retrying unless the rpc_timeout is reached and then only >>> EIO,TIMEDOUT is returned. >>> >>> Can then my original patch be considered without resubmission? >> >> Bruce has authorized v1 of this patch, but that one has the >> uncorrected patch description. Post a v4? > > v1's description is accurate. It reflects that only authentication > errors are handled. Nit: The short description isn't specific to NFSv4.0, as the v3 one was. >>>> --b. >>>> >>>>> >>>>>> >>>>>>>> >>>>>>>> --b. >>>>>>>> >>>>>>>>> >>>>>>>>> Signed-off-by: Olga Kornievskaia <kolga@xxxxxxxxxx> >>>>>>>>> --- >>>>>>>>> fs/nfsd/nfs4callback.c | 1 + >>>>>>>>> 1 file changed, 1 insertion(+) >>>>>>>>> >>>>>>>>> diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c >>>>>>>>> index 052be5bf9ef5..7325592b456e 100644 >>>>>>>>> --- a/fs/nfsd/nfs4callback.c >>>>>>>>> +++ b/fs/nfsd/nfs4callback.c >>>>>>>>> @@ -1189,6 +1189,7 @@ static void nfsd4_cb_done(struct rpc_task >>>>>>>>> *task, void *calldata) >>>>>>>>> switch (task->tk_status) { >>>>>>>>> case -EIO: >>>>>>>>> case -ETIMEDOUT: >>>>>>>>> + case -EACCES: >>>>>>>>> nfsd4_mark_cb_down(clp, task- >>>>>>>>>> tk_status); >>>>>>>>> } >>>>>>>>> break; >>>>>>>>> -- >>>>>>>>> 2.27.0 >>>>>>>>> >>>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Trond Myklebust >>>>>>> Linux NFS client maintainer, Hammerspace >>>>>>> trond.myklebust@xxxxxxxxxxxxxxx >>>>>>> >>>>>>> >>>>> >>>>> -- >>>>> Trond Myklebust >>>>> Linux NFS client maintainer, Hammerspace >>>>> trond.myklebust@xxxxxxxxxxxxxxx >> >> -- >> Chuck Lever -- Chuck Lever