Re: handling error on RECLAIM_COMPLETE

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> On Jan 24, 2017, at 14:50, Olga Kornievskaia <aglo@xxxxxxxxx> wrote:
> 
> On Tue, Jan 24, 2017 at 2:44 PM, Trond Myklebust
> <trondmy@xxxxxxxxxxxxxxx> wrote:
>> 
>>> On Jan 24, 2017, at 14:40, Olga Kornievskaia <aglo@xxxxxxxxx> wrote:
>>> 
>>> On Tue, Jan 24, 2017 at 2:12 PM, Trond Myklebust
>>> <trondmy@xxxxxxxxxxxxxxx> wrote:
>>>> 
>>>>> On Jan 24, 2017, at 14:06, Olga Kornievskaia <aglo@xxxxxxxxx> wrote:
>>>>> 
>>>>> Hi Trond,
>>>>> 
>>>>> Is there a reason that nfs4_proc_reclaim_complete() isn't  wrapped
>>>>> with a do while() to handle errors?
>>>> 
>>>> What do we not already handle correctly in nfs4_reclaim_complete_done()?
>>> 
>>> Could this be because when an error occurs rpc_done isn't called
>>> (rpc_release is called)? What I see is that if RECLAIM_COMPLETE gets
>>> an error (BAD_SESSION) the client just ignores it.
>>> 
>> 
>> That’s correct. Why do we need to handle BAD_SESSION there? We’re done with state recovery, so if the server rebooted, we can catch that later.
> 
> (1) don't we want to handle session errors as soon as possible?
> (2) I ran into a problem (not sure yet if reproducible) where I had a
> client stuck in an infinite loop of RECLAIM_COMPLETE being sent with
> reply of BAD_SESSION.
> 
> yes I don't know why the client is looping but it made me look into
> the fact that we are not handling session errors on reclaim complete
> which I simulated by having the server return BAD_SESSION to
> RECLAIM_COMPLETE and I see that client simply ignores it.

It doesn’t ignore it:

static int nfs41_reclaim_complete_handle_errors(struct rpc_task *task, struct nfs_client *clp)
{
        switch(task->tk_status) {
        case 0:
        case -NFS4ERR_COMPLETE_ALREADY:
        case -NFS4ERR_WRONG_CRED: /* What to do here? */
                break;
        case -NFS4ERR_DELAY:
                rpc_delay(task, NFS4_POLL_RETRY_MAX);
                /* fall through */
        case -NFS4ERR_RETRY_UNCACHED_REP:
                return -EAGAIN;
        default:
                nfs4_schedule_lease_recovery(clp);
        }
        return 0;
}

IOW: what the code does is schedule another round of lease recovery.

��.n��������+%������w��{.n�����{��w���jg��������ݢj����G�������j:+v���w�m������w�������h�����٥




[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux