On Wed, Sep 24, 2014 at 6:45 PM, Trond Myklebust
<trond.myklebust@xxxxxxxxxxxxxxx> wrote:
> On Wed, Sep 24, 2014 at 6:31 PM, Olga Kornievskaia <aglo@xxxxxxxxx> wrote:
>> On Wed, Sep 24, 2014 at 3:57 PM, Trond Myklebust
>> <trond.myklebust@xxxxxxxxxxxxxxx> wrote:
>>> Hi Olga,
>>>
>>> On Wed, Sep 24, 2014 at 2:20 PM, Olga Kornievskaia <aglo@xxxxxxxxx> wrote:
>>>> Hi Trond,
>>>>
>>>> nfs_delegation_claim_opens() return EAGAIN to nfs_end_delegation_return().
>>>> issync is always 0 (as its called by the
>>>> nfs_client_return_marked_delegations) and it breaks out of the loop...
>>>> as a result the error just doesn't get handled.
>>>
>>> Ah. OK, so this is being called from
>>> nfs_client_return_marked_delegations. That makes sense.
>>>
>>> So for that case, I'd expect the call to return to the loop in
>>> nfs4_state_manager(), and then to retry through that after doing
>>> whatever is needed to recover.
>>> Essentially, we should be setting NFS4CLNT_DELEGRETURN again, and then
>>> bouncing back into nfs_client_return_marked_delegations (after all the
>>> recovery work has been done).
>>
>> Yes I don't fully understand what it should be. It never does anything
>> about recovering from the lock error and simply returns the
>> delegation. Ok I don't know if it means anything to you, but the 2nd
>> time around (when it returns the delegation even though it hasn't
>> recovered the lock), it never goes into the
>> nfs4_open_delegation_recall() because stateid condition doesn't hold
>> true.
>>
>> If it's not too much trouble, could you explain why lock error
>> shouldn't be handled as I suggested instead of resending the open with
>> claim_cur over again. As I understand in your case, it'll be a series
>> of successful open with claim_cur paired with a failed lock with
>> err_grace. In my case, it'll be one open with claim_cur and a number
>> of lock with err_grace.
>
> There is only 1 state manager thread allowed per nfs_client (i.e. per
> server) and so we want to avoid having it busy wait in any one state
> handler. Doing so would basically mean that all other state recovery
> on that nfs_client is on hold; i.e. we could not deal with exceptions
> like ADMIN_REVOKED, CB_PATH_DOWN, etc until the busy wait is over.
> This is why that code has been designed to fall all the way back to
> nfs4_state_manager() in the event of any error/exception.

Ok, thanks. It makes sense. And makes things complicated. I'm sure
you'll beat me to figuring out why the error is not handled, but I'll
keep trying.

>
> --
> Trond Myklebust
>
> Linux NFS client maintainer, PrimaryData
>
> trond.myklebust@xxxxxxxxxxxxxxx
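
To make the point about not busy waiting concrete, here is a small
user-space sketch of the pattern Trond describes. It is not kernel code:
every name in it (the sim_* identifiers, the flag bits, the fields) is
made up for illustration, and the real nfs4_state_manager() is far more
involved. It assumes a single manager loop that owns a set of pending
work bits. A handler that hits a grace-period style error re-arms its
own bit and returns -EAGAIN instead of retrying in place, so an
unrelated exception raised in the meantime is handled on the very next
pass.

```c
#include <errno.h>

/* Hypothetical work bits; they only echo the names used in this thread. */
enum {
    SIM_DELEGRETURN   = 1u << 0, /* stands in for NFS4CLNT_DELEGRETURN */
    SIM_ADMIN_REVOKED = 1u << 1, /* stands in for an unrelated exception */
};

struct sim_client {
    unsigned int state;        /* pending work bits */
    int grace_rounds_left;     /* simulated server still refuses this many tries */
    int revoke_on_attempt;     /* raise SIM_ADMIN_REVOKED during this attempt */
    int delegreturn_attempts;  /* how often the delegation handler ran */
    int revoked_handled_after; /* attempts made when the revoke was handled */
};

/* One attempt at returning delegations. Never loops or sleeps on failure. */
static int sim_return_delegations(struct sim_client *clp)
{
    clp->delegreturn_attempts++;

    /* Simulate another exception arriving while this handler runs. */
    if (clp->delegreturn_attempts == clp->revoke_on_attempt)
        clp->state |= SIM_ADMIN_REVOKED;

    if (clp->grace_rounds_left > 0) {
        clp->grace_rounds_left--;
        /* Re-arm our own bit and fall back to the manager loop. */
        clp->state |= SIM_DELEGRETURN;
        return -EAGAIN;
    }
    return 0;
}

/* The single manager loop: returns the number of passes it needed. */
static int sim_state_manager(struct sim_client *clp)
{
    int passes = 0;

    while (clp->state != 0) {
        passes++;

        /* Exceptions are checked first on every pass. */
        if (clp->state & SIM_ADMIN_REVOKED) {
            clp->state &= ~SIM_ADMIN_REVOKED;
            clp->revoked_handled_after = clp->delegreturn_attempts;
            continue;
        }

        if (clp->state & SIM_DELEGRETURN) {
            clp->state &= ~SIM_DELEGRETURN;
            /* On -EAGAIN the handler has already re-armed the bit. */
            (void)sim_return_delegations(clp);
            continue;
        }
    }
    return passes;
}
```

With state = SIM_DELEGRETURN, grace_rounds_left = 3 and
revoke_on_attempt = 1, the simulated revoke is handled after the first
failed attempt rather than after the fourth, which is the whole reason
for bouncing back to the top-level loop. If the handler spun internally
until the grace period ended, the revoke would sit unhandled for that
entire time.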