Re: questions about open state recovery and open owner seq_id management

Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> · Fri, 20 May 2016 21:22:01 +0000

On 5/20/16, 17:11, "linux-nfs-owner@xxxxxxxxxxxxxxx on behalf of Thomas Haynes" <loghyr@xxxxxxxxxxxxxxx> wrote:

>
>> On May 20, 2016, at 1:38 PM, Olga Kornievskaia <aglo@xxxxxxxxx> wrote:
>> 
>> Hi folks,
>> 
>> I’m seeing client behavior whose justification I can’t find in the
>> spec, so I want to ask for help.
>> 
>> Question #1: Should a reclaim of open state handle BAD_SEQID the same
>> way a normal open handles BAD_SEQID (that is, by simply retrying)? I
>> can see that during recovery we might not want to keep retrying
>> forever; is that why we fail? I don’t see anything in the spec that
>> talks about this…
>> 
>> Question #2: Incrementing the seqid of the open owner. I see that
>> when LOCK receives a BAD_STATEID/ADMIN_REVOKED error, the seqid isn’t
>> incremented and the (recovery) OPEN is sent with the same seqid. The
>> code explicitly sets the seqid to unconfirmed. Again, I don’t know
>> where to look in the spec for something like this. What happens is
>> that the server replies with BAD_SEQID.
>
>RFC 5661
>
>It doesn’t say it explicitly, but you only bump the seqid on success.
>
>3.3.12.  stateid4
>
>  The starting value of the "seqid" field is
>   undefined.  The server is required to increment the "seqid" field by
>   one at each transition of the stateid. 
>
>
>8.2.1.  Stateid Types
>
>   o  Stateids may represent sets of byte-range locks.
>
>      All locks held on a particular file by a particular owner and
>      gotten under the aegis of a particular open file are associated
>      with a single stateid with the seqid being incremented whenever
>      LOCK and LOCKU operations affect that set of locks.
>
>8.2.2.  Stateid Structure
>
>   When such a set of locks is first created, the server returns a
>   stateid with seqid value of one.  On subsequent operations that
>   modify the set of locks, the server is required to increment the
>   "seqid" field by one whenever it returns a stateid for the same
>   state-owner/file/type combination and there is some change in the set
>   of locks actually designated.  In this case, the server will return a
>   stateid with an "other" field the same as previously used for that
>   state-owner/file/type combination, with an incremented "seqid" field.
>   This pattern continues until the seqid is incremented past
>   NFS4_UINT32_MAX, and one (not zero) is the next seqid value.
>
>More of the same in
>
>9.4.  Stateid Seqid Values and Byte-Range Locks
>
>12.5.3 has this to say about layout stateids, while acknowledging they
>are different from normal stateids:
>
>   The correct "seqid" is defined as the highest "seqid" value from
>   responses of fully processed LAYOUTGET or LAYOUTRETURN operations or
>   arguments of a fully processed CB_LAYOUTRECALL operation.
>
>I take “fully processed” to mean successful.
>
>
>> 
>> So normally, when an open fails with BAD_SEQID the client retries.
>> But combine that with a LOCK that failed with BAD_STATEID, and the
>> application fails with “bad file descriptor”: after receiving an
>> error for the lock and trying to do open recovery, the code doesn’t
>> increment the seqid, the open fails with BAD_SEQID, and we don’t
>> retry the open. So now we don’t have a file descriptor and we fail
>> the lock.
>
>
>Look back at the sections above: they state that the seqid only
>changes if there is a change in the set of locks designated. An
>error does not change the set of locks.
>
>Your server vendor is having the same issues as you are in deciphering
>the rules on seqid.
>

Oh, sorry. Did you mean NFSv4.1, Olga? If so, please disregard my earlier reply.

Trond
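
The stateid seqid rule quoted above from RFC 5661, Section 8.2.2, can be
sketched as follows. This is only a minimal illustration of the quoted
text, not code from the Linux client or from any server; the names
next_seqid and lock_set_changed are hypothetical.

```python
# Largest 32-bit value, as named in the quoted RFC text.
NFS4_UINT32_MAX = 0xFFFFFFFF

# Per the quoted text, a newly created set of locks starts at seqid one.
INITIAL_SEQID = 1


def next_seqid(seqid: int, lock_set_changed: bool) -> int:
    """Return the seqid a server would hand back for the same
    state-owner/file/type combination (hypothetical helper).

    The seqid only advances when the set of locks actually changes,
    so a failed operation leaves it untouched.
    """
    if not lock_set_changed:
        return seqid
    if seqid == NFS4_UINT32_MAX:
        # Wrap to one, not zero.
        return 1
    return seqid + 1
```

Under this reading, an operation that returns an error is modelled by
passing lock_set_changed=False, which leaves the seqid as it was.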
