> On May 20, 2016, at 1:38 PM, Olga Kornievskaia <aglo@xxxxxxxxx> wrote:
>
> Hi folks,
>
> I’m seeing a client behavior that I can’t explain the reason behind
> (i.e. spec) so want to ask for help.
>
> Question #1: Should a reclaim of open state handle BAD_SEQID the same
> way the normal open handles BAD_SEQID (which is: it just retries). I
> can see that during recovery we might not want the recovery to be
> trying forever and that’s why we fail? I don’t see anything in the
> spec that talk about this…
>
> Question #2: Incrementing seqid of open owner. I see that when LOCK
> receives an error BAD_STATEID/ADMIN_REVOKED, sequence isn’t
> incremented and (recovery) OPEN is sent with the same seqid. The code
> explicitly sets seqid unconfirmed. Again, I don’t know where to look
> in the spec for something like this. What happens is that server
> replies back with BAD_SEQID.

RFC 5661 doesn’t say it explicitly, but you only bump the seqid on
success.

3.3.12.  stateid4

   The starting value of the "seqid" field is undefined.  The server is
   required to increment the "seqid" field by one at each transition of
   the stateid.

8.2.1.  Stateid Types

   o  Stateids may represent sets of byte-range locks.  All locks held
      on a particular file by a particular owner and gotten under the
      aegis of a particular open file are associated with a single
      stateid with the seqid being incremented whenever LOCK and LOCKU
      operations affect that set of locks.

8.2.2.  Stateid Structure

   When such a set of locks is first created, the server returns a
   stateid with seqid value of one.  On subsequent operations that
   modify the set of locks, the server is required to increment the
   "seqid" field by one whenever it returns a stateid for the same
   state-owner/file/type combination and there is some change in the
   set of locks actually designated.  In this case, the server will
   return a stateid with an "other" field the same as previously used
   for that state-owner/file/type combination, with an incremented
   "seqid" field.  This pattern continues until the seqid is
   incremented past NFS4_UINT32_MAX, and one (not zero) is the next
   seqid value.

There is more of the same in 9.4, "Stateid Seqid Values and Byte-Range
Locks".

12.5.3 has this to say about layout stateids, while acknowledging they
are different from normal stateids:

   The correct "seqid" is defined as the highest "seqid" value from
   responses of fully processed LAYOUTGET or LAYOUTRETURN operations or
   arguments of a fully processed CB_LAYOUTRECALL operation.

I take “fully processed” to mean successful.

>
> So normally, when an open fails with BAD_SEQID the client retries but
> combine it with a LOCK that failed with BAD_STATEID. Then it leads to
> the application failing with “bad file descriptor” because after
> receiving an error for the lock and trying to do open recovery, the
> code doesn’t increment seqid and open fails with BAD_SEQID and we
> don’t retry the open. So now we don’t have a file descriptor and we
> fail the lock.

Look back at the sections above: they state that the seqid is only
changed if there is a change in the set of locks designated. The error
does not change the set of locks.

Your server vendor is having the same issues as you are in deciphering
the rules on seqid.

>
> Thank you.
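
To make the rule concrete, here is a minimal sketch of the stateid
seqid behavior as I read the RFC 5661 text quoted above. It is not
taken from any client or server implementation. The names next_seqid
and LockStateid are hypothetical, and the "no bump on error" branch
encodes my interpretation rather than an explicit statement in the
spec. It models only the stateid seqid described in sections 8.2.1 and
8.2.2.

```python
NFS4_UINT32_MAX = 0xFFFFFFFF


def next_seqid(seqid: int) -> int:
    """Advance a stateid seqid by one; past NFS4_UINT32_MAX the next value is 1, not 0."""
    return 1 if seqid == NFS4_UINT32_MAX else seqid + 1


class LockStateid:
    """Hypothetical model of the stateid for one state-owner/file lock set."""

    def __init__(self, other: bytes) -> None:
        self.other = other  # stays the same for the life of this lock set
        self.seqid = 1      # 8.2.2: first stateid for a new lock set has seqid one

    def apply(self, succeeded: bool, lock_set_changed: bool) -> int:
        """Return the seqid after a LOCK/LOCKU-style operation.

        Assumption: the seqid is bumped only when the operation succeeded
        and actually changed the set of locks; an error leaves it alone.
        """
        if succeeded and lock_set_changed:
            self.seqid = next_seqid(self.seqid)
        return self.seqid


sid = LockStateid(b"example-other")
sid.apply(succeeded=True, lock_set_changed=True)    # seqid goes 1 -> 2
sid.apply(succeeded=False, lock_set_changed=False)  # error: seqid stays 2
```

The wrap-around helper is kept separate so the "one (not zero)" rule
is visible on its own.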