On 6/4/2020 5:21 PM, Olga Kornievskaia wrote:
Hi Trond, There is a problem with interrupted slots (yet again). We send an operation to the server and it gets interrupted by a signal. We used to send a sole SEQUENCE to remove the problem of having a real operation get a reply out of the cache and fail. Now we are not doing that (since 3453d5708 "NFSv4.1: Avoid false retries when RPC calls are interrupted"). So the problem is: we bump the sequence on the next use of the slot, and get SEQ_MISORDERED.
Misordered? It sounds like the client isn't managing the sequence number, or perhaps the server never saw the original request, and is being overly strict.
We decrement the number back to that of the interrupted operation. This gets us a reply out of the cache. We again fail with a REMOTE EIO error.
Ew. The client *decrements* the sequence? Tom.
Going back to the commit's message: I don't see the logic behind the claim that the server can't tell whether this is a new call or the old one. We used to send a lone SEQUENCE as a way to protect against reuse of the slot by a normal operation. An interrupted operation on the slot couldn't have been another lone SEQUENCE. So I don't see how the server can't tell the difference between a SEQUENCE and any other operation.
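
For reference, here is a rough sketch of the per-slot sequence check being discussed, as I understand it from the NFSv4.1 spec: a sequence ID one above the slot's last executed value is a new request, an equal value is a retry answered from the reply cache, and anything else is misordered. The names (Slot, server_sequence) are made up for illustration and are not taken from the Linux client or any server. Sequence ID wraparound is ignored, and the sketch does not try to detect a retry whose arguments differ from the cached request, which a real server may reject as a false retry.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Slot:
    """Hypothetical server-side view of one session slot."""
    seqid: int = 0                      # last sequence ID executed on this slot
    cached_reply: Optional[str] = None  # reply kept for retransmissions


def server_sequence(slot: Slot, seqid: int, op: str) -> Tuple[str, Optional[str]]:
    """Classify an incoming request by its sequence ID (wraparound ignored)."""
    if seqid == slot.seqid + 1:
        # New request: execute it and remember the reply.
        slot.seqid = seqid
        slot.cached_reply = f"result of {op}"
        return "NFS4_OK", slot.cached_reply
    if seqid == slot.seqid and slot.cached_reply is not None:
        # Retry: answer from the cache without looking at the operation.
        return "NFS4_OK", slot.cached_reply
    return "NFS4ERR_SEQ_MISORDERED", None


slot = Slot()
first = server_sequence(slot, 1, "OPEN")      # normal first use of the slot

# Suppose seqid 2 was interrupted and never reached the server.
bumped = server_sequence(slot, 3, "GETATTR")  # client bumped past it
resent = server_sequence(slot, 2, "GETATTR")  # client went back to 2

# Now suppose the client reuses seqid 2 for a different operation.
# This naive check hands back the cached GETATTR reply.
reused = server_sequence(slot, 2, "READ")
```

With this model, the bumped request is rejected as misordered, and reusing an already executed sequence ID for a different operation returns the cached reply of the earlier one.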