On Tue, 2018-04-10 at 11:55 +0200, Christoph Hellwig wrote: > I don't think we need the atomic_long_cmpxchg, and can do with a plain > cmpxhg. Also unterminated cmpxchg loops are a bad idea, but I think > both callers are protected from other changes so we can turn that > into a WARN_ON(). Hello Christoph, I will remove the blk_mq_rq_update_state() function so that's one while loop less. Can you explain why you think that using cmpxchg() is safe in this context? The regular completion path and the timeout code can both execute e.g. the following code concurrently: blk_mq_change_rq_state(rq, MQ_RQ_IN_FLIGHT, MQ_RQ_COMPLETE); That's why I think that we need an atomic compare and exchange instead of cmpxchg(). What I found in the Intel Software Developer Manual seems to confirm that: Description Compares the value in the AL, AX, EAX, or RAX register with the first operand (destination operand). If the two values are equal, the second operand (source operand) is loaded into the destination operand. Otherwise, the destination operand is loaded into the AL, AX, EAX or RAX register. RAX register is available only in 64-bit mode. This instruction can be used with a LOCK prefix to allow the instruction to be executed atomically. To simplify the interface to the processor’s bus, the destination operand receives a write cycle without regard to the result of the comparison. The destination operand is written back if the comparison fails; otherwise, the source operand is written into the destination. (The processor never produces a locked read without also producing a locked write.) Thanks, Bart.