On 3/30/2021 5:01 PM, Bob Pearson wrote:
Jason,

Somewhere in Dotan's blog I saw him say that if you put a QP in the reset state that it
  - clears the SQ and RQ (if not SRQ) *AND*
  - also clears the completion queues
I don't think that second bullet is correct. As you point out, the CQ may hold other entries that are not from this QP. The volume 1 spec does say this about QP destroy in section 10.2.4.4:
It is good programming practice to modify the QP into the Error state and retrieve the relevant CQEs prior to destroying the QP. Destroying a QP does not guarantee that CQEs of that QP are deallocated from the CQ upon destruction. Even if the CQEs are already on the CQ, it might not be possible to retrieve them. It is good programming practice not to make any assumption on the number of CQEs in the CQ when destroying a QP. In order to avoid CQ overflow, it is recommended that all CQEs of the destroyed QP are retrieved from the CQ associated with it before resizing the CQ, attaching a new QP to the CQ or reopening the QP, if the CQ capacity is limited.
There's additional supporting text in 10.3.1 around this. The QP is always transitioned to Error, then CQEs drained, then QP to Reset.
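To make that ordering concrete, here is a toy model of Error, then drain, then Reset on a CQ shared by two QPs. It is plain Python, not pyverbs or rxe code: ToyCQ, ToyQP and drain_qp are hypothetical names invented for this sketch, and it assumes (per the reading of the spec above) that moving a QP to Reset leaves the CQ untouched.

```python
from collections import deque
from dataclasses import dataclass, field

SUCCESS = "SUCCESS"
FLUSH_ERR = "WR_FLUSH_ERR"


@dataclass
class ToyCQ:
    """Hypothetical stand-in for a completion queue shared by several QPs."""
    entries: deque = field(default_factory=deque)

    def poll(self):
        # Return the oldest CQE, or None when the CQ is empty.
        return self.entries.popleft() if self.entries else None


@dataclass
class ToyQP:
    """Hypothetical stand-in for a QP; only models state and pending WRs."""
    qp_num: int
    cq: ToyCQ
    state: str = "RTS"
    pending: list = field(default_factory=list)

    def post(self, wr_id):
        self.pending.append(wr_id)

    def complete_one(self):
        # Oldest pending WR completes successfully.
        self.cq.entries.append((self.qp_num, self.pending.pop(0), SUCCESS))

    def to_error(self):
        # Outstanding WRs are flushed: each one produces an error CQE.
        self.state = "ERR"
        while self.pending:
            self.cq.entries.append((self.qp_num, self.pending.pop(0), FLUSH_ERR))

    def to_reset(self):
        # Assumption of this sketch: Reset does NOT remove CQEs from the CQ.
        self.state = "RESET"
        self.pending.clear()


def drain_qp(cq, qp_num):
    """Poll the CQ until empty; split CQEs into this QP's and everyone else's."""
    mine, others = [], []
    cqe = cq.poll()
    while cqe is not None:
        (mine if cqe[0] == qp_num else others).append(cqe)
        cqe = cq.poll()
    return mine, others


cq = ToyCQ()
qp_a, qp_b = ToyQP(1, cq), ToyQP(2, cq)
qp_a.post(10)
qp_a.post(11)
qp_b.post(20)
qp_b.complete_one()

qp_a.to_error()                            # 1. QP to Error
stale, others = drain_qp(cq, qp_a.qp_num)  # 2. drain the CQEs
qp_a.to_reset()                            # 3. QP to Reset
```

The CQEs returned in `others` belong to a QP that is still live, so a real test would have to hand them to their owner rather than drop them. That is the tricky part with a shared CQ.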
Rxe does nothing special when moving a QP to reset. A few of the Python negative test cases intentionally force the QP into the error state and then the reset state before modifying it back to RTS. They fail if they then do more work and expect it to succeed, but instead get flush errors left over from the earlier failed test. I have not found anything in the IBA that mentions this but it could be there. This will be a little tricky if the CQ is shared between more than one QP. But easier for me to fix than changing the Python code.
I think this sounds like a test issue. Tom.
Do you know how this is supposed to work? bob