Thanks Doug Ledford & Tom. I've found that QP is force switched into Error status to flush outstandting WQEs into CQ with WR_FLUSH_ERR status. On 14:47 Wed 21 Aug, Doug Ledford wrote: > On Wed, 2019-08-21 at 23:38 +0800, Liu, Changcheng wrote: > > On 09:36 Wed 21 Aug, Tom Talpey wrote: > > > On 8/21/2019 8:09 AM, Liu, Changcheng wrote: > > > > Hi all, > > > > In one system, it always frequently hit "IBV_WC_WR_FLUSH_ERR" > > > > in the WCE(work completion element) polled from completion queue > > > > bound with RQ(Receive Queue). > > > > Does anyone has some idea to debug "IBV_WC_WR_FLUSH_ERR" > > > > problem? > > > > > > > > With CX314A/40Gb NIC, I hit this error when using RC transport > > > > type with only Send Operation(IBV_WR_SEND) WR(work request) on > > > > SQ(Send Queue). > > > > Every WR only has one SGE(scatter/gather element) and all the > > > > SGE on RQ has the same size. The SGE size in SQ WR is not greater > > > > than the SGE size in RQ WR. > > > > > > > > There’s one explanation about IBV_WC_WR_FLUSH_ERR on page 114 > > > > in the "RDMA Aware Networks Programming User Manual" > > > > http://www.mellanox.com/related-docs/prod_software/RDMA_Aware_Programming_user_manual.pdf > > > > But I still didn't understand it well. How to trigger this > > > > error with a short demo program? > > > > " > > > > IBV_WC_WR_FLUSH_ERR > > > > This event is generated when an invalid remote error is > > > > thrown when the responder detects an > > > > invalid request. It may be that the operation is not > > > > supported by the request queue or there is > > > > insufficient buffer space to receive the request. > > > > " > > > > > > The most common reason for a flushed work request is loss of > > > the connection to the remote peer. This can be caused by any > > > number of conditions. > > Good diretion. I'll debug it in this way first. > > > The second-most common is a programming error in the upper > > > layer protocol. A shortage of posted receives on either peer, > > > a protection error on some buffer, etc. > > Do you mean the protection key such as l_key/r_key isn't set well? > > What's kind of protection error could trigger IBV_WC_WR_FLUSH_ERR? > > FLUSH_ERR is the error used whenever a queue pair goes into an error > state and there are still WQEs posted to the queue pair. All > outstanding WQEs are returned with the state IBV_WC_WR_FLUSH_ERR. This > is how you make sure you don't loose WQEs when the QP hits an error > state. So, literally *anything* that can cause a QP to go into an ERROR > state will result in all WQEs currently posted to the QP being sent back > with this FLUSH_ERR. FLUSH_ERR literally just means that the card is > flushing out the QP's work queue because now that the QP is in an error > state it can't process the WQEs and, presumably, the application needs > to know which ones completed and which ones didn't so it knows what to > requeue once the QP is no longer in an error state. > > As Tom has already pointed out, all of these things will throw the queue > pair into an error state and cause all posted WQEs to be flushed with > the FLUSH_ERR condition: > > 1) Loss of queue pair connection > 2) Any memory permission violation (attempt to write to read only > memory, attempt to RDMA read/write to an invalid rkey, etc) > 3) Receipt of any post_send message without a waiting post_recv buffer > to accept the message > 4) Receipt of a post_send message that is too large to fit in the first > available post_recv buffer > > A common cause of this sort of thing is when you don't do proper flow > control on the queue pair and the sending side floods the receiving side > and runs it out of posted recv WQEs. Although, in your case, you did > say this was happening on the receive queue, so that implies this is > happening on the receiving side, so if that is what's happenining here, > the process would have to be something like: > > sender starts sending data (maybe without any flow control) > receiver starts receiving data and refilling buffers > ... > receiver runs totally dry of buffers and gets an incoming recv > causing qp to go into error state > > receiver then posts refill buffers to the RQ after the QP > went into error state but before acknowledging the error state > and shutting down the recv processing thread > > all recv buffers posted as WQEs are flushed back to the process > with FLUSH_ERR because they were posted to a QP in ERROR state > > > > If you're looking to actually trigger this error for testing, > > > well, try one of the above. If you're trying to figure out > > > why it's happening, that can take some digging, but not in > > > the RDMA stack, typically. > > Many thanks. > > > > --Changcheng > > > Tom. > > > > > -- > Doug Ledford <dledford@xxxxxxxxxx> > GPG KeyID: B826A3330E572FDD > Fingerprint = AE6B 1BDA 122B 23B4 265B 1274 B826 A333 0E57 2FDD