Re: Potential lost receive WCs (was "[PATCH WIP 38/43]")

On Fri, Jul 24, 2015 at 04:26:00PM -0400, Chuck Lever wrote:
> Basically RPC work flow stopped because an RPC reply never
> arrived.

Oh, that is what I expect to see. Remember the CQ upcall is edge
triggered, so if you leave stuff in the CQ then you don't get another
upcall until another CQE is added. If adding another CQE is somehow
contingent on the CQE left behind then the scheme deadlocks.

The CQE is not lost because calling ib_poll_cq from outside the upcall
will return it.

To confirm a CQE is lost you need to see ib_poll_cq return no results
and confirm that an expected CQE is missing.

The driver is expected to avoid racing with the upcall and guarantee
new CQEs will trigger an upcall no matter how many CQEs are consumed
by the ULP.

So, as Steve said, if the ULP leaves CQEs behind then it must do
something to guarantee that ib_poll_cq is eventually called to collect
them, or not care about forward progress on the CQ.
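
To make the edge-triggered behaviour concrete, here is a rough userspace
sketch. It is a toy model, not the verbs API: struct toy_cq and every
toy_* function are names I made up for illustration, toy_poll_cq only
stands in for ib_poll_cq, and I am assuming a handler that polls a fixed
budget of CQEs per upcall and then re-arms.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_CQ_DEPTH 16

/* Toy model only: none of these names exist in the kernel verbs API. */
struct toy_cq {
	int entries[TOY_CQ_DEPTH];
	size_t head;	/* index of the next CQE to poll */
	size_t tail;	/* index of the next free slot */
	bool armed;	/* true if newly added CQEs raise an upcall */
	int budget;	/* CQEs the handler polls per upcall; <= 0 means drain */
	int upcalls;	/* number of upcalls delivered so far */
};

static void toy_cq_init(struct toy_cq *cq, int budget)
{
	assert(cq != NULL);
	cq->head = cq->tail = 0;
	cq->armed = true;
	cq->budget = budget;
	cq->upcalls = 0;
}

/* Number of CQEs currently sitting in the CQ. */
static size_t toy_pending(const struct toy_cq *cq)
{
	return cq->tail - cq->head;
}

/* Stand-in for ib_poll_cq: returns 1 and fills *wc, or 0 if the CQ is empty. */
static int toy_poll_cq(struct toy_cq *cq, int *wc)
{
	if (cq->head == cq->tail)
		return 0;
	*wc = cq->entries[cq->head++ % TOY_CQ_DEPTH];
	return 1;
}

/* The ULP's completion handler: polls at most 'budget' CQEs, then re-arms. */
static void toy_upcall(struct toy_cq *cq)
{
	int wc, polled = 0;

	cq->upcalls++;
	while ((cq->budget <= 0 || polled < cq->budget) && toy_poll_cq(cq, &wc))
		polled++;
	cq->armed = true;
}

/* Hardware side: 'n' CQEs land before the handler runs, then one edge fires. */
static void toy_add_cqes(struct toy_cq *cq, int n)
{
	int added = 0;

	while (added < n && toy_pending(cq) < TOY_CQ_DEPTH) {
		cq->entries[cq->tail % TOY_CQ_DEPTH] = (int)cq->tail;
		cq->tail++;
		added++;
	}
	/* Edge triggered: only the arrival of new CQEs raises the upcall. */
	if (added > 0 && cq->armed) {
		cq->armed = false;
		toy_upcall(cq);
	}
}

/* Polling from outside the upcall: collects whatever was left behind. */
static int toy_drain(struct toy_cq *cq)
{
	int wc, n = 0;

	while (toy_poll_cq(cq, &wc))
		n++;
	return n;
}
```

With a budget of 1, toy_add_cqes(&cq, 3) delivers a single upcall and
leaves two CQEs pending. Nothing more happens until another CQE is
added, yet toy_drain(&cq) still returns both of them. With a budget of
0 the handler drains the CQ and nothing is left behind.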

Does that make sense and explain what you saw?

If yes, I recommend revising the commit and comment language. CQEs are
not lost; only the upcall is missing.

Jason


