On Wed, 2017-12-13 at 13:32 +0100, Nicolas Morey-Chaisemartin wrote:
> SM traps are polled through poll_cq which waited for a CQ event
> before polling the CQ itself.
> However it may happen that multiple completions are attached
> to a single event. As stated by the ibv_get_cq_event man page,
> it is required to poll the CQ to get those completions
> after the call to ibv_req_notify_cq.
>
> As completions need to be handled one by one in an outer function,
> start by polling the CQ and return the completion (if any) before
> waiting for the next completion event.
> This will allow emptying all pending completions, through multiple calls
> to poll_cq, before waiting for a new event.
>
> The buggy use case seems to appear when the master SM is switched multiple
> times between two nodes. As the number of ping-pongs between the SMs increases,
> the number of traps sent to notify that the SM just became master increases
> too. This causes bursts of completions linked to a single event.
> Note that the race condition is also possible in other scenarios.

Reviewed-by: Bart Van Assche <bart.vanassche@xxxxxxx>
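
For readers following the thread, the ordering issue described in the quoted text can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the patch itself: struct fake_cq, fake_poll, fake_wait_event, poll_cq_wait_first, poll_cq_poll_first and drain are all hypothetical names. fake_poll stands in for ibv_poll_cq with a single work-completion slot, and fake_wait_event stands in for the blocking ibv_get_cq_event / ibv_ack_cq_events / ibv_req_notify_cq sequence, returning -1 where the real call would block.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a CQ and its completion channel. */
struct fake_cq {
    int pending; /* completions currently queued in the CQ */
    int events;  /* completion events not yet consumed */
};

/* Stand-in for ibv_poll_cq(cq, 1, &wc): take at most one completion. */
static int fake_poll(struct fake_cq *cq)
{
    assert(cq != NULL);
    if (cq->pending == 0)
        return 0;
    cq->pending--;
    return 1;
}

/*
 * Stand-in for waiting on the completion channel and re-arming it.
 * Returns -1 where the real code would block because no event is pending.
 */
static int fake_wait_event(struct fake_cq *cq)
{
    if (cq->events == 0)
        return -1;
    cq->events--;
    return 0;
}

/* Old ordering: wait for an event first, then poll once. */
static int poll_cq_wait_first(struct fake_cq *cq)
{
    if (fake_wait_event(cq) < 0)
        return -1;
    return fake_poll(cq);
}

/* New ordering: poll first, wait only when the CQ is empty. */
static int poll_cq_poll_first(struct fake_cq *cq)
{
    for (;;) {
        int n = fake_poll(cq);
        if (n > 0)
            return n;
        if (fake_wait_event(cq) < 0)
            return -1;
    }
}

/* Outer loop handling completions one by one until the poller would block. */
static int drain(struct fake_cq *cq, int (*poll_fn)(struct fake_cq *))
{
    int handled = 0;
    while (poll_fn(cq) == 1)
        handled++;
    return handled;
}
```

With three completions attached to a single event, draining with poll_cq_wait_first handles one completion and then would block with two still queued, whereas poll_cq_poll_first hands all three to the outer loop before it waits.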