On Wed, Jul 13, 2016 at 11:47:42AM -0600, Jason Gunthorpe wrote:
> On Wed, Jul 13, 2016 at 02:33:56AM -0700, Yuval Shaia wrote:
> > To avoid entering into endless loop when device can't poll CQE from CQ
> > driver should not reschedule if error is not -EAGAIN.
>
> ?? what causes ib_poll_cq to return an error?
>
> You need to describe the motivation here.

EAGAIN is fine: the HW driver returns it to indicate a temporary error,
and the caller should retry.
However, other errors (such as EINVAL) may indicate a fatal error that
the HW driver is unable to recover from.
Two examples:
- Mellanox folks may want to comment on whether the case where
  __mlx4_qp_lookup() returns NULL in mlx4_ib_poll_one() is fatal or not.
- At least from reading the code of c4iw_poll_cq_one(), it is clear that
  it may return a fatal error.

We must leave the HW driver some exit point to indicate a fatal and
unrecoverable state.

>
> Jason
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
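
To illustrate the distinction argued in the reply, here is a minimal
userspace sketch of the rescheduling decision. It is not the patch under
discussion and not kernel code: the names poll_action,
classify_poll_result() and run_poll_loop() are hypothetical, and the
return convention (positive = completions polled, 0 = CQ empty,
negative = errno) and the budget parameter are assumptions made for the
example.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical outcome of one poll attempt; not a kernel type. */
enum poll_action {
    POLL_DONE,        /* CQ drained, nothing more to do for now */
    POLL_RESCHEDULE,  /* budget exhausted or transient error: poll again */
    POLL_FATAL        /* unrecoverable error: stop polling */
};

/*
 * Classify the return value of a poll call (assumed convention):
 *   rc > 0  number of completions polled
 *   rc == 0 CQ was empty
 *   rc < 0  negative errno reported by the HW driver
 */
enum poll_action classify_poll_result(int rc, int budget)
{
    if (rc == -EAGAIN)
        return POLL_RESCHEDULE;   /* temporary condition, retry later */
    if (rc < 0)
        return POLL_FATAL;        /* e.g. -EINVAL: do not reschedule */
    if (rc >= budget)
        return POLL_RESCHEDULE;   /* more completions may be pending */
    return POLL_DONE;
}

/*
 * Simulated poll loop: consumes scripted return codes instead of calling
 * real hardware, and returns how many poll attempts were made before the
 * loop stopped rescheduling.
 */
size_t run_poll_loop(const int *script, size_t len, int budget)
{
    size_t polls = 0;

    while (polls < len) {
        enum poll_action act = classify_poll_result(script[polls], budget);

        polls++;
        if (act != POLL_RESCHEDULE)
            break;                /* done or fatal: leave the loop */
    }
    return polls;
}
```

With a script such as { -EAGAIN, -EAGAIN, -EINVAL, 5 }, the loop retries
on the two -EAGAIN results and stops at -EINVAL without consuming the
final entry. A loop that rescheduled on every negative return would keep
going for as long as the driver kept reporting the same error.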