On Wed, 2017-08-02 at 11:08 -0600, Jason Gunthorpe wrote:
> On Wed, Aug 02, 2017 at 04:51:01PM +0000, Bart Van Assche wrote:
>
> > Although I do not object against the "RDMA/core: Add wait/retry version
> > of ibnl_unicast" patch, I hope that you realize that it is an ugly hack
> > instead of a proper fix. Anything that makes user space wait longer than
> > the socket timeout, e.g. heavy swapping activity or running the user
> > space software under a debugger, will cause delivery of the netlink
> > message from kernel to user to fail anyway.
>
> Yes, I assume the iwpm people are aware of this as well, this is why I
> used the phrase 'minimize loss'.
>
> I'm not sure I'd call it a hack though. There is no proper fix here,
> messages from kernel to user space must be delivered and processed for
> iwpm to function as designed. Maximizing the chance of delivering
> unsolicited messages via blocking netlink_unicast is going to be a
> necessary component of any design...
>
> The big point in all of this discussion is that none of this allows
> iwpmd to ignore lost messages and it needs to have a sane well thought
> out process for resynchronizing with the kernel. Particularly, the
> flow that causes sockets to closed must have some kind of resync.
>
> But even if all that loss resynchronization works perfectly, it is
> still very desirable to avoid triggering it!

Hello Jason,

Because it is not possible to make the current iwpm netlink code work 100%
reliably, my proposal is to add a new and redesigned interface for iwpm
communication between user space and kernel and to deprecate the current
API.

Bart.