On Thu, Jun 29, 2017 at 07:37:19PM +0300, Leon Romanovsky wrote: > On Thu, Jun 29, 2017 at 10:02:49AM -0500, Chien Tin Tung wrote: > > On Thu, Jun 29, 2017 at 08:04:36AM +0300, Leon Romanovsky wrote: > > > On Wed, Jun 28, 2017 at 03:30:03PM -0500, Chien Tin Tung wrote: > > > > On Wed, Jun 28, 2017 at 06:36:39PM +0300, Leon Romanovsky wrote: > > > > > On Wed, Jun 28, 2017 at 09:12:11AM -0500, Chien Tin Tung wrote: > > > > > > On Wed, Jun 28, 2017 at 09:02:45AM -0500, Mustafa Ismail wrote: > > > > > > > Add a wait/retry version of ibnl_unicast, ibnl_unicast_wait, > > > > > > > and modify ibnl_unicast to not wait/retry. This eliminates > > > > > > > the undesirable wait for future users of ibnl_unicast. > > > > > > > > > > > > > > Change Portmapper calls originating from kernel to user-space > > > > > > > to use ibnl_unicast_wait and take advantage of the wait/retry > > > > > > > logic in netlink_unicast. > > > > > > > > > > > > > > Signed-off-by: Mustafa Ismail <mustafa.ismail@xxxxxxxxx> > > > > > > > Signed-off-by: Chien Tin Tung <chien.tin.tung@xxxxxxxxx> > > > > > > > --- > > > > > > > drivers/infiniband/core/iwpm_msg.c | 6 +++--- > > > > > > > drivers/infiniband/core/netlink.c | 12 +++++++++++- > > > > > > > include/rdma/rdma_netlink.h | 10 ++++++++++ > > > > > > > 3 files changed, 24 insertions(+), 4 deletions(-) > > > > > > > > > > > > Please apply this patch instead of Leon's patch to revert > > > > > > "IB/core: Add flow control to the portmapper netlink calls". > > > > > > > > > > > > Leon, we can work out names and parameters if this works for you. > > > > > > > > > > Chien, > > > > > > > > > > The names are less my worries with this patch. First of all, it misleads > > > > > by using wait/retry naming, because it blocks and not waits. > > > > > > > > Nope. It does a single shot retry and waits in a waitqueue. > > > > Go look at netlink_unicast and in turn netlink_attachskb. If you still > > > > disagree, please flag specific code where it blocks. > > > > > > I agree, it wouldn't block in your scenario. However will it work in more hostile > > > environments? > > > > The reason behind the original patch was to make it more tolerable to "hostile" > > environments by giving it one more try allocate buffer to send the message. > > So yes, it does work in more hostile environment. > > Are you really calling extra load as a "hostile environment"? I'm not playing 20 questions (https://en.wikipedia.org/wiki/Twenty_Questions). If you have something in mind say so. If you are referring to your malicious user attack scenario see my response below. > > > For example, malicious user can open RDMA netlink socket directly (socket(...)), > > > set sndtimeo to be MAX_SCHEDULE_TIMEOUT - 1 (LONG_MAX - 1) > > > > How do you open the _kernel_ RDMA netlink socket directly from user-space and > > set the timeout value? > > Exactly as libnl, libmnl and rdma-core do. The netlink socket is > accessible from the user space without root access. It is the whole > purpose of it. Wow, you just uncovered a big security hole. Isn't this beyond a simple timeout value? Wouldn't the ability to set the socket to _blocking_ blow your top? > > > and send custom > > > netlink messages right to your new _wait function. If I understand correctly > > > from the code, it will add them to waitqueue and won't release skb till > > > the end of processing. > > > > The only caller of ibnl_unicast_wait is portmapper kernel client. The call is > > initiated from kernel not in response to user message so you did not understand > > the code correctly. > > Maybe and maybe you didn't read your code at all. I'm gonna give you a pass on this one. > Calls to _unicast are triggered from kernel for LS only, in you case as > an example the request to get RDMA_NL_IWPM_MAPINFO will trigger ibnl_unicast_wait. LS? I don't have your acronym database in my head... With Mustafa's patch, this is the flow with RDMA_NL_IWPM_MAPINFO from user-space into the kernel: iwpm_mapping_info_cb() iwpm_send_mapinfo() send_mapinfo_num() send_nlmsg_done() Both send_mapinfo_num and sen_nlmsg_done call ibnl_unicast() not ibnl_unicast_wait(). I know it is late where you are. So let's talk about the spirit of the patch. Anything that is in response to user-space should use ibnl_unicast. If you find something that's using ibnl_unicast_wait, then the patch is wrong and will be corrected. Now, is there actually one? Please include call flow if you find one. > > > Will it cause to mark whole netlink socket as NETLINK_S_CONGESTED? > > > > No. It has nothing to do with the new function or relevant to this discussion. > > Again, read the source, the netlink socket overrun will set that flag. Sorry, this is still beyond me. What's the connection to wait/retry? How is using wait/retry impacting setting of this flag? Are you back to your malicious user attack scenario? See response below. > > > Will other users will be able to progress with their messages or they > > > will need to wait till those _wait calls finish? > > > > Short answer is yes. With this patch other users of ibnl_unicast would not > > get the retry/wait. If you choose to use the wait version, obviousely you will > > have retry/wait. For the work you are doing, you should stick with ibnl_unicast > > and there's no impact to your code. This is the reason for the two functions as > > proposed originally. > > Again, read the code and my proposed scenario. User without root > permissions can open netlink socket (I'm doing it in rdmatool) and can > send any message he wants. He can and will overflow the calls with > ibnl_unicast_wait, so no other user will have resources available. So, in the spirit of the patch, this cannot happen. End of story. Ignoring rest of your email. > > > > > The second, I disagree with solution in kernel for user space application which can't > > > > > handle the netlink errors. > > > > > > > > There is no guarantee delivery nor blocking on send. Like I mentioned above, > > > > it is a 1 shot retry with a set wait time. The code obviousely handles error > > > > condition as it can happen. > > > > > > So, can you please refresh our memory and explain again what exactly > > > this patch is fixing if user-space handles errors correctly? > > > > You assertion of user-space flaw is incorrect. It is based on incorrect > > conclusion of deadlock and blocking. > > > > There is no deadlock and no blocking, period. > > That question doesn't assume anything. I'm still asking what exactly are > you fixing? > > > > > > So what's the issue with taking this patch instead of the revert? > > It doesn't fix anything broken in kernel. > > > > > Chien > > -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html