Re: [PATCH rdma-next] Revert "IB/core: Add flow control to the portmapper netlink calls"

On Sun, 2017-06-04 at 21:23 -0500, Chien Tin Tung wrote:
> On Sun, Jun 04, 2017 at 08:36:35AM +0300, Leon Romanovsky wrote:
> > 
> > On Fri, Jun 02, 2017 at 11:28:49AM -0500, Shiraz Saleem wrote:
> > > 
> > > On Wed, May 31, 2017 at 02:10:31PM -0600, Bart Van Assche wrote:
> > > > 
> > > > On Wed, 2017-05-31 at 12:42 -0500, Shiraz Saleem wrote:
> > > > > 
> > > > > > 
> > > > > > 5. I proposed a solution -> go and fix your user space
> > > > > > program.
> > > > > 
> > > > > This is a kernel patch you are trying to revert, you are
> > > > > breaking existing
> > > > > kernel functionality.  Nothing to do with user space.
> > > > > 
> > > > > Bottom line, come up with a solution that will address both
> > > > > port mapper
> > > > > functionality and your issue.
> > > > 
> > > > Hello Shiraz,
> > > > 
> > > > Sorry that this means additional work for you, but I agree with
> > > > Leon that
> > > > user space software should not assume that netlink sockets are
> > > > a reliable
> > > > communication mechanism.
> > > 
> > > Hi Bart - Thank you for your response.
> > > 
> > > The original problem was that ibnl_unicast, which is used to send
> > > nl messages from
> > > portmapper kernel space to user-space, would occasionally and
> > > momentarily fail under stress.
> > > We could have retried the call for a certain amount of time, but
> > > since netlink_unicast has a
> > > nonblock/block parameter, we chose to use the blocking option
> > > with a timeout. So we thought we
> > > did account for deadlocks with this timeout.
> > 
> > Not really, you just reduced the chances. At very large scale, you
> > will have a very high chance of hitting such deadlocks.
> 
> Please stop using the word deadlock until you can prove that the
> deadlock exists with the timeout
> in place.

He doesn't need to use the word deadlock for you to know that if you
have a non-blocking function that is failing under load, and then you
replace it with a blocking function but with a timeout, then it can
also fail under load, and therefore you have not really solved the
problem.
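
To make that concrete, here is a minimal user-space sketch. It is not the
portmapper or ibnl_unicast code: it assumes an AF_UNIX socketpair whose
reader never drains as a stand-in for a stalled netlink listener, and the
helper names (fill_until_refused, send_with_timeout, run_demo) are made up
for illustration. It compares a non-blocking send against a blocking send
bounded by SO_SNDTIMEO once the queue is full.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

#define MSG_SIZE 1024

/* Hypothetical helper: push messages with MSG_DONTWAIT until the kernel
 * refuses one. Returns the errno of the refusal, or 0 if the (generous)
 * iteration limit was reached without a refusal. */
static int fill_until_refused(int fd)
{
    char msg[MSG_SIZE];

    memset(msg, 0, sizeof(msg));
    for (int i = 0; i < 1000000; i++) {
        if (send(fd, msg, sizeof(msg), MSG_DONTWAIT) < 0)
            return errno;
    }
    return 0;
}

/* Hypothetical helper: one blocking send, bounded by a send timeout.
 * Returns 0 on success or the errno of the failure. */
static int send_with_timeout(int fd, long timeout_ms)
{
    struct timeval tv;
    char msg[MSG_SIZE];

    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    memset(msg, 0, sizeof(msg));

    if (setsockopt(fd, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof(tv)) < 0)
        return errno;
    if (send(fd, msg, sizeof(msg), 0) < 0)
        return errno;
    return 0;
}

/* Simulate a receiver that is too busy to read: sv[1] stays open but is
 * never drained. Reports the error seen by each send variant.
 * Returns 0 if the experiment ran, -1 if the socketpair could not be made. */
static int run_demo(long timeout_ms, int *nonblock_err, int *timeout_err)
{
    int sv[2];

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    *nonblock_err = fill_until_refused(sv[0]);          /* fails at once */
    *timeout_err = send_with_timeout(sv[0], timeout_ms); /* fails later  */

    close(sv[0]);
    close(sv[1]);
    return 0;
}
```

Calling run_demo(100, &nb, &bl) should leave an EAGAIN-class error in both
nb and bl: the timeout only changes how long the sender waits before it
sees the failure. Either way the caller still needs a policy for the
failed send (retry, drop, or resynchronize), which is the part the timeout
does not provide.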

-- 
Doug Ledford <dledford@xxxxxxxxxx>
    GPG KeyID: B826A3330E572FDD
    Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD
