RE: [EXPERIMENTAL v1 0/4] RDMA loopback device

> -----Original Message-----
> From: Leon Romanovsky <leon@xxxxxxxxxx>
> Sent: Thursday, February 28, 2019 7:22 AM
> To: Dennis Dalessandro <dennis.dalessandro@xxxxxxxxx>
> Cc: Parav Pandit <parav@xxxxxxxxxxxx>; bvanassche@xxxxxxx; linux-rdma@xxxxxxxxxxxxxxx
> Subject: Re: [EXPERIMENTAL v1 0/4] RDMA loopback device
> 
> On Thu, Feb 28, 2019 at 07:39:25AM -0500, Dennis Dalessandro wrote:
> > On 2/27/2019 2:49 PM, Parav Pandit wrote:
> > >
> > >
> > > > -----Original Message-----
> > > > From: Leon Romanovsky <leon@xxxxxxxxxx>
> > > > Sent: Wednesday, February 27, 2019 1:56 AM
> > > > To: Parav Pandit <parav@xxxxxxxxxxxx>
> > > > Cc: bvanassche@xxxxxxx; linux-rdma@xxxxxxxxxxxxxxx
> > > > Subject: Re: [EXPERIMENTAL v1 0/4] RDMA loopback device
> > > >
> > > > On Wed, Feb 27, 2019 at 12:27:13AM -0600, Parav Pandit wrote:
> > > > > This patchset adds RDMA loopback driver.
> > > > > Initially for RoCE which works on lo netdevice.
> > > > >
> > > > > > It is tested with nvme fabrics over ext4, perftests, and rping.
> > > > > It only supports RC and GSI QPs.
> > > > > > It supports only RoCEv2 GIDs which belong to the loopback lo netdevice.
> > > > >
> > > > > It is only posted for discussion [1].
> > > > > It is not yet ready for RFC posting or merge.
> > > >
> > > > Which type of discussion do you expect?
> > > Continuation of [1].
> > > > And can you give a brief explanation of why it wasn't enough to extend rxe/siw?
> > > >
> > > Adding lo netdev to rxe is certainly an option along with cma patch in this series.
> > >
> > > The qp state machine is built around spin locks.
> > > Its pools don't use the xarray that loopback uses and siw intends to use.
> > >
> > > Incidentally, 5.0.0.rc5 rxe crashes on registering memory. Didn't have inspiration to supply a patch.
> >
> > If rxe crashes we may want to fix it rather than creating a whole new
> > driver.
> 
> Agree
> 
> >
> > > However rxe as it stands today, after several fixes from many people, is still not there.
> > > It leaks the consumer index to user space, and I am not sure of the effect of that. Jason did talk about some of the security concerns, which I don't recall.
> > > A while back when I reviewed the code, I saw things that might crash the kernel.
> >
> > > Users complain of memory leaks, rnr retries dropping connections..
> >
> > If rxe is so broken, and there is no interest in fixing it, why do we
> > still have it? Should we just excise it from the tree?
> 
> Because reality is not as bad as Parav sees it.
> 
> Parav is speaking from his experience where I forwarded to him results of
> our regression runs over RXE, while those runs accumulated years of
> experience and checks of corner cases. Most people who are using RXE will
> never hit them.
>
More than that, I have user complaints about memory leaks and connection drops on a single system.
 
> >
> > > Giving low priority to most of them, I think the reasons to want a loopback rdma device are below.
> > > 1. rxe is not ready for adding IB link types and the large code restructure needed to avoid skb processing in it. It is a pretty large rewrite to skip skbs.
> > > 2. stability and reasonable performance
> > > 3. maintainability
> >
> > I don't see how this is more maintainable. We are adding a new driver,
> > a new user space provider. So I don't see that as being a reason for adding this.
> 
> Agree too, it is so tempting to write something new instead of fixing.
>
So let's make siw reuse rxe, or refactor rxe to fit siw's needs.

I see the loopback driver as the rdma counterpart of the netdev lo device and the block layer's null_blk driver, supporting both IB and RoCE links.
In the future the loopback driver would also create these loopback devices in containers when it is loaded, without special user involvement, similar to netdev lo.
That way rdma gets the same level of default support as the net stack.

> >
> > > But if you think rxe is solid, siw should refactor the rxe code and start using most pieces from there, split into a library for roce and iw.
> > > Once that layering is done, maybe loopback can fit in as a different L4, so that rxe uses skb, siw uses sockets, loopback uses memcpy.
> >
> > This is why rxe should have used rdmavt from the beginning and we
> > would pretty much have such a library.
> >
> > > Loopback's helper.c is intended to share code with siw for table resources as xarray.
> > > It also offers complete kernel level handling of data and control path commands and published perf numbers.
> >
> > We can debate back and forth whether this needed to be included in siw
> > and rxe, or if it and the others should have used rdmavt. However, I
> > think this is different enough of an approach that it does stand on
> > its own and could in fact be a new driver.
> >

> > The fact that rxe is broken and no one seems to want to fix it
> > shouldn't be our reason though.
> 
> The thing is that many people heard Jason complain about security issues
> with RXE, but the problem is that not many heard the full explanation. I
> didn't hear it either.
> 
> Thanks
> 
> >
> > -Denny
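
On the layering point above (rxe uses skb, siw uses sockets, loopback uses memcpy), here is a minimal userspace sketch of what such a split could look like. It is only an illustration: xport_ops, lb_peer, lb_xmit and lib_post_send are hypothetical names, not code from this series or from rxe/siw/rdmavt, and a real receive side would consume a posted receive WQE rather than a fixed buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-transport hook; not an existing kernel interface. */
struct xport_ops {
	const char *name;
	/* Deliver len bytes to the peer; 0 on success, negative on error. */
	int (*xmit)(void *peer_ctx, const void *payload, size_t len);
};

/* Hypothetical stand-in for a receive buffer posted by the peer QP. */
struct lb_peer {
	unsigned char buf[64];
	size_t len;
};

/* Loopback flavour: no skb, no socket, just a bounded memcpy. */
static int lb_xmit(void *peer_ctx, const void *payload, size_t len)
{
	struct lb_peer *peer = peer_ctx;

	if (len > sizeof(peer->buf))
		return -1;	/* receive buffer too small */
	memcpy(peer->buf, payload, len);
	peer->len = len;
	return 0;
}

static const struct xport_ops lb_ops = {
	.name = "loopback",
	.xmit = lb_xmit,
};

/* Shared library send path: transport-agnostic, only calls the hook. */
static int lib_post_send(const struct xport_ops *ops, void *peer_ctx,
			 const void *payload, size_t len)
{
	assert(ops && ops->xmit);
	return ops->xmit(peer_ctx, payload, len);
}
```

In this sketch an skb-based or socket-based provider would supply its own xport_ops with a different xmit, while the QP and resource handling above the hook stays common.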



