Re: [EXPERIMENTAL v1 0/4] RDMA loopback device

On Thu, Feb 28, 2019 at 09:38:53PM +0200, Leon Romanovsky wrote:
> On Thu, Feb 28, 2019 at 02:06:53PM +0000, Parav Pandit wrote:
> >
> >
> > > -----Original Message-----
> > > From: Dennis Dalessandro <dennis.dalessandro@xxxxxxxxx>
> > > Sent: Thursday, February 28, 2019 6:39 AM
> > > To: Parav Pandit <parav@xxxxxxxxxxxx>; Leon Romanovsky
> > > <leon@xxxxxxxxxx>
> > > Cc: bvanassche@xxxxxxx; linux-rdma@xxxxxxxxxxxxxxx
> > > Subject: Re: [EXPERIMENTAL v1 0/4] RDMA loopback device
> > >
> > > On 2/27/2019 2:49 PM, Parav Pandit wrote:
> > > >
> > > >
> > > >> -----Original Message-----
> > > >> From: Leon Romanovsky <leon@xxxxxxxxxx>
> > > >> Sent: Wednesday, February 27, 2019 1:56 AM
> > > >> To: Parav Pandit <parav@xxxxxxxxxxxx>
> > > >> Cc: bvanassche@xxxxxxx; linux-rdma@xxxxxxxxxxxxxxx
> > > >> Subject: Re: [EXPERIMENTAL v1 0/4] RDMA loopback device
> > > >>
> > > >> On Wed, Feb 27, 2019 at 12:27:13AM -0600, Parav Pandit wrote:
> > > >>> This patchset adds RDMA loopback driver.
> > > >>> Initially for RoCE which works on lo netdevice.
> > > >>>
> > > >>> It is tested with nvme fabrics over ext4, perftests, and rping.
> > > >>> It only supports RC and GSI QPs.
> > > >>> It supports only RoCEv2 GIDs which belong to the loopback lo netdevice.
> > > >>>
> > > >>> It is only posted for discussion [1].
> > > >>> It is not yet ready for RFC posting or merge.
> > > >>
> > > >> Which type of discussion do you expect?
> > > > Continuation of [1].
> > > >> And can you give brief explanation why wasn't enough to extend rxe/siw?
> > > >>
> > > > Adding lo netdev to rxe is certainly an option along with cma patch in this
> > > series.
> > > >
> > > > qp state machine is around spin locks..
> > > > pools don't use xarray that loopback uses and siw intends to use.
> > > >
> > > > Incidentally, 5.0.0.rc5 rxe crashes on registering memory. Didn't have
> > > inspiration to supply a patch.
> > >
> > > If rxe crashes we may want to fix it rather than creating a whole new driver.
> > >
> > > > However rxe as it stands today after several fixes from many is still not
> > > there.
> > > > It leaks consumer index to user space and not sure its effect of it. Jason did
> > > talk some of the security concern I don't recall.
> > > > A while back when I reviewed the code, saw things that might crash kernel.
> > >
> > > > Users complain of memory leaks, rnr retries dropping connections..
> > >
> > > If rxe is so broken, and there is no interest in fixing it, why do we still have
> > > it? Should we just excise it from the tree?
> > >
> > > > Giving low priority to most of them, I think desire to have loopback rdma
> > > device are below.
> > > > 1. rxe is not ready for adding IB link types and large code restructure to
> > > avoid skb processing in it. Pretty large rewrite to skip skbs.
> > > > 2. stability and reasonable performance 3. maintainability
> > >
> > > I don't see how this is more maintainable. We are adding a new driver, a
> > > new user space provider. So I don't see that as being a reason for adding
> > > this.
> > A new user space provider is less complex at cost of system calls.
> > However it reuses most kernel pieces present today. User space driver is just a wrapper to ibv_cmd().
> > I see this approach as starting on the right foot: it writes no new code but uses existing infra.
> > And all 3 drivers (rxe, siw, loopback) reuse common user space driver, reuse resource allocator, and plugin their transport callbacks.
> > Or siw should modify the rxe instead of creating those pieces now.
> >
> > >
> > > > But if you think rxe is solid, siw should refactor the rxe code and start
> > > using most pieces from there, split into library for roce and iw.
> > > > Once that layering is done, may be loopback can fit that as different L4 so
> > > that rxe uses skb, siw uses sockets, loopback uses memcpy.
> > >
> > > This is why rxe should have used rdmavt from the beginning and we would
> > > pretty much have such a library.
> > >
> > > > Loopback's helper.c is intended to share code with siw for table resources
> > > as xarray.
> > > > It also offers complete kernel level handling of data and control path
> > > commands and published perf numbers.
> > >
> > > We can debate back and forth whether this needed to be included in siw
> > > and rxe, or if it and the others should have used rdmavt. However, I think
> > > this is different enough of an approach that it does stand on its own and
> > > could in fact be a new driver.
> > >
> >
> > > The fact that rxe is broken and no one seems to want to fix it shouldn't be
> > > our reason though.
> > Same reasoning applies to siw. It should refactor out the code such that new L4 piece can be fit in there.
> > But we are not taking that direction, same reasoning applies to similar other driver too.
> 
> We didn't deeply review SIW yet, everything before was more coding style
> bikeshedding. If you think that SIW and RXE need to be changed, feel free
> to share your opinion more loudly.

I have not really looked at SIW yet either but it seems like there would be a
lot of similarities to rxe which would be nice to consolidate especially at the
higher layers.  To be fair rdmavt had a lot of special things because of the
way the hfi1/qib hardware put packets on the wire _not_ using nor wanting
something like an skb for example.

My gut says that SIW and rxe are going to be similar, and different from
hfi1/qib (rdmavt) so I'm not sure trying to combine them will be worth the
effort.

As to this "loopback" device I'm skeptical.  SIW and rxe have use cases to
allow for interoperability/testing.

What is the real use case for this?

Ira

> 
> >
> > The main reason to not refactor rxe is that it's a major rewrite to support IB link without the skb layer. The same refactor is needed for siw to reuse rxe.
> > And due to that I forked a new driver, but whose user space can be useable across siw and loopback, and also resource table code.
> >
> > >




