Re: [PATCH v5 00/13] SIW: Request for Comments

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Mar 12, 2019 at 10:39:04AM -0400, Dennis Dalessandro wrote:
> On 3/10/2019 7:44 PM, Parav Pandit wrote:
> > Hi Chuck,
> >
> > > -----Original Message-----
> > > From: Chuck Lever <chuck.lever@xxxxxxxxxx>
> > > Sent: Sunday, March 10, 2019 5:30 PM
> > > To: Parav Pandit <parav@xxxxxxxxxxxx>
> > > Cc: Bernard Metzler <BMT@xxxxxxxxxxxxxx>; Steve Wise
> > > <larrystevenwise@xxxxxxxxx>; linux-rdma@xxxxxxxxxxxxxxx; Leon
> > > Romanovsky <leonro@xxxxxxxxxxxx>; Yuval Shaia
> > > <yuval.shaia@xxxxxxxxxx>
> > > Subject: Re: [PATCH v5 00/13] SIW: Request for Comments
> > >
> > > Hey Parav-
> > >
> > > > On Mar 10, 2019, at 6:01 PM, Parav Pandit <parav@xxxxxxxxxxxx> wrote:
> > > >
> > > > Hi Bernard,
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: Bernard Metzler <BMT@xxxxxxxxxxxxxx>
> > > > > Sent: Sunday, March 10, 2019 4:41 PM
> > > > > To: Parav Pandit <parav@xxxxxxxxxxxx>
> > > > > Cc: Steve Wise <larrystevenwise@xxxxxxxxx>;
> > > > > linux-rdma@xxxxxxxxxxxxxxx; Leon Romanovsky <leonro@xxxxxxxxxxxx>;
> > > > > Yuval Shaia <yuval.shaia@xxxxxxxxxx>
> > > > > Subject: RE: [PATCH v5 00/13] SIW: Request for Comments
> > > > >
> > > > > Hi Parav,
> > > > >
> > > > > I understand your point of having some common driver infrastructure
> > > > > between rxe and siw. I remember I was even proposing rxe folks
> > > > > looking into that when rxe emerged after siw was already prototyped
> > > > > and open sourced (yes, siw was first out there). There was not much
> > > > > interest from rxe folks at that point in time, which is OK and
> > > > > understood (always cool to write something new and better from scratch,
> > > who wants iWarp, etc.).
> > > > >
> > > > I didn't follow that part of the history, but it clearly proves the point that it
> > > was started incorrectly.
> > > > With all the fancy code for doorbells, with mmap() call for QP, CQ, ground
> > > reality is, it doesn't even pass a developer's silly ib_write_bw today.
> > > > And at that test shows 2Gbps bw on 64Gb RAM with 48 core cpus!
> > > >
> > > > > At the current stage of the project, it is rather contra-productive
> > > > > to completely refactor siw to match with rxe, since I do not have the
> > > > > resources to rip apart the whole thing, and I am not yet convinced it
> > > makes sense.
> > > > >
> > > > But that seems to be right approach to have modular code and reuse
> > > existing pieces.
> > > > Leon, Yuval and others resonate that too.
> > >
> > > I agree that code sharing can be valuable. However...
> > >
> > >
> > > > If existing pieces are inferior or mis fit, their technical details has to be
> > > called out.
> > > > rdmavt improved and used library model to share code between two
> > > drivers.
> > > >
> > > > > If one asks for merging these days, one asks at first for a delay in
> > > > > getting siw accepted.
> > > > >
> > > > Yuval and Leon wants to improve rxe in general and have desire to reuse
> > > rxe.
> > > > They might be able to help you. Please check with them.
> > > >
> > > > > Over time, it might indeed be a good thing to merge parts.
> > > > It should happen from beginning.
> > >
> > > I disagree that merging siw should wait for code de-duplication with rxe.
> > >
> > > The current siw is coming close to being merge-ready. I would like to see it in
> > > the kernel as soon as possible. RoCE, hard or soft, requires DCE in order to
> > > be fully useful, and thus it will never be possible to have a software RoCE
> > > driver that can operate via any Ethernet driver on any network. That is not
> > > the case with iWARP, which relies on TCP's built-in congestion avoidance,
> > > which is truly routable and available everywhere.
> >
> > Soft roce driver faces soft cpu lockup in single system at 2Gbps bandwidth with just two QPs (send and receive).
> > :-)
> > Going across a system is brave world.
> >
> > But Leon and Yuval likes to see it enhanced/used in general instead of new driver.
> > And therefore, it extends to siw efforts too.
> >
> > >
> > > The ULPs need to have a robust generic in-tree software driver as soon as we
> > > can get it to enable development, testing, and deployment in environments
> > > that have no physical RDMA device. siw is the fastest way to get that.
> > >
> > >
> > I totally agree to your point.
> > In case if you didn't notice, a loopback driver [1] that outperforms soft_roce by 25x in performance and it is more stable. :-)
> >
> > > > > As said, I'd still like to investigate that.
> > > > That will be very helpful.
> > > >
> > > > > But let the sometimes intentionally different concepts of these two
> > > > > drivers co-exist for some time out there, and let's draw conclusions
> > > > > of what to make common later.
> > > > >
> > > > It won't happen later once it is in tree. :-)
> > >
> > > There is nothing stopping code de-duplication once siw is in the kernel.
> > >
> > > The most obvious counter-example of your claim is the in-kernel ULPs which
> > > have adopted new core APIs such as ib_alloc_cq long after the ULPs were
> > > accepted and merged.
> > >
> > It happened because of active maintainers like you, Sagi and many others.
> >
> > >
> > > > > And let's be clear, lots of things between RoCE and iWarp are very
> > > different.
> > > > All rest of the core components such as ib_core, addr, rdmacm are able to
> > > handle it two different transports.
> > > > Cavium or qedr driver is dual protocol driver that support RoCE and iWarp
> > > both in single driver.
> > > >
> > > > > Not only wire protocol and lower interfaces, but also semantically.
> > > > > These things will stay different by specification, and an efficient
> > > > > implementation of those things will likely still go it's own ways.
> > > > >
> > > > We would like to understand technically why rxe is mis fit for it, and why it
> > > cannot be improved.
> > > > Lack of resources is probably not good reason.
> > >
> > > Actually I find lack of resources to be a compelling argument. Before siw is
> > > merged, Bernard and his team are the only ones who can do this work, and
> > > they are not yet familiar with the rxe code base.
> > >
> > I want Leon to explicitly ACK this point and want to listen to him what he has to say.
> > Leon?
> > Is this compelling reason for siw?
> > And therefore for loopback driver too?
> >
> > > After merging, anyone in the Linux RDMA community can help with it,
> > > including you, the person who is asking for this extra effort.
> > >
> > I asked because Leon is asking that for another simplified driver proposal.
> > Most ULPs maintainers and users would agree that rxe driver is not able to satisfy their needs
> > (stability, performance).
> > And your feedback is obviously important here.
> >
> > > It is reasonable to request that common code be refactored. It is neither fair
> > > nor productive to gate merging siw on that work.
> >
> > I do not intent to gate merging siw work by any means.
> > I want community to reach consensus on - why should we merge a new driver even though technically its feasible to share a code with existing in-tree driver.
> >
> > And therefore, that same exact reason won't be limited to just siw driver.
> >
> > > Why force Bernard to bear the burden of fixing rxe?
> >
> > As per Leon, rxe seems stable enough that doesn't need fixing.
> > I infer that from his suggestion to enhance rxe instead of a new loopback driver for roce and IB.
> > So Bernard won't need to fix it based on Leon's and Yuval's comments.
> >
> > And if rxe is not good enough, it speaks for itself.
> > Leon? Yuval?
> >
> > If Leon and Yuval didn't ask to enhance soft_roce driver, my proposal is:
> >
> > 1. siw to provide a transport agnostic user verbs like [2]
> > And improve performance incrementally with core extensions.
> > This will be similar to Jen's io_uring work to provide vendor independent io cmd and cmpl queues for block ios.
> > Currently rxe and siw creates their own rings etc.
> >
> > 2. loopback driver for RoCE and IB to reuse [2].
> > 3. soft_roce maintainer (?) to decide if they want to reuse code from siw or loopback
> >
> > [1] https://patchwork.kernel.org/patch/10831261/
> > [2] https://patchwork.kernel.org/patch/10831251/
> >
> > >
> > > Let's get siw merged, and then _everyone_ can have a say where it is
> > > valuable to share code.
> > >
> > Let's apply that to other driver proposed too?
>
> My 2 cents...
>
> If things compile and move, why not allow it into the tree.

Because upstream != trash can. We want to see involvement, commitment
and in some extent development roadmap.

Thanks

>
>
>

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux