On Tue, Mar 12, 2019 at 10:39:04AM -0400, Dennis Dalessandro wrote: > On 3/10/2019 7:44 PM, Parav Pandit wrote: > > Hi Chuck, > > > > > -----Original Message----- > > > From: Chuck Lever <chuck.lever@xxxxxxxxxx> > > > Sent: Sunday, March 10, 2019 5:30 PM > > > To: Parav Pandit <parav@xxxxxxxxxxxx> > > > Cc: Bernard Metzler <BMT@xxxxxxxxxxxxxx>; Steve Wise > > > <larrystevenwise@xxxxxxxxx>; linux-rdma@xxxxxxxxxxxxxxx; Leon > > > Romanovsky <leonro@xxxxxxxxxxxx>; Yuval Shaia > > > <yuval.shaia@xxxxxxxxxx> > > > Subject: Re: [PATCH v5 00/13] SIW: Request for Comments > > > > > > Hey Parav- > > > > > > > On Mar 10, 2019, at 6:01 PM, Parav Pandit <parav@xxxxxxxxxxxx> wrote: > > > > > > > > Hi Bernard, > > > > > > > > > > > > > -----Original Message----- > > > > > From: Bernard Metzler <BMT@xxxxxxxxxxxxxx> > > > > > Sent: Sunday, March 10, 2019 4:41 PM > > > > > To: Parav Pandit <parav@xxxxxxxxxxxx> > > > > > Cc: Steve Wise <larrystevenwise@xxxxxxxxx>; > > > > > linux-rdma@xxxxxxxxxxxxxxx; Leon Romanovsky <leonro@xxxxxxxxxxxx>; > > > > > Yuval Shaia <yuval.shaia@xxxxxxxxxx> > > > > > Subject: RE: [PATCH v5 00/13] SIW: Request for Comments > > > > > > > > > > Hi Parav, > > > > > > > > > > I understand your point of having some common driver infrastructure > > > > > between rxe and siw. I remember I was even proposing rxe folks > > > > > looking into that when rxe emerged after siw was already prototyped > > > > > and open sourced (yes, siw was first out there). There was not much > > > > > interest from rxe folks at that point in time, which is OK and > > > > > understood (always cool to write something new and better from scratch, > > > who wants iWarp, etc.). > > > > > > > > > I didn't follow that part of the history, but it clearly proves the point that it > > > was started incorrectly. > > > > With all the fancy code for doorbells, with mmap() call for QP, CQ, ground > > > reality is, it doesn't even pass a developer's silly ib_write_bw today. > > > > And at that test shows 2Gbps bw on 64Gb RAM with 48 core cpus! > > > > > > > > > At the current stage of the project, it is rather contra-productive > > > > > to completely refactor siw to match with rxe, since I do not have the > > > > > resources to rip apart the whole thing, and I am not yet convinced it > > > makes sense. > > > > > > > > > But that seems to be right approach to have modular code and reuse > > > existing pieces. > > > > Leon, Yuval and others resonate that too. > > > > > > I agree that code sharing can be valuable. However... > > > > > > > > > > If existing pieces are inferior or mis fit, their technical details has to be > > > called out. > > > > rdmavt improved and used library model to share code between two > > > drivers. > > > > > > > > > If one asks for merging these days, one asks at first for a delay in > > > > > getting siw accepted. > > > > > > > > > Yuval and Leon wants to improve rxe in general and have desire to reuse > > > rxe. > > > > They might be able to help you. Please check with them. > > > > > > > > > Over time, it might indeed be a good thing to merge parts. > > > > It should happen from beginning. > > > > > > I disagree that merging siw should wait for code de-duplication with rxe. > > > > > > The current siw is coming close to being merge-ready. I would like to see it in > > > the kernel as soon as possible. RoCE, hard or soft, requires DCE in order to > > > be fully useful, and thus it will never be possible to have a software RoCE > > > driver that can operate via any Ethernet driver on any network. That is not > > > the case with iWARP, which relies on TCP's built-in congestion avoidance, > > > which is truly routable and available everywhere. > > > > Soft roce driver faces soft cpu lockup in single system at 2Gbps bandwidth with just two QPs (send and receive). > > :-) > > Going across a system is brave world. > > > > But Leon and Yuval likes to see it enhanced/used in general instead of new driver. > > And therefore, it extends to siw efforts too. > > > > > > > > The ULPs need to have a robust generic in-tree software driver as soon as we > > > can get it to enable development, testing, and deployment in environments > > > that have no physical RDMA device. siw is the fastest way to get that. > > > > > > > > I totally agree to your point. > > In case if you didn't notice, a loopback driver [1] that outperforms soft_roce by 25x in performance and it is more stable. :-) > > > > > > > As said, I'd still like to investigate that. > > > > That will be very helpful. > > > > > > > > > But let the sometimes intentionally different concepts of these two > > > > > drivers co-exist for some time out there, and let's draw conclusions > > > > > of what to make common later. > > > > > > > > > It won't happen later once it is in tree. :-) > > > > > > There is nothing stopping code de-duplication once siw is in the kernel. > > > > > > The most obvious counter-example of your claim is the in-kernel ULPs which > > > have adopted new core APIs such as ib_alloc_cq long after the ULPs were > > > accepted and merged. > > > > > It happened because of active maintainers like you, Sagi and many others. > > > > > > > > > > And let's be clear, lots of things between RoCE and iWarp are very > > > different. > > > > All rest of the core components such as ib_core, addr, rdmacm are able to > > > handle it two different transports. > > > > Cavium or qedr driver is dual protocol driver that support RoCE and iWarp > > > both in single driver. > > > > > > > > > Not only wire protocol and lower interfaces, but also semantically. > > > > > These things will stay different by specification, and an efficient > > > > > implementation of those things will likely still go it's own ways. > > > > > > > > > We would like to understand technically why rxe is mis fit for it, and why it > > > cannot be improved. > > > > Lack of resources is probably not good reason. > > > > > > Actually I find lack of resources to be a compelling argument. Before siw is > > > merged, Bernard and his team are the only ones who can do this work, and > > > they are not yet familiar with the rxe code base. > > > > > I want Leon to explicitly ACK this point and want to listen to him what he has to say. > > Leon? > > Is this compelling reason for siw? > > And therefore for loopback driver too? > > > > > After merging, anyone in the Linux RDMA community can help with it, > > > including you, the person who is asking for this extra effort. > > > > > I asked because Leon is asking that for another simplified driver proposal. > > Most ULPs maintainers and users would agree that rxe driver is not able to satisfy their needs > > (stability, performance). > > And your feedback is obviously important here. > > > > > It is reasonable to request that common code be refactored. It is neither fair > > > nor productive to gate merging siw on that work. > > > > I do not intent to gate merging siw work by any means. > > I want community to reach consensus on - why should we merge a new driver even though technically its feasible to share a code with existing in-tree driver. > > > > And therefore, that same exact reason won't be limited to just siw driver. > > > > > Why force Bernard to bear the burden of fixing rxe? > > > > As per Leon, rxe seems stable enough that doesn't need fixing. > > I infer that from his suggestion to enhance rxe instead of a new loopback driver for roce and IB. > > So Bernard won't need to fix it based on Leon's and Yuval's comments. > > > > And if rxe is not good enough, it speaks for itself. > > Leon? Yuval? > > > > If Leon and Yuval didn't ask to enhance soft_roce driver, my proposal is: > > > > 1. siw to provide a transport agnostic user verbs like [2] > > And improve performance incrementally with core extensions. > > This will be similar to Jen's io_uring work to provide vendor independent io cmd and cmpl queues for block ios. > > Currently rxe and siw creates their own rings etc. > > > > 2. loopback driver for RoCE and IB to reuse [2]. > > 3. soft_roce maintainer (?) to decide if they want to reuse code from siw or loopback > > > > [1] https://patchwork.kernel.org/patch/10831261/ > > [2] https://patchwork.kernel.org/patch/10831251/ > > > > > > > > Let's get siw merged, and then _everyone_ can have a say where it is > > > valuable to share code. > > > > > Let's apply that to other driver proposed too? > > My 2 cents... > > If things compile and move, why not allow it into the tree. Because upstream != trash can. We want to see involvement, commitment and in some extent development roadmap. Thanks > > >
Attachment:
signature.asc
Description: PGP signature