-----"Dennis Dalessandro" <dennis.dalessandro@xxxxxxxxx> wrote: ----- >To: "Leon Romanovsky" <leonro@xxxxxxxxxxxx> >From: "Dennis Dalessandro" <dennis.dalessandro@xxxxxxxxx> >Date: 03/13/2019 04:31PM >Cc: "Parav Pandit" <parav@xxxxxxxxxxxx>, "Chuck Lever" ><chuck.lever@xxxxxxxxxx>, "Bernard Metzler" <BMT@xxxxxxxxxxxxxx>, >"Steve Wise" <larrystevenwise@xxxxxxxxx>, >"linux-rdma@xxxxxxxxxxxxxxx" <linux-rdma@xxxxxxxxxxxxxxx>, "Yuval >Shaia" <yuval.shaia@xxxxxxxxxx> >Subject: Re: [PATCH v5 00/13] SIW: Request for Comments > >On 3/12/2019 12:42 PM, Leon Romanovsky wrote: >> On Tue, Mar 12, 2019 at 10:39:04AM -0400, Dennis Dalessandro wrote: >>> On 3/10/2019 7:44 PM, Parav Pandit wrote: >>>> Hi Chuck, >>>> >>>>> -----Original Message----- >>>>> From: Chuck Lever <chuck.lever@xxxxxxxxxx> >>>>> Sent: Sunday, March 10, 2019 5:30 PM >>>>> To: Parav Pandit <parav@xxxxxxxxxxxx> >>>>> Cc: Bernard Metzler <BMT@xxxxxxxxxxxxxx>; Steve Wise >>>>> <larrystevenwise@xxxxxxxxx>; linux-rdma@xxxxxxxxxxxxxxx; Leon >>>>> Romanovsky <leonro@xxxxxxxxxxxx>; Yuval Shaia >>>>> <yuval.shaia@xxxxxxxxxx> >>>>> Subject: Re: [PATCH v5 00/13] SIW: Request for Comments >>>>> >>>>> Hey Parav- >>>>> >>>>>> On Mar 10, 2019, at 6:01 PM, Parav Pandit <parav@xxxxxxxxxxxx> >wrote: >>>>>> >>>>>> Hi Bernard, >>>>>> >>>>>> >>>>>>> -----Original Message----- >>>>>>> From: Bernard Metzler <BMT@xxxxxxxxxxxxxx> >>>>>>> Sent: Sunday, March 10, 2019 4:41 PM >>>>>>> To: Parav Pandit <parav@xxxxxxxxxxxx> >>>>>>> Cc: Steve Wise <larrystevenwise@xxxxxxxxx>; >>>>>>> linux-rdma@xxxxxxxxxxxxxxx; Leon Romanovsky ><leonro@xxxxxxxxxxxx>; >>>>>>> Yuval Shaia <yuval.shaia@xxxxxxxxxx> >>>>>>> Subject: RE: [PATCH v5 00/13] SIW: Request for Comments >>>>>>> >>>>>>> Hi Parav, >>>>>>> >>>>>>> I understand your point of having some common driver >infrastructure >>>>>>> between rxe and siw. I remember I was even proposing rxe folks >>>>>>> looking into that when rxe emerged after siw was already >prototyped >>>>>>> and open sourced (yes, siw was first out there). There was not >much >>>>>>> interest from rxe folks at that point in time, which is OK and >>>>>>> understood (always cool to write something new and better from >scratch, >>>>> who wants iWarp, etc.). >>>>>>> >>>>>> I didn't follow that part of the history, but it clearly proves >the point that it >>>>> was started incorrectly. >>>>>> With all the fancy code for doorbells, with mmap() call for QP, >CQ, ground >>>>> reality is, it doesn't even pass a developer's silly ib_write_bw >today. >>>>>> And at that test shows 2Gbps bw on 64Gb RAM with 48 core cpus! >>>>>> >>>>>>> At the current stage of the project, it is rather >contra-productive >>>>>>> to completely refactor siw to match with rxe, since I do not >have the >>>>>>> resources to rip apart the whole thing, and I am not yet >convinced it >>>>> makes sense. >>>>>>> >>>>>> But that seems to be right approach to have modular code and >reuse >>>>> existing pieces. >>>>>> Leon, Yuval and others resonate that too. >>>>> >>>>> I agree that code sharing can be valuable. However... >>>>> >>>>> >>>>>> If existing pieces are inferior or mis fit, their technical >details has to be >>>>> called out. >>>>>> rdmavt improved and used library model to share code between >two >>>>> drivers. >>>>>> >>>>>>> If one asks for merging these days, one asks at first for a >delay in >>>>>>> getting siw accepted. >>>>>>> >>>>>> Yuval and Leon wants to improve rxe in general and have desire >to reuse >>>>> rxe. >>>>>> They might be able to help you. Please check with them. >>>>>> >>>>>>> Over time, it might indeed be a good thing to merge parts. >>>>>> It should happen from beginning. >>>>> >>>>> I disagree that merging siw should wait for code de-duplication >with rxe. >>>>> >>>>> The current siw is coming close to being merge-ready. I would >like to see it in >>>>> the kernel as soon as possible. RoCE, hard or soft, requires DCE >in order to >>>>> be fully useful, and thus it will never be possible to have a >software RoCE >>>>> driver that can operate via any Ethernet driver on any network. >That is not >>>>> the case with iWARP, which relies on TCP's built-in congestion >avoidance, >>>>> which is truly routable and available everywhere. >>>> >>>> Soft roce driver faces soft cpu lockup in single system at 2Gbps >bandwidth with just two QPs (send and receive). >>>> :-) >>>> Going across a system is brave world. >>>> >>>> But Leon and Yuval likes to see it enhanced/used in general >instead of new driver. >>>> And therefore, it extends to siw efforts too. >>>> >>>>> >>>>> The ULPs need to have a robust generic in-tree software driver >as soon as we >>>>> can get it to enable development, testing, and deployment in >environments >>>>> that have no physical RDMA device. siw is the fastest way to get >that. >>>>> >>>>> >>>> I totally agree to your point. >>>> In case if you didn't notice, a loopback driver [1] that >outperforms soft_roce by 25x in performance and it is more stable. >:-) >>>> >>>>>>> As said, I'd still like to investigate that. >>>>>> That will be very helpful. >>>>>> >>>>>>> But let the sometimes intentionally different concepts of >these two >>>>>>> drivers co-exist for some time out there, and let's draw >conclusions >>>>>>> of what to make common later. >>>>>>> >>>>>> It won't happen later once it is in tree. :-) >>>>> >>>>> There is nothing stopping code de-duplication once siw is in the >kernel. >>>>> >>>>> The most obvious counter-example of your claim is the in-kernel >ULPs which >>>>> have adopted new core APIs such as ib_alloc_cq long after the >ULPs were >>>>> accepted and merged. >>>>> >>>> It happened because of active maintainers like you, Sagi and many >others. >>>> >>>>> >>>>>>> And let's be clear, lots of things between RoCE and iWarp are >very >>>>> different. >>>>>> All rest of the core components such as ib_core, addr, rdmacm >are able to >>>>> handle it two different transports. >>>>>> Cavium or qedr driver is dual protocol driver that support RoCE >and iWarp >>>>> both in single driver. >>>>>> >>>>>>> Not only wire protocol and lower interfaces, but also >semantically. >>>>>>> These things will stay different by specification, and an >efficient >>>>>>> implementation of those things will likely still go it's own >ways. >>>>>>> >>>>>> We would like to understand technically why rxe is mis fit for >it, and why it >>>>> cannot be improved. >>>>>> Lack of resources is probably not good reason. >>>>> >>>>> Actually I find lack of resources to be a compelling argument. >Before siw is >>>>> merged, Bernard and his team are the only ones who can do this >work, and >>>>> they are not yet familiar with the rxe code base. >>>>> >>>> I want Leon to explicitly ACK this point and want to listen to >him what he has to say. >>>> Leon? >>>> Is this compelling reason for siw? >>>> And therefore for loopback driver too? >>>> >>>>> After merging, anyone in the Linux RDMA community can help with >it, >>>>> including you, the person who is asking for this extra effort. >>>>> >>>> I asked because Leon is asking that for another simplified driver >proposal. >>>> Most ULPs maintainers and users would agree that rxe driver is >not able to satisfy their needs >>>> (stability, performance). >>>> And your feedback is obviously important here. >>>> >>>>> It is reasonable to request that common code be refactored. It >is neither fair >>>>> nor productive to gate merging siw on that work. >>>> >>>> I do not intent to gate merging siw work by any means. >>>> I want community to reach consensus on - why should we merge a >new driver even though technically its feasible to share a code with >existing in-tree driver. >>>> >>>> And therefore, that same exact reason won't be limited to just >siw driver. >>>> >>>>> Why force Bernard to bear the burden of fixing rxe? >>>> >>>> As per Leon, rxe seems stable enough that doesn't need fixing. >>>> I infer that from his suggestion to enhance rxe instead of a new >loopback driver for roce and IB. >>>> So Bernard won't need to fix it based on Leon's and Yuval's >comments. >>>> >>>> And if rxe is not good enough, it speaks for itself. >>>> Leon? Yuval? >>>> >>>> If Leon and Yuval didn't ask to enhance soft_roce driver, my >proposal is: >>>> >>>> 1. siw to provide a transport agnostic user verbs like [2] >>>> And improve performance incrementally with core extensions. >>>> This will be similar to Jen's io_uring work to provide vendor >independent io cmd and cmpl queues for block ios. >>>> Currently rxe and siw creates their own rings etc. >>>> >>>> 2. loopback driver for RoCE and IB to reuse [2]. >>>> 3. soft_roce maintainer (?) to decide if they want to reuse code >from siw or loopback >>>> >>>> [1] >https://urldefense.proofpoint.com/v2/url?u=https-3A__patchwork.kernel >.org_patch_10831261_&d=DwIC-g&c=jf_iaSHvJObTbx-siA1ZOg&r=2TaYXQ0T-r8Z >O1PP1alNwU_QJcRRLfmYTAgd3QCvqSc&m=_08EePmMUq466MpLUdKBkqpiAQwZO6-v3vQ >V8lJGy8Q&s=PNoLlcMS2y11ULloi6Tk8J1q2ccevH1tuvXbZ_gA16Y&e= >>>> [2] >https://urldefense.proofpoint.com/v2/url?u=https-3A__patchwork.kernel >.org_patch_10831251_&d=DwIC-g&c=jf_iaSHvJObTbx-siA1ZOg&r=2TaYXQ0T-r8Z >O1PP1alNwU_QJcRRLfmYTAgd3QCvqSc&m=_08EePmMUq466MpLUdKBkqpiAQwZO6-v3vQ >V8lJGy8Q&s=5q96F0D8jXv81QEhEfvz90wh6CxdS6TFbqOSz8YzPnw&e= >>>> >>>>> >>>>> Let's get siw merged, and then _everyone_ can have a say where >it is >>>>> valuable to share code. >>>>> >>>> Let's apply that to other driver proposed too? >>> >>> My 2 cents... >>> >>> If things compile and move, why not allow it into the tree. >> >> Because upstream != trash can. We want to see involvement, >commitment >> and in some extent development roadmap. > >First off I never suggested that, not in the slightest. Secondly it's > >not fair to label either siw or loopback drivers as trash. They both >are >cleanly (arguably) done, they both function. > >Can they run NVMEoF? If so then according to Jason [1] that is >meeting >the unambiguous low bar. To me that means they are supporting enough >to >be considered a serious driver and brought into the sub system for >further development. > I speak for myself and for siw development only here. I do this for years as a side project. We used that driver a lot in the past for RDMA application development, it is used in classes at ETH Zurich university, there are even some companies out there using it for their business. I am committed to further development and maintenance of software based RDMA. I know sometimes it takes to long before I response/fix. Like these days where I am completely consumed by my daily business. It is not really laziness, but constant lack of time to pet my pet project... One reason I am very much interested in getting it upstream is, to better align it with a moving target: the RDMA mid-layer. I spent lots of time with fixing the stand alone siw open source project for all that changes over the years. I want to be better integrated with the RDMA development community. Yes, and I hope for a code quality improvement as well. Based on community feedback, patches. I think this humble piece of code is not to bad to get improved by folks who want the functionality it provides. Thanks Bernard.