-----"Leon Romanovsky" <leonro@xxxxxxxxxxxx> wrote: ----- >To: "Bernard Metzler" <BMT@xxxxxxxxxxxxxx> >From: "Leon Romanovsky" <leonro@xxxxxxxxxxxx> >Date: 03/13/2019 05:25PM >Cc: "Dennis Dalessandro" <dennis.dalessandro@xxxxxxxxx>, "Parav >Pandit" <parav@xxxxxxxxxxxx>, "Chuck Lever" <chuck.lever@xxxxxxxxxx>, >"Steve Wise" <larrystevenwise@xxxxxxxxx>, >"linux-rdma@xxxxxxxxxxxxxxx" <linux-rdma@xxxxxxxxxxxxxxx>, "Yuval >Shaia" <yuval.shaia@xxxxxxxxxx> >Subject: Re: [PATCH v5 00/13] SIW: Request for Comments > >On Wed, Mar 13, 2019 at 04:17:22PM +0000, Bernard Metzler wrote: >> -----"Dennis Dalessandro" <dennis.dalessandro@xxxxxxxxx> wrote: >----- >> >> >To: "Leon Romanovsky" <leonro@xxxxxxxxxxxx> >> >From: "Dennis Dalessandro" <dennis.dalessandro@xxxxxxxxx> >> >Date: 03/13/2019 04:31PM >> >Cc: "Parav Pandit" <parav@xxxxxxxxxxxx>, "Chuck Lever" >> ><chuck.lever@xxxxxxxxxx>, "Bernard Metzler" <BMT@xxxxxxxxxxxxxx>, >> >"Steve Wise" <larrystevenwise@xxxxxxxxx>, >> >"linux-rdma@xxxxxxxxxxxxxxx" <linux-rdma@xxxxxxxxxxxxxxx>, "Yuval >> >Shaia" <yuval.shaia@xxxxxxxxxx> >> >Subject: Re: [PATCH v5 00/13] SIW: Request for Comments >> > >> >On 3/12/2019 12:42 PM, Leon Romanovsky wrote: >> >> On Tue, Mar 12, 2019 at 10:39:04AM -0400, Dennis Dalessandro >wrote: >> >>> On 3/10/2019 7:44 PM, Parav Pandit wrote: >> >>>> Hi Chuck, >> >>>> >> >>>>> -----Original Message----- >> >>>>> From: Chuck Lever <chuck.lever@xxxxxxxxxx> >> >>>>> Sent: Sunday, March 10, 2019 5:30 PM >> >>>>> To: Parav Pandit <parav@xxxxxxxxxxxx> >> >>>>> Cc: Bernard Metzler <BMT@xxxxxxxxxxxxxx>; Steve Wise >> >>>>> <larrystevenwise@xxxxxxxxx>; linux-rdma@xxxxxxxxxxxxxxx; Leon >> >>>>> Romanovsky <leonro@xxxxxxxxxxxx>; Yuval Shaia >> >>>>> <yuval.shaia@xxxxxxxxxx> >> >>>>> Subject: Re: [PATCH v5 00/13] SIW: Request for Comments >> >>>>> >> >>>>> Hey Parav- >> >>>>> >> >>>>>> On Mar 10, 2019, at 6:01 PM, Parav Pandit ><parav@xxxxxxxxxxxx> >> >wrote: >> >>>>>> >> >>>>>> Hi Bernard, >> >>>>>> >> >>>>>> >> >>>>>>> -----Original Message----- >> >>>>>>> From: Bernard Metzler <BMT@xxxxxxxxxxxxxx> >> >>>>>>> Sent: Sunday, March 10, 2019 4:41 PM >> >>>>>>> To: Parav Pandit <parav@xxxxxxxxxxxx> >> >>>>>>> Cc: Steve Wise <larrystevenwise@xxxxxxxxx>; >> >>>>>>> linux-rdma@xxxxxxxxxxxxxxx; Leon Romanovsky >> ><leonro@xxxxxxxxxxxx>; >> >>>>>>> Yuval Shaia <yuval.shaia@xxxxxxxxxx> >> >>>>>>> Subject: RE: [PATCH v5 00/13] SIW: Request for Comments >> >>>>>>> >> >>>>>>> Hi Parav, >> >>>>>>> >> >>>>>>> I understand your point of having some common driver >> >infrastructure >> >>>>>>> between rxe and siw. I remember I was even proposing rxe >folks >> >>>>>>> looking into that when rxe emerged after siw was already >> >prototyped >> >>>>>>> and open sourced (yes, siw was first out there). There was >not >> >much >> >>>>>>> interest from rxe folks at that point in time, which is OK >and >> >>>>>>> understood (always cool to write something new and better >from >> >scratch, >> >>>>> who wants iWarp, etc.). >> >>>>>>> >> >>>>>> I didn't follow that part of the history, but it clearly >proves >> >the point that it >> >>>>> was started incorrectly. >> >>>>>> With all the fancy code for doorbells, with mmap() call for >QP, >> >CQ, ground >> >>>>> reality is, it doesn't even pass a developer's silly >ib_write_bw >> >today. >> >>>>>> And at that test shows 2Gbps bw on 64Gb RAM with 48 core >cpus! >> >>>>>> >> >>>>>>> At the current stage of the project, it is rather >> >contra-productive >> >>>>>>> to completely refactor siw to match with rxe, since I do >not >> >have the >> >>>>>>> resources to rip apart the whole thing, and I am not yet >> >convinced it >> >>>>> makes sense. >> >>>>>>> >> >>>>>> But that seems to be right approach to have modular code and >> >reuse >> >>>>> existing pieces. >> >>>>>> Leon, Yuval and others resonate that too. >> >>>>> >> >>>>> I agree that code sharing can be valuable. However... >> >>>>> >> >>>>> >> >>>>>> If existing pieces are inferior or mis fit, their technical >> >details has to be >> >>>>> called out. >> >>>>>> rdmavt improved and used library model to share code between >> >two >> >>>>> drivers. >> >>>>>> >> >>>>>>> If one asks for merging these days, one asks at first for a >> >delay in >> >>>>>>> getting siw accepted. >> >>>>>>> >> >>>>>> Yuval and Leon wants to improve rxe in general and have >desire >> >to reuse >> >>>>> rxe. >> >>>>>> They might be able to help you. Please check with them. >> >>>>>> >> >>>>>>> Over time, it might indeed be a good thing to merge parts. >> >>>>>> It should happen from beginning. >> >>>>> >> >>>>> I disagree that merging siw should wait for code >de-duplication >> >with rxe. >> >>>>> >> >>>>> The current siw is coming close to being merge-ready. I would >> >like to see it in >> >>>>> the kernel as soon as possible. RoCE, hard or soft, requires >DCE >> >in order to >> >>>>> be fully useful, and thus it will never be possible to have a >> >software RoCE >> >>>>> driver that can operate via any Ethernet driver on any >network. >> >That is not >> >>>>> the case with iWARP, which relies on TCP's built-in >congestion >> >avoidance, >> >>>>> which is truly routable and available everywhere. >> >>>> >> >>>> Soft roce driver faces soft cpu lockup in single system at >2Gbps >> >bandwidth with just two QPs (send and receive). >> >>>> :-) >> >>>> Going across a system is brave world. >> >>>> >> >>>> But Leon and Yuval likes to see it enhanced/used in general >> >instead of new driver. >> >>>> And therefore, it extends to siw efforts too. >> >>>> >> >>>>> >> >>>>> The ULPs need to have a robust generic in-tree software >driver >> >as soon as we >> >>>>> can get it to enable development, testing, and deployment in >> >environments >> >>>>> that have no physical RDMA device. siw is the fastest way to >get >> >that. >> >>>>> >> >>>>> >> >>>> I totally agree to your point. >> >>>> In case if you didn't notice, a loopback driver [1] that >> >outperforms soft_roce by 25x in performance and it is more stable. >> >:-) >> >>>> >> >>>>>>> As said, I'd still like to investigate that. >> >>>>>> That will be very helpful. >> >>>>>> >> >>>>>>> But let the sometimes intentionally different concepts of >> >these two >> >>>>>>> drivers co-exist for some time out there, and let's draw >> >conclusions >> >>>>>>> of what to make common later. >> >>>>>>> >> >>>>>> It won't happen later once it is in tree. :-) >> >>>>> >> >>>>> There is nothing stopping code de-duplication once siw is in >the >> >kernel. >> >>>>> >> >>>>> The most obvious counter-example of your claim is the >in-kernel >> >ULPs which >> >>>>> have adopted new core APIs such as ib_alloc_cq long after the >> >ULPs were >> >>>>> accepted and merged. >> >>>>> >> >>>> It happened because of active maintainers like you, Sagi and >many >> >others. >> >>>> >> >>>>> >> >>>>>>> And let's be clear, lots of things between RoCE and iWarp >are >> >very >> >>>>> different. >> >>>>>> All rest of the core components such as ib_core, addr, >rdmacm >> >are able to >> >>>>> handle it two different transports. >> >>>>>> Cavium or qedr driver is dual protocol driver that support >RoCE >> >and iWarp >> >>>>> both in single driver. >> >>>>>> >> >>>>>>> Not only wire protocol and lower interfaces, but also >> >semantically. >> >>>>>>> These things will stay different by specification, and an >> >efficient >> >>>>>>> implementation of those things will likely still go it's >own >> >ways. >> >>>>>>> >> >>>>>> We would like to understand technically why rxe is mis fit >for >> >it, and why it >> >>>>> cannot be improved. >> >>>>>> Lack of resources is probably not good reason. >> >>>>> >> >>>>> Actually I find lack of resources to be a compelling >argument. >> >Before siw is >> >>>>> merged, Bernard and his team are the only ones who can do >this >> >work, and >> >>>>> they are not yet familiar with the rxe code base. >> >>>>> >> >>>> I want Leon to explicitly ACK this point and want to listen to >> >him what he has to say. >> >>>> Leon? >> >>>> Is this compelling reason for siw? >> >>>> And therefore for loopback driver too? >> >>>> >> >>>>> After merging, anyone in the Linux RDMA community can help >with >> >it, >> >>>>> including you, the person who is asking for this extra >effort. >> >>>>> >> >>>> I asked because Leon is asking that for another simplified >driver >> >proposal. >> >>>> Most ULPs maintainers and users would agree that rxe driver is >> >not able to satisfy their needs >> >>>> (stability, performance). >> >>>> And your feedback is obviously important here. >> >>>> >> >>>>> It is reasonable to request that common code be refactored. >It >> >is neither fair >> >>>>> nor productive to gate merging siw on that work. >> >>>> >> >>>> I do not intent to gate merging siw work by any means. >> >>>> I want community to reach consensus on - why should we merge a >> >new driver even though technically its feasible to share a code >with >> >existing in-tree driver. >> >>>> >> >>>> And therefore, that same exact reason won't be limited to just >> >siw driver. >> >>>> >> >>>>> Why force Bernard to bear the burden of fixing rxe? >> >>>> >> >>>> As per Leon, rxe seems stable enough that doesn't need fixing. >> >>>> I infer that from his suggestion to enhance rxe instead of a >new >> >loopback driver for roce and IB. >> >>>> So Bernard won't need to fix it based on Leon's and Yuval's >> >comments. >> >>>> >> >>>> And if rxe is not good enough, it speaks for itself. >> >>>> Leon? Yuval? >> >>>> >> >>>> If Leon and Yuval didn't ask to enhance soft_roce driver, my >> >proposal is: >> >>>> >> >>>> 1. siw to provide a transport agnostic user verbs like [2] >> >>>> And improve performance incrementally with core extensions. >> >>>> This will be similar to Jen's io_uring work to provide vendor >> >independent io cmd and cmpl queues for block ios. >> >>>> Currently rxe and siw creates their own rings etc. >> >>>> >> >>>> 2. loopback driver for RoCE and IB to reuse [2]. >> >>>> 3. soft_roce maintainer (?) to decide if they want to reuse >code >> >from siw or loopback >> >>>> >> >>>> [1] >> >>https://urldefense.proofpoint.com/v2/url?u=https-3A__patchwork.kerne >l >> >>.org_patch_10831261_&d=DwIC-g&c=jf_iaSHvJObTbx-siA1ZOg&r=2TaYXQ0T-r8 >Z >> >>O1PP1alNwU_QJcRRLfmYTAgd3QCvqSc&m=_08EePmMUq466MpLUdKBkqpiAQwZO6-v3v >Q >> >V8lJGy8Q&s=PNoLlcMS2y11ULloi6Tk8J1q2ccevH1tuvXbZ_gA16Y&e= >> >>>> [2] >> >>https://urldefense.proofpoint.com/v2/url?u=https-3A__patchwork.kerne >l >> >>.org_patch_10831251_&d=DwIC-g&c=jf_iaSHvJObTbx-siA1ZOg&r=2TaYXQ0T-r8 >Z >> >>O1PP1alNwU_QJcRRLfmYTAgd3QCvqSc&m=_08EePmMUq466MpLUdKBkqpiAQwZO6-v3v >Q >> >V8lJGy8Q&s=5q96F0D8jXv81QEhEfvz90wh6CxdS6TFbqOSz8YzPnw&e= >> >>>> >> >>>>> >> >>>>> Let's get siw merged, and then _everyone_ can have a say >where >> >it is >> >>>>> valuable to share code. >> >>>>> >> >>>> Let's apply that to other driver proposed too? >> >>> >> >>> My 2 cents... >> >>> >> >>> If things compile and move, why not allow it into the tree. >> >> >> >> Because upstream != trash can. We want to see involvement, >> >commitment >> >> and in some extent development roadmap. >> > >> >First off I never suggested that, not in the slightest. Secondly >it's >> > >> >not fair to label either siw or loopback drivers as trash. They >both >> >are >> >cleanly (arguably) done, they both function. >> > >> >Can they run NVMEoF? If so then according to Jason [1] that is >> >meeting >> >the unambiguous low bar. To me that means they are supporting >enough >> >to >> >be considered a serious driver and brought into the sub system for >> >further development. >> > >> >> I speak for myself and for siw development only here. I do this for >> years as a side project. We used that driver a lot in the past for >> RDMA application development, it is used in classes at ETH Zurich >> university, there are even some companies out there using it for >their >> business. >> >> I am committed to further development and maintenance of software >> based RDMA. I know sometimes it takes to long before I >response/fix. >> Like these days where I am completely consumed by my daily >business. >> It is not really laziness, but constant lack of time to pet my >> pet project... One reason I am very much interested in getting it >> upstream is, to better align it with a moving target: the RDMA >> mid-layer. I spent lots of time with fixing the stand alone siw >open >> source project for all that changes over the years. I want to be >> better integrated with the RDMA development community. >> Yes, and I hope for a code quality improvement as well. Based on >community >> feedback, patches. I think this humble piece of code is not to bad >to >> get improved by folks who want the functionality it provides. > >Guys, > >I didn't mean anything bad about siw or loopback. Just responded on >the >Dennis's sentence "If things compile and move, why not allow it into >the tree." > >I'm pretty convinced that SIW will be part of next kernel release. > >Sorry if I was too rude. > >Thanks > Hi Leon, this is very much appreciated. And I completely agree with a demand for high code quality. Maybe even especially true for software which is not bound to certain hardware - where one could assume constant maintenance interest over some longer time period. Best regards, Bernard.