Re: [PATCH v5 00/13] SIW: Request for Comments

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hey Parav-

> On Mar 10, 2019, at 6:01 PM, Parav Pandit <parav@xxxxxxxxxxxx> wrote:
> 
> Hi Bernard,
> 
> 
>> -----Original Message-----
>> From: Bernard Metzler <BMT@xxxxxxxxxxxxxx>
>> Sent: Sunday, March 10, 2019 4:41 PM
>> To: Parav Pandit <parav@xxxxxxxxxxxx>
>> Cc: Steve Wise <larrystevenwise@xxxxxxxxx>; linux-rdma@xxxxxxxxxxxxxxx;
>> Leon Romanovsky <leonro@xxxxxxxxxxxx>; Yuval Shaia
>> <yuval.shaia@xxxxxxxxxx>
>> Subject: RE: [PATCH v5 00/13] SIW: Request for Comments
>> 
>> Hi Parav,
>> 
>> I understand your point of having some common driver infrastructure
>> between rxe and siw. I remember I was even proposing rxe folks looking into
>> that when rxe emerged after siw was already prototyped and open sourced
>> (yes, siw was first out there). There was not much interest from rxe folks at
>> that point in time, which is OK and understood (always cool to write
>> something new and better from scratch, who wants iWarp, etc.).
>> 
> I didn't follow that part of the history, but it clearly proves the point that it was started incorrectly.
> With all the fancy code for doorbells, with mmap() call for QP, CQ, ground reality is, it doesn't even pass a developer's silly ib_write_bw today.
> And at that test shows 2Gbps bw on 64Gb RAM with 48 core cpus!
> 
>> At the current stage of the project, it is rather contra-productive to
>> completely refactor siw to match with rxe, since I do not have the resources
>> to rip apart the whole thing, and I am not yet convinced it makes sense.
>> 
> But that seems to be right approach to have modular code and reuse existing pieces.
> Leon, Yuval and others resonate that too.

I agree that code sharing can be valuable. However...


> If existing pieces are inferior or mis fit, their technical details has to be called out.
> rdmavt improved and used library model to share code between two drivers.
> 
>> If one asks for merging these days, one asks at first for a delay in getting siw
>> accepted.
>> 
> Yuval and Leon wants to improve rxe in general and have desire to reuse rxe.
> They might be able to help you. Please check with them.
> 
>> Over time, it might indeed be a good thing to merge parts. 
> It should happen from beginning.

I disagree that merging siw should wait for code de-duplication with rxe.

The current siw is coming close to being merge-ready. I would like to see
it in the kernel as soon as possible. RoCE, hard or soft, requires DCE in
order to be fully useful, and thus it will never be possible to have a
software RoCE driver that can operate via any Ethernet driver on any
network. That is not the case with iWARP, which relies on TCP's built-in
congestion avoidance, which is truly routable and available everywhere.

The ULPs need to have a robust generic in-tree software driver as soon as
we can get it to enable development, testing, and deployment in environments
that have no physical RDMA device. siw is the fastest way to get that.


>> As said, I'd still like to investigate that.
> That will be very helpful.
> 
>> But let the sometimes intentionally different concepts of these two drivers
>> co-exist for some time out there, and let's draw conclusions of what to make
>> common later.
>> 
> It won't happen later once it is in tree. :-)

There is nothing stopping code de-duplication once siw is in the kernel.

The most obvious counter-example of your claim is the in-kernel ULPs which
have adopted new core APIs such as ib_alloc_cq long after the ULPs were
accepted and merged.


>> And let's be clear, lots of things between RoCE and iWarp are very different.
> All rest of the core components such as ib_core, addr, rdmacm are able to handle it two different transports.
> Cavium or qedr driver is dual protocol driver that support RoCE and iWarp both in single driver.
> 
>> Not only wire protocol and lower interfaces, but also semantically. These
>> things will stay different by specification, and an efficient implementation of
>> those things will likely still go it's own ways.
>> 
> We would like to understand technically why rxe is mis fit for it, and why it cannot be improved.
> Lack of resources is probably not good reason.

Actually I find lack of resources to be a compelling argument. Before
siw is merged, Bernard and his team are the only ones who can do this
work, and they are not yet familiar with the rxe code base.

After merging, anyone in the Linux RDMA community can help with it,
including you, the person who is asking for this extra effort.

It is reasonable to request that common code be refactored. It is
neither fair nor productive to gate merging siw on that work. Why
force Bernard to bear the burden of fixing rxe?

Let's get siw merged, and then _everyone_ can have a say where it is
valuable to share code.

My two cents...


--
Chuck Lever







[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux