Re: [PATCH v7 00/12] SIW: Request for Comments

-----"Sagi Grimberg" <sagi@xxxxxxxxxxx> wrote: -----

>To: "Jason Gunthorpe" <jgg@xxxxxxxx>, "Olga Kornievskaia"
><aglo@xxxxxxxxx>
>From: "Sagi Grimberg" <sagi@xxxxxxxxxxx>
>Date: 04/25/2019 09:07AM
>Cc: "Bernard Metzler" <bmt@xxxxxxxxxxxxxx>, "linux-rdma"
><linux-rdma@xxxxxxxxxxxxxxx>
>Subject: Re: [PATCH v7 00/12] SIW: Request for Comments
>
>>> Hi Jason,
>>>
>>> I'd like to provide my feedback about testing this code and
>running
>>> NFS over RDMA over the software iWarp. With much appreciated help
>from
>>> Bernard, I setup 2 CentOS 7.6 VMs and his v7 kernel branch. I
>>> successfully, ran NFS connectathon test suite, xfstests, and ran
>"make
>>> -j" compile of the linux kernel. Current code is useful for
>NFSoRDMA
>>> functional testing. From a very limited comparison timing study in
>all
>>> virtual environment, it is lacking a bit in performance compared
>to
>>> non-RDMA mount (but it's better than software RoCE).
>> 
>> Excellent feedback, thank you.
>> 
>> Let's hear from NVMeof too please
>
>I actually took a stab and gave this a test drive with nvme/rdma
>and iser (thanks Steve for making our lives better with rdma tool
>add link support), think it was v6 though...
>
>There were some strange debug messages overlooked IIRC, and there
>were some error messages, but things worked, so I don't know what
>to make of it.
>
>Pretty much the same feedback here, very limited testing on my VMs
>shows:
>- functionally works
>- faster than rxe
>- slower than non-rdma (which sorta makes sense I assume)
>
>
Hi Sagi,

Many thanks for the feedback!

Performance has not been my main concern while re-trying for upstream
acceptance. I will look into performance tuning once it is accepted.

One penalty we pay - for HW interoperability - is disabling
segmentation offload (GSO) awareness on the sender side. While we could
build MPA frames of up to 64k in one shot (having them segmented on the
wire by the NIC) and process them the same way, in one shot, at the
target side, we don't do so, since some target iWARP hardware cannot
handle MPA frames larger than the real MTU size. For siw-to-siw
testing, we may switch GSO awareness back on. These days, this is a
compile-time selection only (since we abandoned all module parameters);
a rough sketch of what that amounts to is below. Proposing another
extension of the netlink interface for passing such driver-private
parameters is on my todo list, but definitely not at the current stage.
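
To illustrate what that compile-time selection amounts to - the
identifier names below are made up for the example and may not match
the actual siw sources - a single constant decides whether the TX path
caps each MPA frame at the TCP MSS or builds GSO-sized frames:

#include <linux/kernel.h>
#include <linux/sizes.h>
#include <linux/types.h>

/* Hypothetical sketch, not driver code: false keeps each frame within
 * one TCP segment for iWARP HW interoperability, true would allow
 * GSO-sized frames for siw-to-siw setups.
 */
static const bool siw_use_gso;

static u32 siw_max_fpdu_payload(u32 tcp_mss, u32 gso_size)
{
	if (!siw_use_gso)
		return tcp_mss;		/* one MPA frame per TCP segment */

	/* siw-to-siw only: build up to 64k and let the NIC segment it */
	return min_t(u32, gso_size, SZ_64K);
}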

In general, sitting on top of a kernel TCP socket, adding some protocol
overhead, and even a 4-byte trailer checksum _after_ the data buffers,
comes with a penalty if the kernel application could otherwise use
the plain kernel TCP socket itself...
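
To put that overhead in concrete terms: per RFC 5044, every MPA FPDU
carried in the TCP stream adds a 2-byte length field in front of the
payload plus 0-3 pad bytes and a 4-byte CRC32c trailer behind it. A
rough, purely illustrative sketch (not code from the driver):

#include <linux/types.h>

/*
 * Schematic of an MPA FPDU in the TCP byte stream (RFC 5044):
 *
 *  +-----------+--------------------------+---------+---------+
 *  | ULPDU len | DDP/RDMAP headers + data |  PAD    | CRC32c  |
 *  |  2 bytes  |                          | 0..3 B  | 4 bytes |
 *  +-----------+--------------------------+---------+---------+
 *
 * The CRC trailer is computed over the payload and appended after it,
 * work a plain kernel TCP user would not have to do.
 */
static u32 mpa_fpdu_wire_len(u32 ulpdu_len)
{
	u32 pad = -(2 + ulpdu_len) & 3;	/* pad FPDU to a 4-byte multiple */

	return 2 + ulpdu_len + pad + 4;	/* len + payload + pad + CRC */
}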

The performance story might be different for user level applications,
which potentially benefit more from the asynchronous verbs interface.


I learned that Chelsio has been doing some performance testing of
NVMeoF via siw against iWARP HW themselves. They report line speed in a
100 Gb/s setup, with siw on the two client sides talking to a T6 RNIC:
https://www.prnewswire.com/news-releases/chelsio-demonstrated-soft-iwarp-at-nvme-developer-days-300815249.html
and
https://www.chelsio.com/wp-content/uploads/resources/t6-100g-siw-nvmeof.pdf


Thanks,
Bernard.



