Re: [PATCH v3] IB/iser: Fix RNR errors

On 5/1/2018 8:04 PM, Doug Ledford wrote:
On Mon, 2018-04-30 at 13:31 +0300, Max Gurtovoy wrote:
Jason/Doug/Leon,
can you please apply this patch ?
Sergey made the needed changes.

-Max.

On 4/16/2018 6:00 PM, Sergey Gorenko wrote:
Some users complain about RNR errors on the target when heavy,
high-priority tasks run on the initiator. Investigation showed that
the receive WRs were exhausted because the initiator could not post
them in time.

Receive work requests are posted in chunks of min_posted_rx to
reduce the number of hits to the HCA. The WRs are posted in the
receive completion handler when the number of free receive buffers
reaches min_posted_rx. But on a highly loaded host, receive CQE
processing can be delayed and all receive WRs will be exhausted.
In this case, the target will get an RNR error.

To avoid this, we post a receive WR as soon as at least one receive
buffer is freed. This increases the number of hits to the HCA, but
test results show that the performance degradation is not significant.

Performance results running fio (8 jobs, 64 iodepth) against a ramdisk
(with patch / without patch):

bs      IOPS(randread)    IOPS(randwrite)
------  ---------------   ---------------
512     329.4K / 340.3K   379.1K / 387.7K

3% performance drop

1K      333.4K / 337.6K   364.2K / 370.0K

1.4% performance drop

2K      321.1K / 323.5K   334.2K / 337.8K

0.75% performance drop

I know you said the performance hit was not significant, and by the time
you get to 2k reads/writes, I agree with you, but at the low end, I'm
not sure you can call 3% "not significant".

Is a 3% performance hit better than the transient RNR error?  And is
this the only solution available?

The problem here is that iSER uses one QP per session, which is why you see the 3% hit. I guess that if we ran a test using 64 targets and 64 sessions (64 QPs), the locality of the completions would have smoothed this hit away. I agree that a 3% hit in this scenario is not optimal. I also hope that the work IdanB and I did on mlx5 inline KLM/MTT will balance this out as well (it should currently be in the Leon/Jason PR; we saw a >3.5x improvement for small IOs in NVMe-oF). Sergey, we can check this out in our lab.

In conclusion, IMO we have a few improvements in other patches/drivers that can balance this 3% hit for small IOs, and we can also make further performance optimizations in the future (e.g. adaptive CQ moderation, likely/unlikely hints, etc.). This is exactly how NVMe-oF processes the recv completions, and we don't complain there :). Also, we recently fixed the post send in NVMe-oF to signal on each completion; that caused a performance hit as well, but we overcame it with other improvements (BTW, this signaling fix is on our plate for iSER as well).
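For reference on the signaling point, here is a rough sketch against the
userspace libibverbs API rather than the kernel iser code; qp, mr and buf
are assumed to be already set up, the QP is assumed to have been created
with sq_sig_all = 0, and error handling is omitted. Whether
IBV_SEND_SIGNALED is set on every send WR, or only on every Nth one,
decides how many send CQEs the CPU has to process.

#include <stdint.h>
#include <infiniband/verbs.h>

/*
 * Sketch only: post one SEND and choose whether it generates a CQE.
 * Unsignaled WRs are retired implicitly when a later signaled WR on
 * the same QP completes; signaling every WR (as in the NVMe-oF fix
 * mentioned above) costs more CQEs but simplifies resource tracking.
 */
static int post_one_send(struct ibv_qp *qp, struct ibv_mr *mr,
			 void *buf, uint32_t len, int signal_this_wr)
{
	struct ibv_sge sge = {
		.addr   = (uintptr_t)buf,
		.length = len,
		.lkey   = mr->lkey,
	};
	struct ibv_send_wr wr = {
		.sg_list    = &sge,
		.num_sge    = 1,
		.opcode     = IBV_WR_SEND,
		.send_flags = signal_this_wr ? IBV_SEND_SIGNALED : 0,
	};
	struct ibv_send_wr *bad_wr;

	return ibv_post_send(qp, &wr, &bad_wr);
}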



4K      300.7K / 302.9K   290.2K / 291.6K
8K      235.9K / 237.1K   228.2K / 228.8K
16K     176.3K / 177.0K   126.3K / 125.9K
32K     115.2K / 115.4K    82.2K / 82.0K
64K      70.7K / 70.7K     47.8K / 47.6K
128K     38.5K / 38.6K     25.8K / 25.7K
256K     20.7K / 20.7K     13.7K / 13.6K
512K     10.0K / 10.0K      7.0K / 7.0K

Signed-off-by: Sergey Gorenko <sergeygo@xxxxxxxxxxxx>
Signed-off-by: Vladimir Neyelov <vladimirn@xxxxxxxxxxxx>
Reviewed-by: Max Gurtovoy <maxg@xxxxxxxxxxxx>
---


-Max.

