Re: Memory window support for rdma_rxe

On 8/20/20 2:27 PM, Bob Pearson wrote:
> On 8/20/20 2:41 AM, Leon Romanovsky wrote:
>> On Wed, Aug 19, 2020 at 11:36:54AM -0500, Bob Pearson wrote:
>>> On 8/19/20 12:02 AM, Leon Romanovsky wrote:
>>>> On Tue, Aug 18, 2020 at 10:39:46PM -0500, Bob Pearson wrote:
>>>>> This is a cleaned-up resend of an earlier patch set. This set of patches
>>>>> implements the memory window verbs and local send operations. Each of these
>>>>> has been tested at a basic level and regression tests have been run to
>>>>> see that basic rxe functionality is OK.
>>>>
>>>> Can you please submit the series together with standard cover-letter
>>>> (git format-patch --cover-letter ..) that include diffstat and patch
>>>> list.
>>>>
>>>> It is helpful to see the whole picture of expected changes.
>>>>
>>>> Does it pass rdma-core pyverbs tests?
>>>>
>>>> Thanks
>>>>
>>> Leon,
>>>
>>> Thanks for the comments. They are helpful. I haven't worked on rxe or anything else in Linux for about 6-7 years so there are a lot of things that have changed. I have a few questions that you may be able to answer.
>>>
>>> The build robot seems to be catching things that make in the kernel tree is missing (I think.) Is there a way to check if patches will work before sending them in an email? The most recent attempt had a stray variable declaration left over from some other change but I never saw a compiler warning.
>>
>> You can catch most (90%) of errors reported by kbuild if you use
>> latest GCC compiler to prepare your patches. Latest Fedora (32) has
>> it. Compile your code with allyesconfig, allmodconfig and allnoconfig.
>>
>> Rest of errors you can find with smatch and sparse tools.
>>
>>>
>>> I had used --compose rather than --cover-letter and wondered how people got those nice [PATCH 0/N] messages. I'll give it a try.
>>>
>>> I've never come to terms with Python (white space shouldn't carry syntax IMHO) and have no idea what pyverbs is doing. How do you run the tests you mention?
>>
>> https://github.com/linux-rdma/rdma-core/blob/master/Documentation/testing.md#how-to-run-rdma-cores-tests
>> Bottom line:
>> 1. Download rdma-core
>> 2. Compile on the system with your rxe device, use build.sh script in
>> source code
>> 3. Run the tests directly from the source code
>> ./build/bin/run_tests.py -v
>>
>>>
>>> I tried to get git send-email to put a version number into the subject lines with -v2 which it happily accepts but it does nothing. In the end I had to edit each email one at a time. Is there an easier way to get e.g. [PATCH v3 xx/yy]?
>>
>> It is done during format-patch stage, my command line for the series is;
>> git format-patch --cover-letter -M -C -v X --subject-prefix "PATCH $TARGET" -o /tmp/
>>                                      ^^^^ version                 ^^^^ rdma-next or rdma-rc
>>
>>>
>>> Thanks for the help,
>>>
>>> Bob Pearson
> Interesting. I got the tests working fairly easily but have found bugs in error cases in the responder state machine that I'll have to fix. The test behaves badly (perhaps on purpose) by deallocating the MWs and then banging away sending writes to the now defunct MW. The responder should NAK the rkey violation but doesn't. The cause is that do_complete assumes that no errors ever occur and skips out if there isn't a receive wqe to complete, bypassing the ACKNOWLEDGE state. This should also have been seen for MRs if anyone ever did the same thing.
> 
> Bob 
> 
The run_tests.py tests are mostly running. Four test cases always fail (AH and mcast), but they have nothing to do with MWs. There are also occasional failures from INIT->RTR QP transition timeouts. These are not reproducible and occur on various tests. I do not believe they have anything to do with the MW code either: when one hits a MW test case, the test fails before it ever gets to the MW code.

There were three issues with the MW code that are fixed now. One was a use-before-set of a pointer, one was a difference in interpretation of the IBA spec (I wasn't allowing invalidation of a MW unless it was valid), and the last was the missing acks described above.
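
To make the missing-ack case concrete, here is a toy model of the completion step. It is only a sketch and not the rxe code: the toy_* names, states and status values are all hypothetical, and the only thing it tries to show is that returning early when there is no receive wqe skips the acknowledge step, so an rkey violation never gets NAKed.

```c
#include <stddef.h>

/* Hypothetical, simplified responder states; not the rxe definitions. */
enum toy_state { TOY_ACKNOWLEDGE, TOY_CLEANUP };

/* Hypothetical outcome of checking the incoming request. */
enum toy_status { TOY_OK, TOY_RKEY_VIOLATION };

/* Stand-in for a receive work queue element. */
struct toy_wqe { int id; };

/*
 * Buggy variant: if there is no receive wqe to complete, jump straight
 * to cleanup. An RDMA write to a defunct MW has no receive wqe, so the
 * rkey violation is dropped without any ACK/NAK going back.
 */
enum toy_state toy_complete_buggy(const struct toy_wqe *wqe,
				  enum toy_status status)
{
	(void)status;		/* the error is never looked at */
	if (!wqe)
		return TOY_CLEANUP;
	/* ... a completion for wqe would be posted here ... */
	return TOY_ACKNOWLEDGE;
}

/*
 * Fixed variant: posting a completion is optional, but the next state
 * is always the acknowledge step, which can then send an ACK or a NAK
 * depending on status.
 */
enum toy_state toy_complete_fixed(const struct toy_wqe *wqe,
				  enum toy_status status)
{
	(void)status;		/* consumed later by the acknowledge step */
	if (wqe) {
		/* ... a completion for wqe would be posted here ... */
	}
	return TOY_ACKNOWLEDGE;
}
```

With no receive wqe and an rkey violation, the buggy variant ends up in TOY_CLEANUP and the fixed variant in TOY_ACKNOWLEDGE.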

Do you know if the AH/mcast failures and the occasional INIT->RTR timeouts are normal behavior for rxe?

I am going to post v3 patch set now.

Bob


