Re: problems with test_mr_rereg_pd

On 5/25/2023 1:22 AM, Bob Pearson wrote:


Edward, Ido,

The test_mr_rereg_pd pyverbs test is failing for the rxe driver.

Does rxe even support rereg? This is what I get:

$ python3 tests/run_tests.py -v -k rereg_pd --dev rxe0 --gid 1
test_mr_rereg_pd (tests.test_mr.MRTest)
Test that cover rereg MR's PD with this flow: ... skipped 'Rereg MR is not supported (Failed to rereg MR: IBV_REREG_MR_ERR_CMD. Errno: 95, Operation not supported)'

(Your below suggested solution should be done anyway)

I have figured out that the problem is that the following sequence

     def test_mr_rereg_pd(self):
         """
         Test that cover rereg MR's PD with this flow:
         Use MR with QP that was created with the same PD. Then rereg the MR's PD
         and use the MR with the same QP, expect the traffic to fail with "remote
         operation error". Restate the QP from ERR state, rereg the MR back
         to its previous PD and use it again with the QP, verify that it now
         succeeds.
         """
         self.create_players(MRRes)
         u.traffic(**self.traffic_args)
         server_new_pd = PD(self.server.ctx)
         self.server.rereg_mr(flags=e.IBV_REREG_MR_CHANGE_PD, pd=server_new_pd)
         with self.assertRaisesRegex(PyverbsRDMAError, 'Remote operation error'):
             u.traffic(**self.traffic_args)
         self.restate_qps()
         self.server.rereg_mr(flags=e.IBV_REREG_MR_CHANGE_PD, pd=self.server.pd)
         u.traffic(**self.traffic_args)
         # Rereg the MR again with the new PD to cover
         # destroying a PD with a re-registered MR.
         self.server.rereg_mr(flags=e.IBV_REREG_MR_CHANGE_PD, pd=server_new_pd)

schedules 10 iterations of a UD send to a UD receive with an invalid mr pd, one that
does not match the qp pd, so it fails with a remote operation error on the first request.
The remaining 9 send and receive work requests are flushed to the caller with a
FLUSH_ERROR, but they are not cleared out of the completion queues.

This is required by the IBA for Class A responder errors ("Remote operation error").
C9-220 requires:

       All other WQEs on both queues, and all WQEs
       subsequently posted to either Queue, are completed with
       the “Completed - Flushed in Error” status
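
To make the flush rule concrete, here is a toy model of it. This is only a
sketch: ToyQueuePair, its methods and the status strings are hypothetical names
made up for illustration, it models a single queue rather than both, and it is
neither pyverbs nor the rxe implementation.

```python
from collections import deque

SUCCESS = "SUCCESS"
REM_OP_ERR = "REMOTE_OPERATION_ERROR"
FLUSH_ERR = "FLUSHED_IN_ERROR"


class ToyQueuePair:
    """Hypothetical single-queue model of the flush-in-error rule."""

    def __init__(self):
        self.pending = deque()   # work requests posted but not yet completed
        self.completions = []    # (wr_id, status) entries, never cleared here
        self.in_error = False

    def post(self, wr_id):
        # Once in the error state, newly posted WQEs complete as flushed.
        if self.in_error:
            self.completions.append((wr_id, FLUSH_ERR))
        else:
            self.pending.append(wr_id)

    def process_one(self, ok):
        # Complete the oldest pending WQE; on failure, flush all the others.
        wr_id = self.pending.popleft()
        if ok:
            self.completions.append((wr_id, SUCCESS))
            return
        self.completions.append((wr_id, REM_OP_ERR))
        self.in_error = True
        while self.pending:
            self.completions.append((self.pending.popleft(), FLUSH_ERR))


qp = ToyQueuePair()
for wr_id in range(10):
    qp.post(wr_id)
qp.process_one(ok=False)          # the first request fails
statuses = [status for _, status in qp.completions]
```

In this model the first entry of statuses is the remote operation error and the
other 9 are flushes, all still sitting in the completion list until someone
polls them out.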

The final phase of the test wants to verify that, after putting the original pd
back into the mr, traffic works OK. But the remaining FLUSH errors in the completion
queues cause the test to fail.

To make this test work, you would have to clean the completion queues as part of
restate_qps, but that is not done.
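
A minimal sketch of what such a cleanup might look like. It assumes a CQ object
whose poll(num_entries) returns a (count, completions) tuple, which is how I
understand pyverbs CQ.poll to behave. drain_cq and FakeCQ are hypothetical names
for illustration only; FakeCQ just lets the sketch run without RDMA hardware,
and this is not the actual restate_qps code.

```python
def drain_cq(cq, batch=16):
    """Poll until the CQ reports nothing left; return how many were discarded."""
    drained = 0
    while True:
        npolled, _wcs = cq.poll(batch)
        if npolled <= 0:
            return drained
        drained += npolled


class FakeCQ:
    """Hypothetical stand-in for a completion queue holding stale entries."""

    def __init__(self, entries):
        self.entries = list(entries)

    def poll(self, num_entries=1):
        # Hand back up to num_entries completions and remove them from the queue.
        taken = self.entries[:num_entries]
        self.entries = self.entries[num_entries:]
        return len(taken), taken


# Nine flushed completions left over from the failed traffic round.
cq = FakeCQ(["FLUSH_ERR"] * 9)
discarded = drain_cq(cq, batch=4)
```

With a real device the flushed completions may not all have arrived by the time
the first poll comes back empty, so a real fix would probably need to know how
many work requests were outstanding rather than stopping at the first empty poll.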

Bob


