problems with test_mr_rereg_pd

Edward, Ido,

The test_mr_rereg_pd pyverbs test is failing for the rxe driver.
I have figured out that the problem is that the following sequence

    def test_mr_rereg_pd(self):
        """
        Test that cover rereg MR's PD with this flow:
        Use MR with QP that was created with the same PD. Then rereg the MR's PD
        and use the MR with the same QP, expect the traffic to fail with "remote
        operation error". Restate the QP from ERR state, rereg the MR back
        to its previous PD and use it again with the QP, verify that it now
        succeeds.
        """
        self.create_players(MRRes)
        u.traffic(**self.traffic_args)
        server_new_pd = PD(self.server.ctx)
        self.server.rereg_mr(flags=e.IBV_REREG_MR_CHANGE_PD, pd=server_new_pd)
        with self.assertRaisesRegex(PyverbsRDMAError, 'Remote operation error'):
            u.traffic(**self.traffic_args)
        self.restate_qps()
        self.server.rereg_mr(flags=e.IBV_REREG_MR_CHANGE_PD, pd=self.server.pd)
        u.traffic(**self.traffic_args)
        # Rereg the MR again with the new PD to cover
        # destroying a PD with a re-registered MR.
        self.server.rereg_mr(flags=e.IBV_REREG_MR_CHANGE_PD, pd=server_new_pd)

schedules 10 iterations of a UD send to a UD receive using an MR whose PD does not
match the QP's PD, so the first request fails with a remote operation error.
The remaining 9 send and receive work requests are flushed back to the caller with a
FLUSH_ERROR status, but their completions are left sitting in the completion queues.

The IBA requires this behavior for Class A responder errors ("Remote operation error").
C9-220 states:

      All other WQEs on both queues, and all WQEs subsequently
      posted to either Queue, are completed with the
      “Completed - Flushed in Error” status

The final phase of the test wants to verify that traffic works again once the
original PD has been put back into the MR. But the flush errors still sitting in the
completion queues cause the test to fail.

To make this test work, you would have to drain the completion queues as part of
restate_qps, but that is not done.
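
Something along these lines is what I have in mind; treat it as a rough sketch, not a
patch. drain_cq is a hypothetical helper name, and it assumes the CQ object's poll()
returns a (count, completions) pair the way I understand pyverbs' CQ.poll to. FakeCQ
is only a stand-in so the snippet runs without an RDMA device. On a real device the
flushed completions may not all be present at the moment of the first empty poll, so
a real fix might need to know how many completions to wait for.

```python
def drain_cq(cq, batch=16):
    """Poll `cq` until it reports no completions; return the discarded entries.

    Hypothetical helper: assumes cq.poll(n) returns (count, list_of_completions).
    """
    drained = []
    while True:
        npolled, wcs = cq.poll(batch)
        if npolled <= 0:
            break
        drained.extend(wcs[:npolled])
    return drained


class FakeCQ:
    """Stand-in completion queue so the sketch runs without RDMA hardware."""

    def __init__(self, entries):
        self._entries = list(entries)

    def poll(self, num_entries=1):
        # Hand back at most num_entries completions, oldest first.
        taken = self._entries[:num_entries]
        self._entries = self._entries[num_entries:]
        return len(taken), taken


# Nine stale flush completions left over from the failed traffic phase.
stale = FakeCQ(["WR_FLUSH_ERR"] * 9)
flushed = drain_cq(stale, batch=4)
```

Calling such a helper on each player's CQ from restate_qps would leave the queues
empty before the final traffic phase starts.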

Bob


