Re: [PATCH v3 0/1] RDMA/rxe: Bump up default maximum values used via uverbs

On 8/3/21 7:21 PM, Zhu Yanjun wrote:
On Wed, Aug 4, 2021 at 10:03 AM Shoaib Rao <rao.shoaib@xxxxxxxxxx> wrote:

On 8/3/21 5:51 PM, Zhu Yanjun wrote:
On Wed, Aug 4, 2021 at 7:53 AM Shoaib Rao <rao.shoaib@xxxxxxxxxx> wrote:
Hi Zhu,

Any update on your testing after applying Bob's fixes?
Did you read my problem carefully?
I mean that before your commit, the whole of rxe works well.
After your commit, rxe does not work well.
Please reproduce this problem on your host and fix it.

Zhu Yanjun
You posted

In my daily tests, I found that with one host on 5.12-stable and the
other host on 5.14-rc3 + this commit, rping does not work.
Sometimes a crash occurs.

It seems that changing maximum values breaks backward compatibility.

But without this commit, that is, 5.12-stable <-------> 5.14-rc3,
rping can work well.

Zhu Yanjun
I am not sure how you got rxe to work, because it did not work for me
or for Bob. Since then, Bob has posted patches for the issue. I
also posted that my changes work on the 5.13.6 kernel. Emails attached.

Even if rxe in 5.14 is somehow working for you, please apply Bob's
patches and then mine and test.
I have already applied this commit
https://patchwork.kernel.org/project/linux-rdma/patch/20210729220039.18549-3-rpearsonhpe@xxxxxxxxx/

And with your commit, rxe can not work well.

Zhu Yanjun

I am not sure how anyone can claim that the code works without my changes. Rxe in Linux 5.14-rc4 is broken due to the following change made to rxe_cq_post(), which is guaranteed to cause a panic or corruption.

addr = producer_addr(cq->queue, QUEUE_TYPE_TO_CLIENT);

It should be

addr = producer_addr(cq->queue, QUEUE_TYPE_FROM_CLIENT);

The following function also seems wrong

static inline void *producer_addr(struct rxe_queue *q, enum queue_type type)
{
        u32 prod;

        switch (type) {
        case QUEUE_TYPE_FROM_CLIENT:
                /* protect user space index */
                prod = smp_load_acquire(&q->buf->producer_index);
                prod &= q->index_mask;
                break;
        case QUEUE_TYPE_TO_CLIENT:
                prod = q->index;
                break;
        }

        return q->buf->data + (prod << q->log2_elem_size);
}
The index should be returned as it is.
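
For readers following along, the address arithmetic on the return line of the function quoted above can be sketched in a standalone form. This is only an illustration under stated assumptions, not the rxe code: struct demo_queue and the demo_* helpers are hypothetical names, the ring is assumed to have a power-of-two number of slots, and the acquire/release ordering that the kernel code applies to the shared index is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a ring queue; not the kernel struct. */
struct demo_queue {
        uint8_t  *data;           /* element storage */
        uint32_t  index_mask;     /* number of slots - 1 (power-of-two ring) */
        uint32_t  log2_elem_size; /* log2 of the element size in bytes */
        uint32_t  producer_index; /* may hold an unmasked value */
};

/* Return the address of the slot the producer would write next. */
static inline void *demo_producer_addr(const struct demo_queue *q)
{
        /* Mask first so an out-of-range index cannot leave the buffer. */
        uint32_t prod = q->producer_index & q->index_mask;

        assert(prod <= q->index_mask);
        /* Slot offset in bytes is index * element size. */
        return q->data + ((size_t)prod << q->log2_elem_size);
}

/* Advance the producer index by one slot, wrapping at the ring size. */
static inline void demo_advance_producer(struct demo_queue *q)
{
        q->producer_index = (q->producer_index + 1) & q->index_mask;
}
```

With 8 slots of 16 bytes (index_mask = 7, log2_elem_size = 4), an index of 3 maps to byte offset 48, and an unmasked index of 9 wraps to slot 1 at byte offset 16.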

The code has changed again in v5.14-rc4-22-g251a1524293d, so now I have to try again.

Can we please make sure that the code works after each patch is applied? Otherwise it is a moving target.

BTW I liked the old code as it distinctly said what was being returned.

Shoaib


Thanks,

Shoaib


Shoaib

On 7/29/21 5:34 PM, Shoaib Rao wrote:
Thanks Bob.

Zhu can you please apply those patches and test.

Shoaib

On 7/29/21 4:08 PM, Pearson, Robert B wrote:
I found another rxe bug (for SRQ) and sent three bug fixes in a set
including the one you mention. They should all be applied.

-----Original Message-----
From: Jason Gunthorpe <jgg@xxxxxxxx>
Sent: Thursday, July 29, 2021 2:51 PM
To: Shoaib Rao <rao.shoaib@xxxxxxxxxx>
Cc: Zhu Yanjun <zyjzyj2000@xxxxxxxxx>; RDMA mailing list
<linux-rdma@xxxxxxxxxxxxxxx>
Subject: Re: [PATCH v3 0/1] RDMA/rxe: Bump up default maximum values
used via uverbs

On Thu, Jul 29, 2021 at 12:33:14PM -0700, Shoaib Rao wrote:

Can we please accept my initial patch where I bumped up the values of
a few parameters? We have extensively tested with those values. I will
try to resolve CRC errors and panic and make changes to other
tuneables later.
I think Bob posted something for the icrc issues already.

Please try to work in a sane fashion, rxe shouldn't be left broken
with so many people apparently interested in it??

Jason


