Re: [PATCH] xprtrdma: Make sure Send CQ is allocated on an existing CPU

> On Jan 23, 2019, at 12:07 PM, Nicolas Morey-Chaisemartin <nmoreychaisemartin@xxxxxxx> wrote:
> 
> 
> 
> On 1/23/19 6:06 PM, Nicolas Morey-Chaisemartin wrote:
>> 
>> On 1/23/19 5:51 PM, Chuck Lever wrote:
>>> Hi Nicolas-
>>> 
>>>> On Jan 23, 2019, at 8:12 AM, Nicolas Morey-Chaisemartin <nmoreychaisemartin@xxxxxxxx> wrote:
>>>> 
>>>> Make sure the host has at least 2 CPUs before allocating to CPU#1
>>> The fourth parameter of ib_alloc_cq() is not a CPU number,
>>> it's a completion vector number. What failure did you see
>>> that prompted this patch?
>> When trying to mount, I get this:
>> + mount -o rdma,port=20049 192.168.20.15:/tmp/RAM /tmp/RAM
>> mount.nfs: mounting 192.168.20.15:/tmp/RAM failed, reason given by server: No such file or directory
>> 
>> Digging a bit into the code, it appears that the CQ allocation here returns ENOENT, which comes from mlx5_vector2eqn.
>> On my system (a VM with an mlx5 card with SR-IOV), the comp_eqs_list contains only one entry, with index == 0.
>> 
>> Nicolas
>> 
> 
> Also, adding a 2nd core to my VM fixes the issue (hence my assumption that it was a CPU number).

Fair enough. The 2nd CPU adds a 2nd compvec. Instead of
num_online_cpus() you want ib_device::num_comp_vectors.
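
To illustrate the check suggested above, here is a rough userspace
sketch, not the kernel patch itself. The struct fake_ib_device and
the helper pick_send_compvec() are hypothetical names made up for
this example; only the num_comp_vectors field name mirrors
ib_device::num_comp_vectors. It assumes the intent is to keep using
compvec 1 when the device has one, and fall back to compvec 0
otherwise.

```c
/*
 * Hypothetical stand-in for struct ib_device; only the one field
 * relevant to this discussion is modeled.
 */
struct fake_ib_device {
	int num_comp_vectors;	/* mirrors ib_device::num_comp_vectors */
};

/*
 * Hypothetical helper: choose the completion vector for the Send CQ.
 * Use vector 1 only when the device exposes more than one vector;
 * otherwise fall back to vector 0, which should always exist.
 */
static int pick_send_compvec(const struct fake_ib_device *dev)
{
	return dev->num_comp_vectors > 1 ? 1 : 0;
}
```

In the real code the result would presumably be passed as the fourth
argument of ib_alloc_cq(). The point is that the decision keys off
the device's vector count, not the number of online CPUs.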

I suspect there's a spiffier way to go about this these
days thanks to ib_get_vector_affinity, but you've found
a longstanding bug. So let's get something that can be
comfortably backported to stable.


--
Chuck Lever