Re: [PATCH rdma-rc v1] IB/IPoIB: Fix legacy IPoIB due to wrong number of queues

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Jan 20, 2023 at 04:52:21PM -0400, Jason Gunthorpe wrote:
> On Fri, Jan 20, 2023 at 07:02:48PM +0200, Leon Romanovsky wrote:
> > From: Dragos Tatulea <dtatulea@xxxxxxxxxx>
> > 
> > The cited commit creates child PKEY interfaces over netlink will multiple
> > tx and rx queues, but some devices doesn't support more than 1 tx and 1 rx
> > queues. This causes to a crash when traffic is sent over the PKEY interface
> > due to the parent having a single queue but the child having multiple queues.
> > 
> > This patch inherits the real_num_tx/rx_queues from the parent netdev.
> > 
> > BUG: kernel NULL pointer dereference, address: 000000000000036b
> > PGD 0 P4D 0
> > Oops: 0000 [#1] SMP
> > CPU: 4 PID: 209665 Comm: python3 Not tainted 6.1.0_for_upstream_min_debug_2022_12_12_17_02 #1
> > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
> > RIP: 0010:kmem_cache_alloc+0xcb/0x450
> > Code: ce 7e 49 8b 50 08 49 83 78 10 00 4d 8b 28 0f 84 cb 02 00 00 4d 85 ed 0f 84 c2 02 00 00 41 8b 44 24 28 48 8d 4a 01 49 8b 3c 24 <49> 8b 5c 05 00 4c 89 e8 65 48 0f c7 0f 0f 94 c0 84 c0 74 b8 41 8b
> > RSP: 0018:ffff88822acbbab8 EFLAGS: 00010202
> > RAX: 0000000000000070 RBX: ffff8881c28e3e00 RCX: 00000000064f8dae
> > RDX: 00000000064f8dad RSI: 0000000000000a20 RDI: 0000000000030d00
> > RBP: 0000000000000a20 R08: ffff8882f5d30d00 R09: ffff888104032f40
> > R10: ffff88810fade828 R11: 736f6d6570736575 R12: ffff88810081c000
> > R13: 00000000000002fb R14: ffffffff817fc865 R15: 0000000000000000
> > FS:  00007f9324ff9700(0000) GS:ffff8882f5d00000(0000) knlGS:0000000000000000
> > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 000000000000036b CR3: 00000001125af004 CR4: 0000000000370ea0
> > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > Call Trace:
> >  <TASK>
> >  skb_clone+0x55/0xd0
> >  ip6_finish_output2+0x3fe/0x690
> >  ip6_finish_output+0xfa/0x310
> >  ip6_send_skb+0x1e/0x60
> >  udp_v6_send_skb+0x1e5/0x420
> >  udpv6_sendmsg+0xb3c/0xe60
> >  ? ip_mc_finish_output+0x180/0x180
> >  ? __switch_to_asm+0x3a/0x60
> >  ? __switch_to_asm+0x34/0x60
> >  sock_sendmsg+0x33/0x40
> >  __sys_sendto+0x103/0x160
> >  ? _copy_to_user+0x21/0x30
> >  ? kvm_clock_get_cycles+0xd/0x10
> >  ? ktime_get_ts64+0x49/0xe0
> >  __x64_sys_sendto+0x25/0x30
> >  do_syscall_64+0x3d/0x90
> >  entry_SYSCALL_64_after_hwframe+0x46/0xb0
> > RIP: 0033:0x7f9374f1ed14
> > Code: 42 41 f8 ff 44 8b 4c 24 2c 4c 8b 44 24 20 89 c5 44 8b 54 24 28 48 8b 54 24 18 b8 2c 00 00 00 48 8b 74 24 10 8b 7c 24 08 0f 05 <48> 3d 00 f0 ff ff 77 34 89 ef 48 89 44 24 08 e8 68 41 f8 ff 48 8b
> > RSP: 002b:00007f9324ff7bd0 EFLAGS: 00000293 ORIG_RAX: 000000000000002c
> > RAX: ffffffffffffffda RBX: 00007f9324ff7cc8 RCX: 00007f9374f1ed14
> > RDX: 00000000000002fb RSI: 00007f93000052f0 RDI: 0000000000000030
> > RBP: 0000000000000000 R08: 00007f9324ff7d40 R09: 000000000000001c
> > R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000000
> > R13: 000000012a05f200 R14: 0000000000000001 R15: 00007f9374d57bdc
> >  </TASK>
> > 
> > Fixes: dbc94a0fb817 ("IB/IPoIB: Fix queue count inconsistency for PKEY child interfaces")
> > Signed-off-by: Dragos Tatulea <dtatulea@xxxxxxxxxx>
> > Reviewed-by: Tariq Toukan <tariqt@xxxxxxxxxx>
> > Signed-off-by: Leon Romanovsky <leonro@xxxxxxxxxx>
> > ---
> > Changelog:
> > v1:
> >  * Fixed typo in warning print.
> > v0: https://lore.kernel.org/all/4a7ecec08ee30ad8004019818fadf1e58057e945.1674137153.git.leon@xxxxxxxxxx
> > ---
> >  drivers/infiniband/ulp/ipoib/ipoib_netlink.c | 12 ++++++++++++
> >  1 file changed, 12 insertions(+)
> > 
> > diff --git a/drivers/infiniband/ulp/ipoib/ipoib_netlink.c b/drivers/infiniband/ulp/ipoib/ipoib_netlink.c
> > index 9ad8d9856275..0548735a15b5 100644
> > --- a/drivers/infiniband/ulp/ipoib/ipoib_netlink.c
> > +++ b/drivers/infiniband/ulp/ipoib/ipoib_netlink.c
> > @@ -126,6 +126,18 @@ static int ipoib_new_child_link(struct net *src_net, struct net_device *dev,
> >  	} else
> >  		child_pkey  = nla_get_u16(data[IFLA_IPOIB_PKEY]);
> >  
> > +	err = netif_set_real_num_tx_queues(dev, pdev->real_num_tx_queues);
> > +	if (err) {
> > +		ipoib_warn(ppriv, "failed setting the child tx queue count based on parent\n");
> > +		return err;
> > +	}
> > +
> > +	err = netif_set_real_num_rx_queues(dev, pdev->real_num_rx_queues);
> > +	if (err) {
> > +		ipoib_warn(ppriv, "failed setting the child rx queue count based on parent\n");
> > +		return err;
> > +	}
> 
> This still seems flawed.. Netlink does this:
> 
> 	unsigned int num_rx_queues = 1;
> 
> 	if (tb[IFLA_NUM_RX_QUEUES])
> 		num_rx_queues = nla_get_u32(tb[IFLA_NUM_RX_QUEUES]);
> 	else if (ops->get_num_rx_queues)
> 		num_rx_queues = ops->get_num_rx_queues();
> 
> So num_rx_queues can really be any value that userspaces cares to
> provide.
> 
> If pdev->real_num_rx_queues is > the user provided value then
> netif_set_real_num_rx_queues() just fails.
> 
> So at a minimum this should min the actual number of queues requested
> against the maximum number of queues the driver can provide and use
> that to set the real queues.
> 
> And the return of a really big number from ops->get_num_rx_queues is
> pretty ugly too, ideally that would be fixed to pass in some function
> arguments and obtain the ppriv so it can return the actual maximum
> number of queues and we don't waste a bunch of memory..

.get_num_rx_queues() is declared as void, so it can't have any complex
logic except returns some global define.

Thanks

> 
> Jason



[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux