Re: [PATCH 1/4] RDMA/rxe: Avoid double-free panic on transmit error

On 10/30/18, 7:03 PM, "Yanjun Zhu" <yanjun.zhu@xxxxxxxxxx> wrote:
    Hi,
    
    Are you using the latest Linux kernel?
    
    I cannot apply your patch to the latest Linux kernel.
    
    It also seems that the kfree_skb problem is already fixed in the latest Linux kernel.
    
    Zhu Yanjun

Hello Zhu,
It is based on rdma/for-next, but I can see that my branch is a bit behind. I will rebase and send an updated version that addresses the comments so far.

Thanks!
Andrew
    
    On 2018/10/30 21:54, Andrew Boyer wrote:
    > A previous commit removed the skb_clone() from rxe_send(). Instead, the
    > original skb is passed to ip[6]_local_out(). Thus, the network stack owns
    > the original skb, and the error handling code after rxe_xmit_packet() must
    > not try to free it again.
    >
    > [  306.296924] kernel BUG at ../source/mm/slub.c:295!
    > [  306.298438] invalid opcode: 0000 [#1] SMP NOPTI
    > [  306.351552] task: ffff94e7ee271e40 task.stack: ffffa43382db4000
    > [  306.353363] RIP: 0010:kfree+0x173/0x180
    > [  306.354516] RSP: 0018:ffffa43382db7b60 EFLAGS: 00010246
    > [  306.356063] RAX: ffff94e6ab96ee00 RBX: ffff94e6ab96ee00 RCX: ffff94e6ab96ee00
    > [  306.358193] RDX: 0000000000007183 RSI: ffff94e7ffd24be0 RDI: ffff94e7f7003080
    > [  306.360327] RBP: ffff94e7ef963800 R08: ffffffffaac14e40 R09: ffffffffaa4c87de
    > [  306.362473] R10: ffffe13b84ae5b80 R11: ffff94e7ef963800 R12: ffffffffaa4c87de
    > [  306.364725] R13: 0000000000000000 R14: ffff94e7a4236590 R15: ffff94e7f0dc0000
    > [  306.366845] FS:  00007ff34f3e8700(0000) GS:ffff94e7ffd00000(0000) knlGS:0000000000000000
    > [  306.369297] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    > [  306.371056] CR2: 00007ff347837080 CR3: 0000000146490002 CR4: 00000000001606e0
    > [  306.373317] Call Trace:
    > [  306.374222]  ? rxe_requester+0xea7/0x16e0 [rdma_rxe]
    > [  306.375750]  kfree_skb+0x2e/0x90
    > [  306.376725]  rxe_requester+0xea7/0x16e0 [rdma_rxe]
    > [  306.378158]  rxe_do_task+0x85/0xf0 [rdma_rxe]
    > [  306.379521]  rxe_queue_resize+0x8d1/0x1d30 [rdma_rxe]
    > [  306.381041]  ? ib_copy_path_rec_to_user+0x54b/0x7d0 [ib_uverbs]
    > [  306.382781]  ib_uverbs_post_send+0x5ac/0x680 [ib_uverbs]
    > [  306.384438]  ? dequeue_entity+0x539/0xad0
    > [  306.385618]  0xffffffffc09593b2
    > [  306.386548]  ? __schedule+0x2eb/0x890
    > [  306.387669]  ? hrtimer_start_range_ns+0x19e/0x3a0
    > [  306.389100]  __vfs_write+0x33/0x170
    > [  306.390274]  ? __inode_security_revalidate+0x4a/0x70
    > [  306.391739]  ? selinux_file_permission+0xdd/0x130
    > [  306.393145]  ? security_file_permission+0x36/0xb0
    > [  306.394561]  vfs_write+0xb3/0x1a0
    > [  306.395950]  SyS_write+0x52/0xc0
    > [  306.397180]  do_syscall_64+0x66/0x1d0
    > [  306.398615]  entry_SYSCALL_64_after_hwframe+0x21/0x86
    > [  306.400245] RIP: 0033:0x7ff34f12a79d
    > [  306.401343] RSP: 002b:00007ff337e406f0 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
    > [  306.403938] RAX: ffffffffffffffda RBX: 0000000032de2970 RCX: 00007ff34f12a79d
    > [  306.406014] RDX: 0000000000000020 RSI: 00007ff337e40730 RDI: 000000000000003b
    > [  306.408134] RBP: 0000000000000000 R08: 0000000000000508 R09: 00007ff34ad1cee0
    > [  306.410278] R10: 0000000000000006 R11: 0000000000000293 R12: 0000000000000001
    > [  306.412482] R13: 00007ff316d28280 R14: 000000000000004c R15: 0000000000000000
    > [  306.414671] Code: 80 74 04 41 8b 72 6c 5b 5d 41 5c 4c 89 d7 e9 f5 b0 f9 ff 48 89 d9 48 89 da 41 b8 01 00 00 00 5b 5d 41 5c 4c 89 d6 e9 0d f6 ff ff <0f> 0b 90 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 41 57 41
    > [  306.422327] RIP: kfree+0x173/0x180 RSP: ffffa43382db7b60
    >
    > Fixes: 5793b4652155 ("IB/rxe: remove unnecessary skb_clone in xmit")
    >
    > Signed-off-by: Andrew Boyer <andrew.boyer@xxxxxxxx>
    > ---
    >   drivers/infiniband/sw/rxe/rxe_loc.h  | 1 +
    >   drivers/infiniband/sw/rxe/rxe_req.c  | 3 +--
    >   drivers/infiniband/sw/rxe/rxe_resp.c | 7 +------
    >   3 files changed, 3 insertions(+), 8 deletions(-)
    >
    > diff --git a/drivers/infiniband/sw/rxe/rxe_loc.h b/drivers/infiniband/sw/rxe/rxe_loc.h
    > index b71023c1c58b..f139ea2e3c1b 100644
    > --- a/drivers/infiniband/sw/rxe/rxe_loc.h
    > +++ b/drivers/infiniband/sw/rxe/rxe_loc.h
    > @@ -254,6 +254,7 @@ static inline unsigned int wr_opcode_mask(int opcode, struct rxe_qp *qp)
    >   	return rxe_wr_opcode_info[opcode].mask[qp->ibqp.qp_type];
    >   }
    >   
    > +/* The caller must not touch the skb after calling this function */
    >   static inline int rxe_xmit_packet(struct rxe_dev *rxe, struct rxe_qp *qp,
    >   				  struct rxe_pkt_info *pkt, struct sk_buff *skb)
    >   {
    > diff --git a/drivers/infiniband/sw/rxe/rxe_req.c b/drivers/infiniband/sw/rxe/rxe_req.c
    > index 7bdaf71b8221..78210c1d15d8 100644
    > --- a/drivers/infiniband/sw/rxe/rxe_req.c
    > +++ b/drivers/infiniband/sw/rxe/rxe_req.c
    > @@ -709,6 +709,7 @@ int rxe_requester(void *arg)
    >   
    >   	if (fill_packet(qp, wqe, &pkt, skb, payload)) {
    >   		pr_debug("qp#%d Error during fill packet\n", qp_num(qp));
    > +		kfree_skb(skb);
    >   		goto err;
    >   	}
    >   
    > @@ -728,7 +729,6 @@ int rxe_requester(void *arg)
    >   		rollback_state(wqe, qp, &rollback_wqe, rollback_psn);
    >   
    >   		if (ret == -EAGAIN) {
    > -			kfree_skb(skb);
    >   			rxe_run_task(&qp->req.task, 1);
    >   			goto exit;
    >   		}
    > @@ -741,7 +741,6 @@ int rxe_requester(void *arg)
    >   	goto next_wqe;
    >   
    >   err:
    > -	kfree_skb(skb);
    >   	wqe->status = IB_WC_LOC_PROT_ERR;
    >   	wqe->state = wqe_state_error;
    >   	__rxe_do_task(&qp->comp.task);
    > diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c b/drivers/infiniband/sw/rxe/rxe_resp.c
    > index a65c9969f7fc..871bd6d8a11c 100644
    > --- a/drivers/infiniband/sw/rxe/rxe_resp.c
    > +++ b/drivers/infiniband/sw/rxe/rxe_resp.c
    > @@ -742,7 +742,6 @@ static enum resp_states read_reply(struct rxe_qp *qp,
    >   	err = rxe_xmit_packet(rxe, qp, &ack_pkt, skb);
    >   	if (err) {
    >   		pr_err("Failed sending RDMA reply.\n");
    > -		kfree_skb(skb);
    >   		return RESPST_ERR_RNR;
    >   	}
    >   
    > @@ -954,11 +953,8 @@ static int send_ack(struct rxe_qp *qp, struct rxe_pkt_info *pkt,
    >   	}
    >   
    >   	err = rxe_xmit_packet(rxe, qp, &ack_pkt, skb);
    > -	if (err) {
    > +	if (err)
    >   		pr_err_ratelimited("Failed sending ack\n");
    > -		kfree_skb(skb);
    > -	}
    > -
    >   err1:
    >   	return err;
    >   }
    > @@ -1141,7 +1137,6 @@ static enum resp_states duplicate_request(struct rxe_qp *qp,
    >   			if (rc) {
    >   				pr_err("Failed resending result. This flow is not handled - skb ignored\n");
    >   				rxe_drop_ref(qp);
    > -				kfree_skb(skb_copy);
    >   				rc = RESPST_CLEANUP;
    >   				goto out;
    >   			}
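
For readers following the ownership argument in the patch description, here is a minimal userspace sketch of the rule "the callee owns the buffer once it has been handed off". It is not kernel code: struct buf, buf_alloc(), buf_free(), xmit_consume() and send_one() are hypothetical stand-ins for struct sk_buff, the skb allocator, kfree_skb(), rxe_xmit_packet() and the caller, and the counter exists only to make the number of frees observable. It assumes, as the patch description states, that the transmit path consumes the buffer on success and on failure alike.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct sk_buff; not the kernel type. */
struct buf {
	int *free_count;	/* incremented every time the buffer is released */
};

static struct buf *buf_alloc(int *free_count)
{
	struct buf *b = malloc(sizeof(*b));

	if (b)
		b->free_count = free_count;
	return b;
}

/* Stand-in for kfree_skb(): records the release, then frees the memory. */
static void buf_free(struct buf *b)
{
	if (!b)
		return;
	(*b->free_count)++;
	free(b);
}

/*
 * Stand-in for a transmit function with the contract added by the patch:
 * the caller must not touch the buffer after this call, whether or not
 * an error is returned.
 */
static int xmit_consume(struct buf *b, int fail)
{
	assert(b != NULL);
	buf_free(b);		/* the callee releases the buffer on every path */
	return fail ? -EAGAIN : 0;
}

/*
 * Caller: frees the buffer only when the failure happens before the
 * hand-off (the fill step), never after xmit_consume() has been called.
 */
int send_one(int *free_count, int fill_fails, int xmit_fails)
{
	struct buf *b = buf_alloc(free_count);

	if (!b)
		return -ENOMEM;

	if (fill_fails) {
		buf_free(b);	/* still owned by the caller at this point */
		return -EINVAL;
	}

	/* Ownership passes to the callee here; no free on the error path. */
	return xmit_consume(b, xmit_fails);
}
```

In all three cases (success, transmit failure, fill failure) the buffer is released exactly once. Adding a buf_free() after a failed xmit_consume() would release it twice, which is the pattern the patch removes from the error paths.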
    
    




