RE: [PATCH bpf-next 1/2] sk_msg: Keep reference on socket file while wait_memory

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



liujian (CE) wrote:
> 
> 
> > -----Original Message-----
> > From: John Fastabend [mailto:john.fastabend@xxxxxxxxx]
> > Sent: Wednesday, August 17, 2022 8:55 AM
> > To: liujian (CE) <liujian56@xxxxxxxxxx>; john.fastabend@xxxxxxxxx;
> > jakub@xxxxxxxxxxxxxx; edumazet@xxxxxxxxxx; davem@xxxxxxxxxxxxx;
> > yoshfuji@xxxxxxxxxxxxxx; dsahern@xxxxxxxxxx; kuba@xxxxxxxxxx;
> > pabeni@xxxxxxxxxx; andrii@xxxxxxxxxx; mykolal@xxxxxx; ast@xxxxxxxxxx;
> > daniel@xxxxxxxxxxxxx; martin.lau@xxxxxxxxx; song@xxxxxxxxxx; yhs@xxxxxx;
> > kpsingh@xxxxxxxxxx; sdf@xxxxxxxxxx; haoluo@xxxxxxxxxx;
> > jolsa@xxxxxxxxxx; shuah@xxxxxxxxxx; bpf@xxxxxxxxxxxxxxx
> > Cc: liujian (CE) <liujian56@xxxxxxxxxx>
> > Subject: RE: [PATCH bpf-next 1/2] sk_msg: Keep reference on socket file
> > while wait_memory
> > 
> > Liu Jian wrote:
> > > Fix the below NULL pointer dereference:
> > >
> > > [   14.471200] Call Trace:
> > > [   14.471562]  <TASK>
> > > [   14.471882]  lock_acquire+0x245/0x2e0
> > > [   14.472416]  ? remove_wait_queue+0x12/0x50
> > > [   14.473014]  ? _raw_spin_lock_irqsave+0x17/0x50
> > > [   14.473681]  _raw_spin_lock_irqsave+0x3d/0x50
> > > [   14.474318]  ? remove_wait_queue+0x12/0x50
> > > [   14.474907]  remove_wait_queue+0x12/0x50
> > > [   14.475480]  sk_stream_wait_memory+0x20d/0x340
> > > [   14.476127]  ? do_wait_intr_irq+0x80/0x80
> > > [   14.476704]  do_tcp_sendpages+0x287/0x600
> > > [   14.477283]  tcp_bpf_push+0xab/0x260
> > > [   14.477817]  tcp_bpf_sendmsg_redir+0x297/0x500
> > > [   14.478461]  ? __local_bh_enable_ip+0x77/0xe0
> > > [   14.479096]  tcp_bpf_send_verdict+0x105/0x470
> > > [   14.479729]  tcp_bpf_sendmsg+0x318/0x4f0
> > > [   14.480311]  sock_sendmsg+0x2d/0x40
> > > [   14.480822]  ____sys_sendmsg+0x1b4/0x1c0
> > > [   14.481390]  ? copy_msghdr_from_user+0x62/0x80
> > > [   14.482048]  ___sys_sendmsg+0x78/0xb0
> > > [   14.482580]  ? vmf_insert_pfn_prot+0x91/0x150
> > > [   14.483215]  ? __do_fault+0x2a/0x1a0
> > > [   14.483738]  ? do_fault+0x15e/0x5d0
> > > [   14.484246]  ? __handle_mm_fault+0x56b/0x1040
> > > [   14.484874]  ? lock_is_held_type+0xdf/0x130
> > > [   14.485474]  ? find_held_lock+0x2d/0x90
> > > [   14.486046]  ? __sys_sendmsg+0x41/0x70
> > > [   14.486587]  __sys_sendmsg+0x41/0x70
> > > [   14.487105]  ? intel_pmu_drain_pebs_core+0x350/0x350
> > > [   14.487822]  do_syscall_64+0x34/0x80
> > > [   14.488345]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
> > >
> > > The test scene as following flow:
> > > thread1                               thread2
> > > -----------                           ---------------
> > >  tcp_bpf_sendmsg
> > >   tcp_bpf_send_verdict
> > >    tcp_bpf_sendmsg_redir              sock_close
> > >     tcp_bpf_push_locked                 __sock_release
> > >      tcp_bpf_push                         //inet_release
> > >       do_tcp_sendpages                    sock->ops->release
> > >        sk_stream_wait_memory          	   // tcp_close
> > >           sk_wait_event                      sk->sk_prot->close
> > >            release_sock(__sk);
> > >             ***
> > >
> > >                                                 lock_sock(sk);
> > >                                                   __tcp_close
> > >                                                     sock_orphan(sk)
> > >                                                       sk->sk_wq  = NULL
> > >                                                 release_sock
> > >             ****
> > >            lock_sock(__sk);
> > >           remove_wait_queue(sk_sleep(sk), &wait);
> > >              sk_sleep(sk)
> > >              //NULL pointer dereference
> > >              &rcu_dereference_raw(sk->sk_wq)->wait
> > >
> > > While waiting for memory in thread1, the socket is released with its
> > > wait queue because thread2 has closed it. This caused by
> > > tcp_bpf_send_verdict didn't increase the f_count of psock->sk_redir-
> > >sk_socket->file in thread1.
> > >
> > > Avoid it by keeping a reference to the socket file while redirect sock
> > > wait send memory. Refer to [1].
> > >
> > > [1]
> > > https://lore.kernel.org/netdev/20190211090949.18560-1-jakub@cloudflare
> > > .com/
> > >
> > > Signed-off-by: Liu Jian <liujian56@xxxxxxxxxx>
> > 
> > Thanks for the detailed commit message its necessary to understand the
> > problem without spending hours deciphering it myself.
> > 
> > When I looked at [1] we solved a simliar problem by using the
> > MSG_DONTWAIT flag so that the error was pushed back to the sending.
> > 
> > Can we do the same thing here? The nice bit here is the error would get all
> > the way back to the sending socket so userspace could decide how to handle
> > it? Did I miss something?
> > 
> [1] works in sk_psock_backlog function, it can not be detected by the userspace app.
> But here, the problem is that app wants this to be a blocked system call.
> If the MSG_DONTWAIT flag is forcibly added, this changes the function behavior.
> 

Ah right. We could push it to the sk_psock_backlog as another option similar
to what sk_psock_verdict_apply() does with sk_psock_skb_redirect(). The
problem would be we don't have an skb here and then instead of the
stream wait logic we would be using the backlog logic which might create
some subtle change. Seems a bit intrusive to me.

I don't have any better ideas off-hand even though reaching into the file
like above in the patch is not ideal.

Maybe Jakub has some thoughts?

Thanks!
John



[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux