> Begin forwarded message: > > From: Chuck Lever <chuck.lever@xxxxxxxxxx> > Subject: Re: [PATCH net v2] net, sunrpc: Remap EPERM in case of connection failure in xs_tcp_setup_socket > Date: July 1, 2024 at 10:04:11 AM EDT > To: Simon Horman <horms@xxxxxxxxxx> > Cc: Daniel Borkmann <daniel@xxxxxxxxxxxxx>, Jakub Kicinski <kuba@xxxxxxxxxx>, Neil Brown <neilb@xxxxxxx>, Jeff Layton <jlayton@xxxxxxxxxx>, Olga Kornievskaia <kolga@xxxxxxxxxx>, Dai.Ngo@xxxxxxxxxx, Tom Talpey <tom@xxxxxxxxxx>, netdev <netdev@xxxxxxxxxxxxxxx>, bpf@xxxxxxxxxxxxxxx, Lex Siegel <usiegl00@xxxxxxxxx> > > > >> On Jul 1, 2024, at 6:57 AM, Simon Horman <horms@xxxxxxxxxx> wrote: >> >> On Fri, Jun 28, 2024 at 01:19:49PM -0400, Chuck Lever wrote: >>> On Fri, Jun 28, 2024 at 06:31:23PM +0200, Daniel Borkmann wrote: >>>> When using a BPF program on kernel_connect(), the call can return -EPERM. This >>>> causes xs_tcp_setup_socket() to loop forever, filling up the syslog and causing >>>> the kernel to potentially freeze up. >>>> >>>> Neil suggested: >>>> >>>> This will propagate -EPERM up into other layers which might not be ready >>>> to handle it. It might be safer to map EPERM to an error we would be more >>>> likely to expect from the network system - such as ECONNREFUSED or ENETDOWN. >>>> >>>> ECONNREFUSED as error seems reasonable. For programs setting a different error >>>> can be out of reach (see handling in 4fbac77d2d09) in particular on kernels >>>> which do not have f10d05966196 ("bpf: Make BPF_PROG_RUN_ARRAY return -err >>>> instead of allow boolean"), thus given that it is better to simply remap for >>>> consistent behavior. UDP does handle EPERM in xs_udp_send_request(). >>>> >>>> Fixes: d74bad4e74ee ("bpf: Hooks for sys_connect") >>>> Fixes: 4fbac77d2d09 ("bpf: Hooks for sys_bind") >>>> Co-developed-by: Lex Siegel <usiegl00@xxxxxxxxx> >>>> Signed-off-by: Lex Siegel <usiegl00@xxxxxxxxx> >>>> Signed-off-by: Daniel Borkmann <daniel@xxxxxxxxxxxxx> >>>> Link: https://github.com/cilium/cilium/issues/33395 >>>> Link: https://lore.kernel.org/bpf/171374175513.12877.8993642908082014881@xxxxxxxxxxxxxxxxxxxxx >>>> --- >>>> [ Fixes tags are set to the orig connect commit so that stable team >>>> can pick this up. ] >>>> >>>> v1 -> v2: >>>> - Plain resend, adding more sunrpc folks to Cc >>>> >>>> net/sunrpc/xprtsock.c | 7 +++++++ >>>> 1 file changed, 7 insertions(+) >>>> >>>> diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c >>>> index dfc353eea8ed..0e1691316f42 100644 >>>> --- a/net/sunrpc/xprtsock.c >>>> +++ b/net/sunrpc/xprtsock.c >>>> @@ -2441,6 +2441,13 @@ static void xs_tcp_setup_socket(struct work_struct *work) >>>> transport->srcport = 0; >>>> status = -EAGAIN; >>>> break; >>>> + case -EPERM: >>>> + /* Happens, for instance, if a BPF program is preventing >>>> + * the connect. Remap the error so upper layers can better >>>> + * deal with it. >>>> + */ >>>> + status = -ECONNREFUSED; >>>> + fallthrough; >>>> case -EINVAL: >>>> /* Happens, for instance, if the user specified a link >>>> * local IPv6 address without a scope-id. >>>> -- >>>> 2.21.0 >>>> >>> >>> Hi Daniel - >>> >>> I know this is not documented in MAINTAINERS, but changes to >>> net/sunrpc/xprtsock.c go to Anna Schumaker and Trond Myklebust, >>> cc: linux-nfs@vger. >> >> Would it be possible to update MAINTAINERS accordingly? > > I can do it, of course, but I'd like to discuss this > with the NFS client maintainers to ensure they agree > on how the files are divided between the trees. Received a suggestion to put more detail in MAINTAINERS about how net/sunrpc/ is divided between the two NFS trees. Currently we have: NFS, SUNRPC, AND LOCKD CLIENTS T: git git://git.linux-nfs.org/projects/trondmy/linux-nfs.git F: include/linux/sunrpc/ F: include/uapi/linux/sunrpc/ F: net/sunrpc/ KERNEL NFSD, SUNRPC, AND LOCKD SERVERS T: git git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux.git F: include/linux/sunrpc/ F: include/uapi/linux/sunrpc/ F: net/sunrpc/ Which is considered unhelpful for drive-by contributions. At least for NFSD, we could make it: F: include/linux/sunrpc/svc* F: net/sunrpc/svc* F: net/sunrpc/xprtrdma/svc* Any other thoughts or suggestions? -- Chuck Lever