On 12/18/18 3:49 PM, Trond Myklebust wrote: > On Tue, 2018-12-18 at 09:45 +0300, Vasily Averin wrote: >> On 12/18/18 12:50 AM, J. Bruce Fields wrote: >>> On Mon, Dec 17, 2018 at 07:23:54PM +0300, Vasily Averin wrote: >>>> if node have NFSv41+ mounts inside several net namespaces >>>> it can lead to use-after-free in svc_process_common() >>>> >>>> svc_process_common() >>>> /* Setup reply header */ >>>> rqstp->rq_xprt->xpt_ops->xpo_prep_reply_hdr(rqstp); <<< >>>> HERE >>>> >>>> svc_process_common() can use already freed rqstp->rq_xprt, >>>> it was assigned in bc_svc_process() where it was taken from serv- >>>>> sv_bc_xprt. >>>> >>>> serv is global structure but sv_bc_xprt is assigned per- >>>> netnamespace, >>>> so if nfsv41+ shares are mounted in several containers together >>>> bc_svc_process() can use wrong backchannel or even access freed >>>> memory. >>>> >>>> To find correct svc_xprt of client-related backchannel >>>> bc_svc_process() now calls new .bc_get_xprt callback >>>> that executes svc_find_xprt() with proper xprt name. >>> >>> This stuff is confusing and I need to stare at it some more before >>> I >>> understand, but it's weird that we'd need to search for the right >>> xprt. >> >> All NFS clients in all net namespaces used the same minorversion >> shares common nfs_callback_data taken from global nfs_callback_info >> array. >> >> Moreover these clients can use either rdma or nfs transport, >> however only one of them can be used in one net namespace. >> >> Each net namespace must have own backchannel, >> it cannot depend on other net namespaces, >> because at least they can use different transports. >> >> So one svc_serv should be able to handle several (per-netns) >> backchannels. >> >> Frankly speaking If you prefer I can easily convert global >> nfs_callback_info to per net-namespace. >> I've checked, it works too. However current solution looks better for >> me. >> >>> We know which connection the backchannel request came over, and >>> there >>> should only be one backchannel using that connection, why can't we >>> find >>> it by just chasing pointers the right way? >> >> it is allocated by using follwing calltrace: >> nfs_callback_up >> nfs_callback_up_net >> xprt->ops->bc_up(serv, net) -> xs_tcp_bc_up >> svc_create_xprt(serv, "tcp-bc") >> __svc_xpo_create >> svc_bc_tcp_create >> svc_bc_create_socket >> >> Here backchannel's svc_sock/svc/xprt is created. >> It is per-netns and therefore it cannot be saved as pointer on global >> svc_serv. >> >> It could be saved on some xprt related to forechannel, >> I've expected it was done already -- but it was not done. >> I've tried to find any way to do it -- but without success, >> according structures seems are not accessible in svc_bc_tcp_create. >> >> Finally I've found that backchannel's xprt is added into serv- >>> sv_permsocks >> and svc_find_xprt can find it by name. >> >> It would be great if you can advise some more simple way. >> >>> OK, I do need to look at it more. >> >> It is quite important for containers so I think this patch (or any >> alternative solution) >> should be pushed in stable@. >> > > The whole "let's set up rqstp->rq_xprt for the back channel" is nothing > but a giant hack in order to work around the fact that > svc_process_common() uses it to find the xpt_ops, and perform a couple > of (meaningless for the back channel) tests of xpt_flags. > > What say we just pass in the xpt_ops as a parameter to > svc_process_common(), and make those xpt_flags tests check for whether > or not rqstp->rq_xprt is actually non-NULL? To access proper xpt_flags inside svc_process_common() we need to pass svc_xprt instead of xpt_ops. Do you mean something like following? --- a/net/sunrpc/svc.c +++ b/net/sunrpc/svc.c @@ -1148,7 +1148,7 @@ static __printf(2,3) void svc_printk(struct svc_rqst *rqstp, const char *fmt, .. * Common routine for processing the RPC request. */ static int -svc_process_common(struct svc_rqst *rqstp, struct kvec *argv, struct kvec *resv) +svc_process_common(struct svc_rqst *rqstp, struct kvec *argv, struct kvec *resv, struct svc_xprt *s_xprt) { struct svc_program *progp; const struct svc_version *versp = NULL; /* compiler food */ @@ -1172,7 +1172,7 @@ svc_process_common(struct svc_rqst *rqstp, struct kvec *argv, struct kvec *resv) clear_bit(RQ_DROPME, &rqstp->rq_flags); /* Setup reply header */ - rqstp->rq_xprt->xpt_ops->xpo_prep_reply_hdr(rqstp); + s_xprt->xpt_ops->xpo_prep_reply_hdr(rqstp); svc_putu32(resv, rqstp->rq_xid); @@ -1245,7 +1245,7 @@ svc_process_common(struct svc_rqst *rqstp, struct kvec *argv, struct kvec *resv) * fit. */ if (versp->vs_need_cong_ctrl && - !test_bit(XPT_CONG_CTRL, &rqstp->rq_xprt->xpt_flags)) + !test_bit(XPT_CONG_CTRL, &s_xprt->xpt_flags)) @@ -1336,8 +1336,8 @@ svc_process_common(struct svc_rqst *rqstp, struct kvec *argv, struct kvec *resv) return 0; close: - if (test_bit(XPT_TEMP, &rqstp->rq_xprt->xpt_flags)) - svc_close_xprt(rqstp->rq_xprt); + if (test_bit(XPT_TEMP, &s_xprt->xpt_flags)) + svc_close_xprt(s_xprt); dprintk("svc: svc_process close\n"); return 0; > It probably also requires us to store a pointer to struct net in the > struct svc_rqst so that nfs4_callback_compound() and > svcauth_gss_accept() can find it, but that should be OK since the > transport already has that referenced. > > Cheers, > Trond >