Hi, > From: Serge E. Hallyn <serge@xxxxxxxxxx> > Sent: Monday, March 10, 2025 7:01 PM > > On Sat, Mar 08, 2025 at 08:06:02PM +0200, Parav Pandit wrote: > > A process running in a non-init user namespace possesses the > > CAP_NET_RAW capability. However, the patch cited in the fixes tag > > checks the capability in the default init user namespace. > > Because of this, when the process was started by Podman in a > > non-default user namespace, the flow creation failed. > > > > Fix this issue by checking the CAP_NET_RAW networking capability in > > the owner user namespace that created the network namespace. > > Hi, > > you say > > > Fix this issue by checking the CAP_NET_RAW networking capability > in the > owner user namespace that created the network namespace. > > But in fact you are checking the CAP_NET_RAW against the user's network > namespace. I didn't understand your comment. The fix takes the current process's network namespace by referring to current->nsproxy->net_ns. Each net ns has its owning user namespace who has created it. So the patch is checking caps in the such user namespace. Can you please elaborate? > That is usually not the same thing, although it is possible that in > this case it is. > > What is cmd.flow_id? Is that guaranteed to represent something in the > current process' network namespace? The flow_id is namespaced in the rdma device. A network namespace can have multiple devices. So it cannot be unique because an rdma device in different namespace may have same flow_id. > Or is it possible that a user without > privilege in his user namespace could unshare userns+netns but then cause > this fn to be called against a flow in the original network namespace? > It can do unshare user_ns+netns, but I fail to see it can refer to the original (previous net_ns) in this call, because this call is always refers to current->nsproxy->net_us, which means it operates on the latest net ns (after unshare). > > > > This change is similar to the following cited patches. > > > > commit 5e1fccc0bfac ("net: Allow userns root control of the core of > > the network stack.") commit 52e804c6dfaa ("net: Allow userns root to > > control ipv4") commit 59cd7377660a ("net: openvswitch: allow conntrack > > in non-initial user namespace") commit 0a3deb11858a ("fs: Allow > > listmount() in foreign mount namespace") commit dd7cb142f467 ("fs: > > relax permissions for listmount()") > > > > Fixes: c938a616aadb ("IB/core: Add raw packet QP type") > > Signed-off-by: Parav Pandit <parav@xxxxxxxxxx> > > > > --- > > I would like to have feedback from the LSM experts to make sure this > > fix is correct. Given the widespread usage of the capable() call, it > > makes me wonder if the patch right. > > > > Secondly, I wasn't able to determine which primary namespace (such as > > mount or IPC, etc.) to consider for the CAP_IPC_LOCK capability. > > (not directly related to this patch, but as concept) > > --- > > drivers/infiniband/core/uverbs_cmd.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/drivers/infiniband/core/uverbs_cmd.c > > b/drivers/infiniband/core/uverbs_cmd.c > > index 5ad14c39d48c..8d6615f390f5 100644 > > --- a/drivers/infiniband/core/uverbs_cmd.c > > +++ b/drivers/infiniband/core/uverbs_cmd.c > > @@ -3198,7 +3198,7 @@ static int ib_uverbs_ex_create_flow(struct > uverbs_attr_bundle *attrs) > > if (cmd.comp_mask) > > return -EINVAL; > > > > - if (!capable(CAP_NET_RAW)) > > + if (!ns_capable(current->nsproxy->net_ns->user_ns, CAP_NET_RAW)) > > return -EPERM; > > > > if (cmd.flow_attr.flags >= IB_FLOW_ATTR_FLAGS_RESERVED) > > -- > > 2.26.2 > >