Re: [RFC PATCH 0/2] sctp: add new getsockopt option SCTP_SOCKOPT_PEELOFF_KERNEL

On Wed, Jun 17, 2015 at 08:38:10AM -0300, Marcelo Ricardo Leitner wrote:
> On 17-06-2015 07:21, Neil Horman wrote:
> >On Tue, Jun 16, 2015 at 07:42:31PM -0300, Marcelo Ricardo Leitner wrote:
> >>Hi,
> >>
> >>I'm trying to remove a direct dependency of dlm module on sctp one.
> >>Currently dlm code is calling sctp_do_peeloff() directly and only this
> >>call is causing the load of sctp module together with dlm. For that, we
> >>have basically 3 options:
> >>- Doing a module split on dlm
> >>   - which I'm avoiding because it was already split and was merged (more
> >>     info on patch2 changelog)
> >>   - and the sctp code on it is rather small if compared with sctp module
> >>     itself
> >>- Using some other infra that gets indirectly activated, like getsockopt()
> >>   - It was like this before, but the exposed sockopt created a file
> >>     descriptor for the new socket and that created some serious issues.
> >>     More info on 2f2d76cc3e93 ("dlm: Do not allocate a fd for peeloff")
> >>- Doing something like ipv6_stub (which is used by vxlan) or similar
> >>   - but I don't feel that's a good way out here, it doesn't feel right.
> >>
> >>So I'm approaching this by going with 2nd option again but this time
> >>also creating a new sockopt that is only accessible for kernel users of
> >>this protocol, so that we are safe to directly return a struct socket *
> >>via getsockopt() results. This is the tricky part of this series.
> >>
> >>It smells hacky yes but currently most of sctp calls are wrapped behind
> >>kernel_*(). Even if we set a flag (like netlink does) saying that this
> >>is a kernel socket, we still have the issue of getting the function call
> >>through and returning such non-usual return value.
> >>
> >>I kept __user marker on sctp_getsockopt_peeloff_kernel() prototype and
> >>its helpers just to avoid issues with static checkers.
> >>
> >>Kernel path not really tested yet. Mainly I'd like to know what you
> >>think: is this feasible? A getsockopt option only reachable by the
> >>kernel itself? I couldn't find any other like this.
> >>
> >>Thanks,
> >>Marcelo
> >>
> >>Marcelo Ricardo Leitner (2):
> >>   sctp: add new getsockopt option SCTP_SOCKOPT_PEELOFF_KERNEL
> >>   dlm: avoid using sctp_do_peeloff directly
> >>
> >>  fs/dlm/lowcomms.c         | 17 ++++++++---------
> >>  include/uapi/linux/sctp.h | 12 ++++++++++++
> >>  net/sctp/socket.c         | 39 +++++++++++++++++++++++++++++++++++++++
> >>  3 files changed, 59 insertions(+), 9 deletions(-)
> >>
> >>--
> >>2.4.1
> >>
> >>
> >
> >Why not just use the existing PEELOFF socket option with the kernel_getsockopt
> >interface, and sockfd_lookup to translate the returned value back to a socket
> >struct?  That seems less redundant and less hack-ish to me.
> 
> It was like that before commit 2f2d76cc3e93 ("dlm: Do not allocate a fd for
> peeloff"), but it caused serious issues due to the fd allocation, so that's
> what I'm trying to avoid now.
> 
> References:
> http://article.gmane.org/gmane.linux.network.drbd/22529
> https://bugzilla.redhat.com/show_bug.cgi?id=1075629 (this one is closed,
> sorry)
> 
>   Marcelo
> 
Ah, I see.  You're using the new socket option as a differentiator to just skip
the creation of an FD.

I get your reasoning, but I'm still not in love with the idea of duplicating
code paths to avoid that action.  Can we use some data inside the socket
structure to do this differentiation?  Specifically, I'm thinking of
sock->file.  IIRC that will be non-NULL for any socket created from user space,
but will always be NULL for dlm-created sockets (since we use sock_create
directly to create them).  If that is a sufficient differentiator, then we can
allocate the new fd for the peeled-off socket if and only if the parent's
sock->file pointer is non-NULL.
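
To make the idea concrete, here is a rough userspace sketch of the check. It
is not kernel code: struct file and struct socket below are simplified
stand-ins, and mock_peeloff(), mock_release(), struct peeloff_result and the
fd counter are hypothetical names invented for illustration. The only thing
it models is the decision "allocate an fd only when the parent has a file".

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-ins for the kernel types; NOT the real definitions. */
struct file { int fd; };
struct socket { struct file *file; };

/* Hypothetical result type: the peeled-off socket plus an optional fd. */
struct peeloff_result {
	struct socket *sock;
	int fd;			/* -1 when no descriptor was allocated */
};

static int next_fd = 3;		/* toy fd allocator, illustration only */

/*
 * Hypothetical helper modelling the proposal: always create the new
 * socket, but attach a file (and hand out an fd) only if the parent
 * socket itself has one, i.e. it was created from user space.
 */
static int mock_peeloff(const struct socket *parent,
			struct peeloff_result *res)
{
	struct socket *newsk;

	assert(parent != NULL && res != NULL);

	newsk = calloc(1, sizeof(*newsk));
	if (!newsk)
		return -1;

	res->sock = newsk;
	res->fd = -1;

	if (parent->file) {
		newsk->file = malloc(sizeof(*newsk->file));
		if (!newsk->file) {
			free(newsk);
			res->sock = NULL;
			return -1;
		}
		newsk->file->fd = next_fd++;
		res->fd = newsk->file->fd;
	}
	return 0;
}

/* Free whatever mock_peeloff() allocated and reset the result. */
static void mock_release(struct peeloff_result *res)
{
	if (res->sock) {
		free(res->sock->file);
		free(res->sock);
	}
	res->sock = NULL;
	res->fd = -1;
}
```

With a parent whose file pointer is set, the result carries a valid fd; with
a parent whose file pointer is NULL (the dlm case as I understand it), the
caller gets only the socket and fd stays at -1.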

Thoughts?
Neil
