On Thu, 2023-07-13 at 16:09 -0700, Stanislav Fomichev wrote:
> On 07/13, Alexei Starovoitov wrote:
> > imo all 3 options including this 4th one are too hacky.
> > I understand ld_preload limitations and desire to have it per-cgroup,
> > but messing this much with user space feels a little bit too much.
> > What side effects will it cause?
>
> Maybe all that is really needed is some new per-netns sysctl to automatically
> upgrade from IPPROTO_TCP to IPPROTO_MPTCP? Or is it too broad of a
> brush?

I think it would actually be too broad, see below...

> I've also CC'd netdev for visibility...
>
> > Meaning is this enough to just change the proto?
> > Nothing in user space later on needs to be aware the protocol is so different?
>
> IIUC, if you use IPPROTO_MPTCP, you just get regular TCP until you start
> adding extra routes (via netlink). That's why their current
> unconditional IPPROTO_TCP->IPPROTO_MPTCP rewrite via ld_preload also somewhat
> works.

FTR, it is the other way around: when using IPPROTO_MPTCP you always get
an MPTCP protocol handshake that downgrades gracefully to TCP if the peer
does not support it. Multiple paths can then be added/enabled by
different means, but that is another matter - a quite orthogonal one.

The fallback to TCP is currently not completely free: active (client)
MPTCP sockets that have fallen back to TCP keep some overhead compared
to plain TCP ones.

Being able to control the IPPROTO_TCP->IPPROTO_MPTCP change on a
per-socket basis does offer some advantages, e.g. constraining the change
to the sockets that are likely to successfully complete the MPTCP
handshake.

> > I feel the consequences are too drastic to support such thing
> > through an official/stable hook.
> > We can consider an fmod_ret unstable hook somewhere in the kernel
> > that bpf prog can attach to and tweak the ret value and/or args,
> > but the production environment won't be using it.
> > It will be a temporary gap until user space is properly converted to mptcp.
>
> Asking every app to do s/IPPROTO_TCP/IPPROTO_MPTCP/ might be annoying
> though? (don't have a horse in this race, but have some v4->v6 migration
> vibes from this)

I can only make wild guesses, but I also expect such a "transition" to be
extremely long and/or incomplete.

Cheers,

Paolo
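
For reference, a minimal sketch of what the per-application
s/IPPROTO_TCP/IPPROTO_MPTCP/ change discussed above could look like in
user space. It is an illustration only, not code from this thread: the
helper name open_stream_socket() and its prefer_mptcp parameter are
hypothetical, and the value 262 is assumed from the Linux UAPI headers for
Python builds that do not export socket.IPPROTO_MPTCP. Note that it only
covers failure to create the MPTCP socket; the downgrade to TCP when the
peer lacks MPTCP support happens in the kernel during the handshake and
needs no application code.

```python
import socket

# Python 3.10+ exports socket.IPPROTO_MPTCP on Linux. The fallback value
# 262 is an assumption taken from the Linux UAPI headers.
IPPROTO_MPTCP = getattr(socket, "IPPROTO_MPTCP", 262)


def open_stream_socket(family=socket.AF_INET, prefer_mptcp=True):
    """Return (sock, used_mptcp) for a stream socket.

    Hypothetical helper: tries IPPROTO_MPTCP first when prefer_mptcp is
    set, and falls back to plain IPPROTO_TCP if the socket cannot be
    created.
    """
    if prefer_mptcp:
        try:
            sock = socket.socket(family, socket.SOCK_STREAM, IPPROTO_MPTCP)
            return sock, True
        except OSError:
            # Commonly reported as EPROTONOSUPPORT, ENOPROTOOPT or EINVAL
            # when MPTCP is missing or disabled; any creation failure is
            # treated here as "MPTCP not available".
            pass
    sock = socket.socket(family, socket.SOCK_STREAM, socket.IPPROTO_TCP)
    return sock, False


if __name__ == "__main__":
    s, used_mptcp = open_stream_socket()
    print("MPTCP socket" if used_mptcp else "plain TCP socket")
    s.close()
```

Passing prefer_mptcp=False for connections that are unlikely to complete
the MPTCP handshake is one way to get the per-socket control mentioned
above without paying the fallback overhead.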