Hi D.Wythe, On 07/06/2024 07:09, D. Wythe wrote: > > On 6/7/24 5:22 AM, Mat Martineau wrote: >> On Wed, 5 Jun 2024, D. Wythe wrote: >> >>> From: "D. Wythe" <alibuda@xxxxxxxxxxxxxxxxx> >>> >>> This patch allows to create smc socket via AF_INET, >>> similar to the following code, >>> >>> /* create v4 smc sock */ >>> v4 = socket(AF_INET, SOCK_STREAM, IPPROTO_SMC); >>> >>> /* create v6 smc sock */ >>> v6 = socket(AF_INET6, SOCK_STREAM, IPPROTO_SMC); >>> >>> There are several reasons why we believe it is appropriate here: >>> >>> 1. For smc sockets, it actually use IPv4 (AF-INET) or IPv6 (AF-INET6) >>> address. There is no AF_SMC address at all. >>> >>> 2. Create smc socket in the AF_INET(6) path, which allows us to reuse >>> the infrastructure of AF_INET(6) path, such as common ebpf hooks. >>> Otherwise, smc have to implement it again in AF_SMC path. >>> >>> Signed-off-by: D. Wythe <alibuda@xxxxxxxxxxxxxxxxx> >>> Tested-by: Niklas Schnelle <schnelle@xxxxxxxxxxxxx> >>> --- >>> include/uapi/linux/in.h | 2 + >>> net/smc/Makefile | 2 +- >>> net/smc/af_smc.c | 16 ++++- >>> net/smc/smc_inet.c | 169 +++++++++++++++++++++++++++++++++++++++ >>> +++++++++ >>> net/smc/smc_inet.h | 22 +++++++ >>> 5 files changed, 208 insertions(+), 3 deletions(-) >>> create mode 100644 net/smc/smc_inet.c >>> create mode 100644 net/smc/smc_inet.h >>> >>> diff --git a/include/uapi/linux/in.h b/include/uapi/linux/in.h >>> index e682ab6..0c6322b 100644 >>> --- a/include/uapi/linux/in.h >>> +++ b/include/uapi/linux/in.h >>> @@ -83,6 +83,8 @@ enum { >>> #define IPPROTO_RAW IPPROTO_RAW >>> IPPROTO_MPTCP = 262, /* Multipath TCP connection */ >>> #define IPPROTO_MPTCP IPPROTO_MPTCP >>> + IPPROTO_SMC = 263, /* Shared Memory Communications */ >>> +#define IPPROTO_SMC IPPROTO_SMC >> >> Hello, >> >> It's not required to assign IPPROTO_MPTCP+1 as your new IPPROTO_SMC >> value. Making IPPROTO_MAX larger does increase the size of the >> inet_diag_table. Values from 256 to 261 are usable for IPPROTO_SMC >> without increasing IPPROTO_MAX. >> >> Just for background: When we added IPPROTO_MPTCP, we chose 262 because >> it is IPPROTO_TCP+0x100. The IANA reserved protocol numbers are 8 bits >> wide so we knew we would not conflict with any future additions, and >> in the case of MPTCP is was convenient that truncating the proto value >> to 8 bits would match IPPROTO_TCP. >> >> - Mat >> > > Hi Mat, > > Thank you very much for your feedback, I have always been curious about > the origins of IPPROTO_MPTCP and I am glad to > have learned new knowledge. > > Regarding the size issue of inet_diag_tables, what you said does make > sense. However, we still hope to continue using 263, > although the rationale may not be fully sufficient, as this series has > been under community evaluation for quite some time now, > and we haven't received any feedback about this value, so we’ve been > using it in some user-space tools ... 🙁 > > I would like to see what the community thinks. If everyone agrees that > using 263 will be completely unacceptable and a disaster, > then we will have no choice but to change it. It will not be a disaster, but a small waste of space (even if CONFIG_SMC is not set). Also, please note that the introduction of IPPROTO_MPTCP caused some troubles in some userspace programs. That was mainly because IPPROTO_MAX got updated, and they didn't expect that, e.g. a quick search on GitHub gave me this: https://github.com/systemd/systemd/issues/15604 https://github.com/strace/strace/issues/164 https://github.com/rust-lang/libc/issues/1896 I guess these userspace programs should now be ready for a new update, but still, it might be better to avoid that if there is a "simple" solution. I understand changing your userspace tools will be annoying. (On the other hand, it is still time to do that :) ) Cheers, Matt -- Sponsored by the NGI0 Core fund.