On 9/23/22 21:12, Dmitry Safonov wrote: > Changes from v1: > - Building now with CONFIG_IPV6=n (kernel test robot <lkp@xxxxxxxxx>) > - Added missing static declarations for local functions > (kernel test robot <lkp@xxxxxxxxx>) > - Addressed static analyzer and review comments by Dan Carpenter > (thanks, they were very useful!) > - Fix elif without defined() for !CONFIG_TCP_AO > - Recursively build selftests/net/tcp_ao (Shuah Khan), patches in: > https://lore.kernel.org/all/20220919201958.279545-1-dima@xxxxxxxxxx/T/#u > - Don't leak crypto_pool reference when TCP-MD5 key is modified/changed > - Add TCP-AO support for nettest.c and fcnal-test.sh > (will be used for VRF testing in later versions) > > Version 1: https://lore.kernel.org/all/20220818170005.747015-1-dima@xxxxxxxxxx/T/#u I think it's worth answering the question: why am I continuing sending patches for TCP-AO support when there's already another proposal? [1] Pre-history how we end up with the second approach is here: [2] TLDR; we had a customer and a deadline to deliver, I've given reviews to Leonard, but in the end it seems to me what we got is worth submitting as it's better in my view in many aspects. The biggest differences between two proposals, that I care about (design-decisions, not implementation details): 1. Per-netns TCP-AO keys vs per-socket TCP-AO keys. The reasons why this proposal is using per-socket keys (that are added like TCP-MD5 keys with setsockopt()) are: - They scale better: you don't have to lookup in netns database for a key. This is a major thing for Arista: we have to support customers that want more than 1000 peers with possible multiple keys per-peer. This scales well when the keys are split by design for each socket on every established connection. - TCP-AO doesn't require CAP_NET_ADMIN for usage. - TCP-AO is not meant to be transparent (like ipsec tunnels) for applications. The users are BGP applications which already know what they need. - Leonard's proposal has weird semantics when setsockopt() on some socket will change keys on other sockets in that network namespace. It should have been rather netlink-managed API or something of the kind if the keys are per-netns. 2. This proposal seeks to do less in kernel space and leave more decision-making to the userspace. It is another major disagreement with Leonard's proposal, which seeks to add a key lifetime, key rotation logic and all other business-logic details into the kernel, while here those decisions are left for the userspace. If I understood Leonard correctly, he is placing more things in kernel to simplify migration for user applications from TCP-MD5 to TCP-AO. I rather think that would be a job for a shared library if that's needed. As per my perception (1) was also done to relieve userspace from the job of removing an outdated key simultaneously from all users in netns, while in this proposal this job is left for userspace to use available IPC methods. Essentially, I think TCP-AO in kernel should do only minimum that can't be done "reasonably" in userspace. By "reasonably" I mean without moving the TCP-IP stack into userspace. 3. Re-using existing kernel code vs copy'n'pasting it, leaving refactoring for later. I'm a big fan of reusing existing functions. I think lesser amount of code in the end reduces the burden of maintenance as well as simplifies the code (both reading and changing). I can see Leonard's point of simplifying backports to stable releases that he ships to customers, but I think any upstream proposal should add less code and try reusing more. 4. Following RFC5925 closer to text. RFC says that rnext_key from the peer MUST be respected, as well as that current_key MUST not be removed on an established connection. In this proposal if the requirements of RFC can be met, they are followed, rather than broken. 5. Using ahash instead of shash. If there's a hardware accelerator - why not using it? This proposal uses crypto_ahash through per-CPU pool of crypto requests (crypto_pool). 6. Hash algorithm UAPI: magic constants vs hash name as char *. This is a thing I've asked Leonard multiple times and what he refuses to change in his patches: let the UAPI have `char tcpa_alg_name[64]' and just pass it to crypto_* layer. There is no need for #define MY_HASHING_ALGO 0x2 and another in-kernel array to convert the magic number to algorithm string in order to pass it to crypto. The algorithm names are flexible: we already have customer's request to use other than RFC5926 required hashing algorithms. And I don't see any value in this middle-layer. This is already what kernel does, see for example, include/uapi/linux/xfrm.h, grep for alg_name. 7. Adding traffic keys from the beginning. The proposal would be incomplete without having traffic keys: they are pre-calculated in this proposal, so the TCP stack doesn't have to do hashing twice (first for calculation of the traffic key) for every segment on established connections. This proposal has them naturally per-socket. I think those are the biggest differences in the approaches and they are enough to submit a concurrent proposal. Salam, Francesco, please add if I've missed any other disagreement or major architectural/design difference in the proposals. > In TODO (expect in next versions): > - selftests on older kernels (or with CONFIG_TCP_AO=n) should exit with > SKIP, not FAIL > - Support VRFs in setsockopt() > - setsockopt() UAPI padding + a test that structures are of the same > size on 32-bit as on 64-bit platforms > - IPv4-mapped-IPv6 addresses (might be working, but no selftest now) > - Remove CONFIG_TCP_AO dependency on CONFIG_TCP_MD5SIG > - Add TCP-AO static key > - Measure/benchmark TCP-AO and regular TCP connections > - setsockopt(TCP_REPAIR) with TCP-AO [..] [1]: https://lore.kernel.org/linux-crypto/cover.1662361354.git.cdleonard@xxxxxxxxx/ [2]: https://lore.kernel.org/all/8097c38e-e88e-66ad-74d3-2f4a9e3734f4@xxxxxxxxxx/T/#u Thanks, Dmitry