2023-08-23, 16:22:31 -0400, Scott Dial wrote:
> > 2023-08-18, 18:46:48 -0700, Jakub Kicinski wrote:
> > > Can we not fix the ordering problem?
> > > Queue the packets locally if they get out of order?
>
> AES-NI's implementation of gcm(aes) requires the FPU, so if it's busy the
> decrypt gets stuck on the cryptd queue, but that queue is not
> order-preserving.

It should be (per CPU [*]). The queue itself is a linked list, and if
we have requests on the queue we don't let new requests skip the
queue.

[*] and if you have packets coming through multiple CPUs at the same
    time, ordering won't be predictable anyway

> I would emphasize
> that benchmarking of network performance should be done by looking at more
> than just the interface frame rate. For instance, out-of-order deliver of
> packets can trigger TCP backoff. I was never interested in how many packets
> the macsec driver could stuff onto the wire, because the impact was my TCP
> socket stalling and my UDP streams being garbled.

Sure. And for iperf3/TCP tests, I'm seeing much better performance out
of async crypto (or much lower CPU utilization for the same throughput
on UDP tests), even with the FPU busy.

I decided to go the sysctl route instead of reverting because I
couldn't figure out how to reproduce the problems you've hit, but I
didn't want to just bring them back for your setup.

> On 8/22/2023 11:39 AM, Sabrina Dubroca wrote:
> > Actually, looking into the crypto API side, I don't see how they can
> > get out of order since commit 81760ea6a95a ("crypto: cryptd - Add
> > helpers to check whether a tfm is queued"):
> >
> > [...] ensure that no reordering is introduced because of requests
> > queued in cryptd with respect to requests being processed in
> > softirq context.
> >
> > And cryptd_aead_queued() is used by AESNI (via simd_aead_decrypt()) to
> > decide whether to process the request synchronously or not.
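
To illustrate the rule quoted above, here is a rough toy model in
Python. It is not kernel code: the class, the method names and the
respect_queue flag are all made up for this sketch, and it only models
a single CPU with one FIFO of deferred requests. It assumes the rule
is "process synchronously only if the FPU is usable and nothing is
already queued".

```python
from collections import deque


class ToyDecryptor:
    """Toy single-CPU model of sync vs. deferred processing (hypothetical)."""

    def __init__(self, respect_queue=True):
        self.respect_queue = respect_queue  # hypothetical switch for comparison
        self.pending = deque()              # requests deferred to the async worker
        self.completed = []                 # order in which requests finish

    def submit(self, seq, fpu_usable):
        # Defer when the FPU is unavailable, or (if the check is enabled)
        # when earlier requests are still queued, so nothing overtakes them.
        must_queue = self.respect_queue and bool(self.pending)
        if not fpu_usable or must_queue:
            self.pending.append(seq)
        else:
            self.completed.append(seq)

    def drain(self):
        # The async worker completes deferred requests in FIFO order.
        while self.pending:
            self.completed.append(self.pending.popleft())


def run(events, respect_queue=True):
    """Replay (seq, fpu_usable) submissions and "drain" events in order."""
    d = ToyDecryptor(respect_queue)
    for ev in events:
        if ev == "drain":
            d.drain()
        else:
            d.submit(*ev)
    d.drain()
    return d.completed


# Request 2 arrives while the FPU is busy; 3 and 4 arrive before the drain.
events = [(1, True), (2, False), (3, True), (4, True), "drain", (5, True)]
in_order = run(events)
reordered = run(events, respect_queue=False)
```

With the check enabled, requests 3 and 4 wait behind 2 and completion
order matches submission order. With it disabled, 3 and 4 complete
before 2, which is the kind of reordering the check is meant to
prevent.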
>
> I have not been following linux-crypto changes, but I would be surprised if
> request is not flagged with CRYPTO_TFM_REQ_MAY_BACKLOG, so it would be

macsec doesn't use CRYPTO_TFM_REQ_MAY_BACKLOG.

> queue. If that's not the case, then the attempt to decrypt would return
> -EBUSY, which would translate to a packet error, since macsec_decrypt MUST
> handle the skb during the softirq.

If we get more packets than we can process, we drop them. I think
that's fine.

> > So I really don't get what commit ab046a5d4be4 was trying to fix. I've
> > never been able to reproduce that issue, I guess commit 81760ea6a95a
> > explains why.
> >
> > I'd suggest to revert commit ab046a5d4be4, but it feels wrong to
> > revert it without really understanding what problem Scott hit and why
> > 81760ea6a95a didn't solve it.
>
> I don't think that commit has any relevance to the issue.

It maintains the ordering of requests. If there are async requests
currently waiting to be processed, we don't let requests bypass the
queue until we've drained it.

To make sure, I ran some tests with numbered messages and a patched
kernel that forces queueing decryption every couple of requests, and I
didn't see any reordering.

-- 
Sabrina