> On May 28, 2019, at 8:08 AM, Willem de Bruijn <willemdebruijn.kernel@xxxxxxxxx> wrote: > I will push up latest patches soon. I did some testing and discovered that only TCP audit tests failed. They failed much less often when enabling poll. Once in about 20 runs still failed. Therefore I commented out the TCP audit tests. As for the other tests, this is what I got with poll() disabled… udp gso zerocopy timestamp audit udp rx: 1611 MB/s 1148129 calls/s udp tx: 1659 MB/s 28146 calls/s 28146 msg/s udp rx: 1686 MB/s 1201494 calls/s udp tx: 1685 MB/s 28579 calls/s 28579 msg/s udp rx: 1685 MB/s 1200402 calls/s udp tx: 1683 MB/s 28552 calls/s 28552 msg/s Summary over 3.000 seconds... sum udp tx: 1716 MB/s 85277 calls (28425/s) 85277 msgs (28425/s) Tx Timestamps: 85277 received 0 errors Zerocopy acks: 85277 received 0 errors Here you see that with poll() enabled, it is a bit slower, so I don’t have it enabled in udpgso_bench.sh … udp gso zerocopy timestamp audit udp rx: 1591 MB/s 1133945 calls/s udp tx: 1613 MB/s 27358 calls/s 27358 msg/s udp rx: 1644 MB/s 1171674 calls/s udp tx: 1643 MB/s 27869 calls/s 27869 msg/s udp rx: 1643 MB/s 1170666 calls/s udp tx: 1641 MB/s 27845 calls/s 27845 msg/s Summary over 3.000 seconds... sum udp tx: 1671 MB/s 83072 calls (27690/s) 83072 msgs (27690/s) Tx Timestamps: 83072 received 0 errors Zerocopy acks: 83072 received 0 errors You may be interested that I reduced test lengths from 4 to 3 seconds, but I am still getting 3 reports per test. I picked up the extra report by changing 'if (tnow > treport)’ to 'if (tnow >= treport)’ > The only issue specific to GSO is that xmit_more can forego this > doorbell until the last segment. We want to complicate this logic with > a special case based on tx_flags. A process that cares should either > not use GSO, or the timestamp should be associated with the last > segment as I've been arguing so far. This is the area I was thinking of looking into. I’m not sure it will work or that it will be too messy. It may be worth a little bit of digging to see if there is anything there. That will be down the road a bu >> >> I’ll get back to you when I have tested this more thoroughly. Early results >> suggest that adding the -P poll() option has fixed it without any appreciable >> performance hit. I’ll share raw results with you, and we can make a final >> decision together. > > In the main loop? It still is peculiar that notifications appear to go > missing unless the process blocks waiting for them. Nothing in > sock_zerocopy_callback or the queueing onto the error queue should > cause drops, as far as I know. > Now that I know the issue is only in TCP, I can speculate that all bytes are being reported, but done with fewer messages. It may warrant some investigation in case there is some kind of bug. > Indeed. Ideally even run all tests, but return error if any failed, > like this recent patch > > selftests/bpf: fail test_tunnel.sh if subtests fail > https://patchwork.ozlabs.org/patch/1105221/ > > but that may be a lot of code churn and better left to a separate patch. I like it. I have it coded up, and it seems to work well. I’ll make a separate commit in the patch set so we can yank it out if you feel it is too much