From: Alexander Lobakin <alexandr.lobakin@xxxxxxxxx> Date: Tue, 23 Nov 2021 17:39:29 +0100 Ok, open questions: 1. Channels vs queues vs global. Jakub: no per-channel. David (Ahern): it's worth it to separate as Rx/Tx. Toke is fine with globals at the end I think? My point was that for most of the systems we have 1:1 Rx:Tx (usually num_online_cpus()), so asking drivers separately for the number of RQs and then SQs would end up asking for the same number twice. But the main reason TBH was that most of the drivers store stats on a per-channel basis and I didn't want them to regress in functionality. I'm fine with reporting only netdev-wide if everyone are. In case if we keep per-channel: report per-channel only by request and cumulative globals by default to not flood the output? 2. Count all errors as "drops" vs separately. Daniel: account everything as drops, plus errors should be reported as exceptions for tracing sub. Jesper: we shouldn't mix drops and errors. My point: we shouldn't, that's why there are patches for 2 drivers to give errors a separate counter. I provided an option either to report all errors together ('errors' in stats structure) or to provide individual counters for each of them (sonamed ctrs), but personally prefer detailed errors. However, they might "go detailed" under trace_xdp_exception() only, sound fine (OTOH in RTNL stats we have both "general" errors and detailed error counters). 3. XDP and XSK ctrs separately or not. My PoV is that those are two quite different worlds. However, stats for actions on XSK really make a little sense since 99% of time we have xskmap redirect. So I think it'd be fine to just expand stats structure with xsk_{rx,tx}_{packets,bytes} and count the rest (actions, errors) together with XDP. Rest: - don't create a separate `ip` command and report under `-s`; - save some RTNL skb space by skipping zeroed counters. Also, regarding that I count all on the stack and then add to the storage once in a polling cycle -- most drivers don't do that and just increment the values in the storage directly, but this can be less performant for frequently updated stats (or it's just my embedded past). Re u64 vs u64_stats_t -- the latter is more universal and architecture-friendly, the former is used directly in most of the drivers primarily because those drivers and the corresponding HW are being run on 64-bit systems in the vast majority of cases, and Ethtools stats themselves are not so critical to guard them with anti-tearing. Anyways, local64_t is cheap on ARM64/x86_64 I guess? Thanks, Al