On Dec 21, 2021, at 9:13 PM, Jakub Kicinski kuba@xxxxxxxxxx wrote: > On Tue, 21 Dec 2021 19:23:37 +0200 Vladimir Oltean wrote: >> On Tue, Dec 21, 2021 at 06:06:39PM +0100, Justin Iurman wrote: >> > On Dec 10, 2021, at 1:38 AM, Jakub Kicinski kuba@xxxxxxxxxx wrote: >> > > I think we're on the same page, the main problem is I've not seen >> > > anyone use the skbuff_head_cache occupancy as a signal in practice. >> > > >> > > I'm adding a bunch of people to the CC list, hopefully someone has >> > > an opinion one way or the other. >> > >> > It looks like we won't have more opinions on that, unfortunately. >> > >> > @Jakub - Should I submit it as a PATCH and see if we receive more >> > feedback there? >> >> I know nothing about OAM and therefore did not want to comment, but I >> think the point raised about the metric you propose being irrelevant in >> the context of offloaded data paths is quite important. The "devlink-sb" >> proposal was dismissed very quickly on grounds of requiring sleepable >> context, is that a deal breaker, and if it is, why? Not only offloaded >> interfaces like switches/routers can report buffer occupancy. Plain NICs >> also have buffer pools, DMA RX/TX rings, MAC FIFOs, etc, that could >> indicate congestion or otherwise high load. Maybe slab information could >> be relevant, for lack of a better option, on virtual interfaces, but if >> they're physical, why limit ourselves on reporting that? The IETF draft >> you present says "This field indicates the current status of the >> occupancy of the common buffer pool used by a set of queues." It appears >> to me that we could try to get a reporting that has better granularity >> (per interface, per queue) than just something based on >> skbuff_head_cache. What if someone will need that finer granularity in >> the future. > > Indeed. > > In my experience finding meaningful metrics is heard, the chances that > something that seems useful on the surface actually provides meaningful > signal in deployments is a lot lower than one may expect. And the True. > commit message reads as if the objective was checking a box in the > implemented IOAM metrics, rather exporting relevant information. Indeed, but not only. I sent this patchset as a Request for Comments to see if it was correct and relevant. I mean, if there is no consensus on this, I'll keep this data field as not supported, not a big deal. But it would obviously be good to have it at some point (as long as what we retrieve makes sense enough, and for all cases). > We can do a roll call on people CCed but I read their silence as nobody I thought that silence means consent. That's why more opinions would be welcome, even if we seem to converge. Not only for opinions, but also for any idea or guidance for a better solution, if any. > thinks this metric is useful. Is there any experimental data you can > point to which proves the signal strength? Apart from the fact that I monitored the metric during normal and high-load situations, honestly no. Values made sense during those tests, though.