>> True. But keep also in mind the scope of IOAM, which is not to be
>> deployed widely on the Internet. It is deployed on limited (aka private)
>> domains where each node is therefore managed by the operator. So I'm
>> not really sure why you think that the implementation-specific thing is
>> a problem here. Context of "unit" is provided by the IOAM Namespace-ID
>> attached to the trace, as well as each Node-ID if included. Again, it's
>> up to the operator to interpret values accordingly, depending on each
>> node (i.e., the operator has a large and detailed view of his domain; he
>> knows if the buffer occupancy value "X" is abnormal or not for a
>> specific node, he knows which unit is used for a specific node, etc).
>
> It's quite likely I'm missing the point.

Let me try again to make it all clear in your mind. Here are some quoted
paragraphs from the spec:

"Generic data: Format-free information where syntax and semantic of the
information is defined by the operator in a specific deployment. For a
specific IOAM-Namespace, all IOAM nodes have to interpret the generic
data the same way. Examples for generic IOAM data include geo-location
information (location of the node at the time the packet was processed),
buffer queue fill level or cache fill level at the time the packet was
processed, or even a battery charge level."

This one basically says that the IOAM Namespace-ID (in the IOAM Trace
Option-Type header) is responsible for providing context to data fields
(i.e., for "units" too, in the case of generic fields such as queue depth
or buffer occupancy). So it's up to the operator to gather similar nodes
within the same IOAM Namespace. And, even if "different" kinds of nodes
are within an IOAM Namespace, you still have a fallback solution if Node
IDs are part of the trace (the "hop-lim & node-id" data field, bit 0 in
the trace type). Indeed, the operator (or the collector/interpreter)
knows if node A uses "bytes" or any other unit for a generic data field.
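To make the argument concrete, here is a minimal, purely illustrative
sketch of what a collector could do on its side: it keeps an
operator-maintained mapping keyed on (Namespace-ID, Node-ID) and
normalizes the raw buffer-occupancy values to a common unit. All names,
IDs, and unit mappings below are hypothetical, not part of the patch or
the spec:

```python
# Hypothetical collector-side sketch (illustrative names and values):
# the operator knows which unit each node in a given IOAM Namespace
# reports buffer occupancy in, and normalizes everything to bytes.

# Operator-maintained mapping: (namespace_id, node_id) -> unit
UNIT_MAP = {
    (0x8000, 1): "bytes",      # node 1 reports raw bytes
    (0x8000, 2): "cells_208",  # node 2 reports 208-byte buffer cells
}

# Scale factor to convert one reported unit to bytes
SCALE = {"bytes": 1, "cells_208": 208}

def occupancy_bytes(namespace_id, node_id, raw_value):
    """Convert a raw buffer-occupancy value to bytes."""
    unit = UNIT_MAP[(namespace_id, node_id)]
    return raw_value * SCALE[unit]

# 100 cells on node 2 -> 20800 bytes
print(occupancy_bytes(0x8000, 2, 100))
```

This matches the spec's assumption that systems which further process
IOAM information (e.g., a network management system) handle unit
conversions themselves.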
"It should be noted that the semantics of some of the node data fields
that are defined below, such as the queue depth and buffer occupancy,
are implementation specific. This approach is intended to allow IOAM
nodes with various different architectures."

The last sentence is important here and is, in fact, related to what you
describe below. Having genericity on units for such data fields allows
for supporting multiple architectures. Same idea for the following
paragraph:

"Data fields and associated data types for each of the IOAM-Data-Fields
are specified in the following sections. The definition of
IOAM-Data-Fields focuses on the syntax of the data-fields and avoids
specifying the semantics where feasible. This is why no units are
defined for data-fields like e.g., "buffer occupancy" or "queue depth".
With this approach, nodes can supply the information in their native
format and are not required to perform unit or format conversions.
Systems that further process IOAM information, like e.g., a network
management system are assumed to also handle unit conversions as part of
their IOAM data-fields processing. The combination of a particular
data-field and the namespace-id provides for the context to interpret
the provided data appropriately."

Does it make more sense now why it's not really a problem to have
implementation-specific units for such data fields?

>> [...]
>>
>> Do you believe this patch does not provide what is defined in the spec?
>> If so, I'm open to any suggestions.
>
> The opposite, in a sense. I think the patch does implement behavior
> within a reasonable interpretation of the standard. But the feature
> itself seems more useful for forwarding ASICs than Linux routers,

Good point. Actually, it's probably why such a data field was defined as
it is.

> because Linux routers can run a full telemetry stack and all sort
> of advanced SW instrumentation.
> The use case for reporting kernel memory use via IOAM's constrained
> interface does not seem particularly practical since it's not
> providing a very strong signal on what's going on.

I agree and disagree. I disagree because this value definitely tells you
that something (potentially bad) is going on when it increases
significantly enough to reach a critical threshold. Basically, we need
more skb's, but oh, the pool is exhausted. OK, not a problem, expand the
pool. Oh wait, no memory left. Why? Is it only due to too much
(temporary?) load? Should I put the blame on the NIC? Is it a memory
issue? Is it something else? Or maybe several issues combined? Well, you
might not know exactly why (though you know there is a problem), which
is also why I agree with you. But this is also why you have other data
fields available (i.e., detecting a problem might require 2+ symptoms
instead of just one).

> For switches running Linux the switch ASIC buffer occupancy can be read
> via devlink-sb that'd seem like a better fit for me, but unfortunately
> the devlink calls can sleep so we can't read such device info from the
> datapath.

Indeed, it would be a better fit. I didn't know about this one, thanks
for that. It's a shame it can't be used in this context, though. But, at
the end of the day, we're left with nothing regarding buffer occupancy.
So I'm wondering if "something" is not better than "nothing" in this
case. And, for that, we're back to my previous answer on why I agree and
disagree with what you said about its utility.
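For the archives, reading the ASIC shared-buffer occupancy from
userspace via devlink-sb looks roughly like this (subcommands per the
devlink-sb man page; the device handle and port index are placeholders
for whatever your switch exposes). This is hardware-dependent and only a
sketch of the workflow you describe:

```shell
# Placeholder device handle; replace with your switch, e.g. from
# "devlink dev show".
DEV=pci/0000:03:00.0

# Trigger a snapshot of the current shared-buffer occupancy on the
# device (occupancy is read from a snapshot, not live).
devlink sb occupancy snapshot "$DEV"

# Dump the snapshotted per-pool occupancy for port index 1.
devlink sb occupancy show "$DEV/1"
```

Agreed that this path can sleep, which is exactly why it can't be called
from the datapath where the IOAM trace is filled in.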