On Fri, Jan 22, 2021 at 8:31 AM Alan Maguire <alan.maguire@xxxxxxxxxx> wrote: > > On Thu, 21 Jan 2021, Andrii Nakryiko wrote: > > > On Wed, Jan 20, 2021 at 10:56 PM Andrii Nakryiko > > <andrii.nakryiko@xxxxxxxxx> wrote: > > > > > > On Sun, Jan 17, 2021 at 2:22 PM Alan Maguire <alan.maguire@xxxxxxxxxx> wrote: > > > > > > > > Add a BTF dumper for typed data, so that the user can dump a typed > > > > version of the data provided. > > > > > > > > The API is > > > > > > > > int btf_dump__emit_type_data(struct btf_dump *d, __u32 id, > > > > const struct btf_dump_emit_type_data_opts *opts, > > > > void *data); > > > > > > > > Two more things I realized about this API overnight: > > > > 1. It's error-prone to specify only the pointer to data without > > specifying the size. If user screws up and scecifies wrong type ID or > > if BTF data is corrupted, then this API would start reading and > > printing memory outside the bounds. I think it's much better to also > > require user to specify the size and bail out with error if we reach > > the end of the allowed memory area. > > Yep, good point, especially given in the tracing context we will likely > only have a subset of the data (e.g. part of the 16k representing a > task_struct). The way I was approaching this was to return -E2BIG > and append a "..." to the dumped data denoting the data provided > didn't cover the size needed to fully represent the type. The idea is > the structure is too big for the data provided, hence E2BIG, but maybe > there's a more intuitive way to do this? See below for more... > Hm... that's an interesting use case for sure, but seems reasonable to support. "..." seems a bit misleading because it can be interpreted as "we omitted some output for brevity", no? "<truncated>" or something like that might be more obvious, but I'm just bikeshedding :) > > > > 2. This API would be more useful if it also returns the amount of > > "consumed" bytes. That way users can do more flexible and powerful > > pretty-printing of raw data. So on success we'll have >= 0 number of > > bytes used for dumping given BTF type, or <0 on error. WDYT? > > > > I like it! So > > 1. if a user provides a too-big data object, we return the amount we used; and > 2. if a user provides a too-small data object, we append "..." to the dump > and return -E2BIG (or whatever error code). > > However I wonder for case 2 if it'd be better to use a snprintf()-like > semantic rather than an error code, returning the amount we would have > used. That way we easily detect case 1 (size passed in > return value), > case 2 (size passed in < return value), and errors can be treated separately. > Feels to me that dealing with truncated data is going to be sufficiently > frequent it might be good not to classify it as an error. Let me know if > you think that makes sense. Hm... Yeah, that would work, I think, and would feel pretty natural. On the other hand, it's easy to know the total input size needed by calling btf__resolve_size(btf, type_id), so if user expects to provide truncated input data and wants to know how much they should have provided, they can easily do that. Basically, I don't have strong preference here, though providing truncated input data still feels more like an error, than a normal situation... Maybe someone else want to weigh in? And -E2BIG is distinctive enough in this case. So both would work fine, but not clear which one is less surprising API. > > I'm working on v3, and hope to have something early next week, but a quick > reply to a question below... > > > > > ...where the id is the BTF id of the data pointed to by the "void *" > > > > argument; for example the BTF id of "struct sk_buff" for a > > > > "struct skb *" data pointer. Options supported are > > > > > > > > - a starting indent level (indent_lvl) > > > > - a set of boolean options to control dump display, similar to those > > > > used for BPF helper bpf_snprintf_btf(). Options are > > > > - compact : omit newlines and other indentation > > > > - noname: omit member names > > > > - zero: show zero-value members > > > > > > > > Default output format is identical to that dumped by bpf_snprintf_btf(), > > > > for example a "struct sk_buff" representation would look like this: > > > > > > > > struct sk_buff){ > > > > (union){ > > > > (struct){ > > > > > > Curious, these explicit anonymous (union) and (struct), is that > > > preferred way for explicitness, or is it just because it makes > > > implementation simpler and thus was chosen? I.e., if the goal was to > > > mimic C-style data initialization, you'd just have plain .next = ..., > > > .prev = ..., .dev = ..., .dev_scratch = ..., all on the same level. So > > > just checking for myself. > > The idea here is that we want to clarify if we're dealing with > an anonymous struct or union. I wanted to have things work > like a C-style initializer as closely as possible, but I > realized it's not legit to initialize multiple values in a > union, and more importantly when we're trying to visually interpret > data, we really want to know if an anonymous container of data is > a structure (where all values represent different elements in the > structure) or a union (where we're seeing multiple interpretations of > the same value). Yeah, fair enough. > > Thanks again for the detailed review! Of course. But it's not clear if you agree with me on everything, so I still hope to get replies later. > > Alan