Re: [PATCH v2 bpf-next 1/9] btf: add kind layout encoding, crcs to UAPI

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 06/07/2023 00:48, Andrii Nakryiko wrote:
> On Mon, Jul 3, 2023 at 6:42 AM Alan Maguire <alan.maguire@xxxxxxxxxx> wrote:
>>
>> On 22/06/2023 23:02, Andrii Nakryiko wrote:
>>> On Fri, Jun 16, 2023 at 10:17 AM Alan Maguire <alan.maguire@xxxxxxxxxx> wrote:
>>>>
>>>> BTF kind layouts provide information to parse BTF kinds.
>>>> By separating parsing BTF from using all the information
>>>> it provides, we allow BTF to encode new features even if
>>>> they cannot be used.  This is helpful in particular for
>>>> cases where newer tools for BTF generation run on an
>>>> older kernel; BTF kinds may be present that the kernel
>>>> cannot yet use, but at least it can parse the BTF
>>>> provided.  Meanwhile userspace tools with newer libbpf
>>>> may be able to use the newer information.
>>>>
>>>> The intent is to support encoding of kind layouts
>>>> optionally so that tools like pahole can add this
>>>> information.  So for each kind we record
>>>>
>>>> - kind-related flags
>>>> - length of singular element following struct btf_type
>>>> - length of each of the btf_vlen() elements following
>>>>
>>>> In addition we make space in the BTF header for
>>>> CRC32s computed over the BTF along with a CRC for
>>>> the base BTF; this allows split BTF to identify
>>>> a mismatch explicitly.
>>>>
>>>> The ideas here were discussed at [1], [2]; hence
>>>>
>>>> Suggested-by: Andrii Nakryiko <andrii@xxxxxxxxxx>
>>>> Signed-off-by: Alan Maguire <alan.maguire@xxxxxxxxxx>
>>>>
>>>> [1] https://lore.kernel.org/bpf/CAEf4BzYjWHRdNNw4B=eOXOs_ONrDwrgX4bn=Nuc1g8JPFC34MA@xxxxxxxxxxxxxx/
>>>> [2] https://lore.kernel.org/bpf/20230531201936.1992188-1-alan.maguire@xxxxxxxxxx/
>>>> ---
>>>>  include/uapi/linux/btf.h       | 24 ++++++++++++++++++++++++
>>>>  tools/include/uapi/linux/btf.h | 24 ++++++++++++++++++++++++
>>>>  2 files changed, 48 insertions(+)
>>>>
>>>> diff --git a/include/uapi/linux/btf.h b/include/uapi/linux/btf.h
>>>> index ec1798b6d3ff..cea9125ed953 100644
>>>> --- a/include/uapi/linux/btf.h
>>>> +++ b/include/uapi/linux/btf.h
>>>> @@ -8,6 +8,22 @@
>>>>  #define BTF_MAGIC      0xeB9F
>>>>  #define BTF_VERSION    1
>>>>
>>>> +/* is this information required? If so it cannot be sanitized safely. */
>>>> +#define BTF_KIND_LAYOUT_OPTIONAL               (1 << 0)
>>>
>>> hm.. I thought we agreed to not have OPTIONAL flag last time, no? From
>>> kernel's perspective nothing is optional. From libbpf perspective
>>> everything should be optional, unless we get type_id reference to
>>> something that we don't recognize. So why the flag and extra code to
>>> handle it?
>>>
>>> We can always add it later, if necessary.
>>>
>>
>> I totally agree we need to reject any BTF that contains references
>> to unknown objects if these references are made via known ones;
>> so for example an enum64 in a struct (in the case we didn't know
>> about an enum64). However, it's possible a BTF kind could point
>> _at_ other BTF kinds but not be pointed _to_ by any known kinds;
>> in such a case wouldn't optional make sense even for the kernel
>> to say "ignore any kinds that we don't know about that aren't
>> participating in any known BTF relationships"? Default assumption
>> without the optional flag would be to reject such BTF.
> 
> I think it's simpler (and would follow what we've been doing with
> kernel-side strict validation of everything) to reject everything
> unknown. "Being pointed to" isn't always contained within BTF itself.
> E.g., for line and func info, type_id comes during BPF_PROG_LOAD. So
> at the point of BTF validation you don't know that some FUNCs will be
> pointed to (as an example). So I'd avoid making unnecessarily
> dangerous assumptions that some pieces of information can be ignored.
> 
> And in general, kernel doesn't trust user-space data without
> validation, so we'd have to double-check all this OPTIONAL flagsole
> somehow anyways.
> 

makes sense. I think I'd been hoping the BTF kind layout would help
address the "newer pahole runs on older kernel" problem, but it
doesn't really. More on that below...

>>
>>>> +
>>>> +/* kind layout section consists of a struct btf_kind_layout for each known
>>>> + * kind at BTF encoding time.
>>>> + */
>>>> +struct btf_kind_layout {
>>>> +       __u16 flags;            /* see BTF_KIND_LAYOUT_* values above */
>>>> +       __u8 info_sz;           /* size of singular element after btf_type */
>>>> +       __u8 elem_sz;           /* size of each of btf_vlen(t) elements */
>>>> +};
>>>> +
>>>> +/* for CRCs for BTF, base BTF to be considered usable, flags must be set. */
>>>> +#define BTF_FLAG_CRC_SET               (1 << 0)
>>>> +#define BTF_FLAG_BASE_CRC_SET          (1 << 1)
>>>> +
>>>>  struct btf_header {
>>>>         __u16   magic;
>>>>         __u8    version;
>>>> @@ -19,8 +35,16 @@ struct btf_header {
>>>>         __u32   type_len;       /* length of type section       */
>>>>         __u32   str_off;        /* offset of string section     */
>>>>         __u32   str_len;        /* length of string section     */
>>>> +       __u32   kind_layout_off;/* offset of kind layout section */
>>>> +       __u32   kind_layout_len;/* length of kind layout section */
>>>> +
>>>> +       __u32   crc;            /* crc of BTF; used if flags set BTF_FLAG_CRC_VALID */
>>>> +       __u32   base_crc;       /* crc of base BTF; used if flags set BTF_FLAG_BASE_CRC_VALID */
>>>>  };
>>>>
>>>> +/* required minimum BTF header length */
>>>> +#define BTF_HEADER_MIN_LEN     (sizeof(struct btf_header) - 16)
>>>
>>> offsetof(struct btf_header, kind_layout_off) ?
>>>
>>> but actually why this needs to be a part of UAPI?
>>>
>>
>> no not really. I was trying to come up with a more elegant
>> way of differentiating between the old and new header formats
>> on the basis of size and eventually just gave up and added
>> a #define. It can absolutely be removed.
> 
> right, so that's why just checking if field is present based on
> btf_header.len and field offset is a good approach? Let's drop
> unnecessary constants from UAPI header
> 
>>
>>>> +
>>>>  /* Max # of type identifier */
>>>>  #define BTF_MAX_TYPE   0x000fffff
>>>>  /* Max offset into the string section */
>>>> diff --git a/tools/include/uapi/linux/btf.h b/tools/include/uapi/linux/btf.h
>>>> index ec1798b6d3ff..cea9125ed953 100644
>>>> --- a/tools/include/uapi/linux/btf.h
>>>> +++ b/tools/include/uapi/linux/btf.h
>>>> @@ -8,6 +8,22 @@
>>>>  #define BTF_MAGIC      0xeB9F
>>>>  #define BTF_VERSION    1
>>>>
>>>> +/* is this information required? If so it cannot be sanitized safely. */
>>>> +#define BTF_KIND_LAYOUT_OPTIONAL               (1 << 0)
>>>> +
>>>> +/* kind layout section consists of a struct btf_kind_layout for each known
>>>> + * kind at BTF encoding time.
>>>> + */
>>>> +struct btf_kind_layout {
>>>> +       __u16 flags;            /* see BTF_KIND_LAYOUT_* values above */
>>>> +       __u8 info_sz;           /* size of singular element after btf_type */
>>>> +       __u8 elem_sz;           /* size of each of btf_vlen(t) elements */
>>>> +};
>>>> +
>>>> +/* for CRCs for BTF, base BTF to be considered usable, flags must be set. */
>>>> +#define BTF_FLAG_CRC_SET               (1 << 0)
>>>> +#define BTF_FLAG_BASE_CRC_SET          (1 << 1)
>>>> +
>>>>  struct btf_header {
>>>>         __u16   magic;
>>>>         __u8    version;
>>>> @@ -19,8 +35,16 @@ struct btf_header {
>>>>         __u32   type_len;       /* length of type section       */
>>>>         __u32   str_off;        /* offset of string section     */
>>>>         __u32   str_len;        /* length of string section     */
>>>> +       __u32   kind_layout_off;/* offset of kind layout section */
>>>> +       __u32   kind_layout_len;/* length of kind layout section */
>>>> +
>>>> +       __u32   crc;            /* crc of BTF; used if flags set BTF_FLAG_CRC_VALID */
>>>
>>> why are we making crc optional? shouldn't we just say that crc is
>>> always filled out?
>>>
>>
>> The approach I took was to have libbpf/pahole be flexible about
>> specification of crcs and kind layout; neither, one of these or both
>> are supported. When neither are specified we'll still generate
>> a larger header, but it will be zeros for the new fields so parseable
>> by older libbpf/kernel. I think we probably need to make it optional
>> for a while to support new pahole on older kernels.
> 
> I'm not sure how this "optional for a while" will turn to
> "non-optional", tbh, and who and when will decide that. I think the
> "new pahole on old kernel" problem is solvable easily by making all
> this new stuff opt-in. New kernel Makefiles will request pahole to
> emit them, if pahole is new enough. Old kernels won't know about this
> feature and even new pahole won't emit it.
>

Another approach would be if we could auto-detect the kinds supported
by an older kernel, and not emit anything > BTF_KIND_MAX for that
kernel. I've put together a proof-of-concept for that, see [1].

> I don't feel too strongly about it, just generally feeling like
> minimizing all the different supportable variations.
> 

Yeah, hopefully some of these options can go away eventually.
The auto-detection scheme in [1], or something like it, might
make things easier in future. Thanks!

Alan

[1]
https://lore.kernel.org/bpf/20230720201443.224040-1-alan.maguire@xxxxxxxxxx/
>>
>>
>>>> +       __u32   base_crc;       /* crc of base BTF; used if flags set BTF_FLAG_BASE_CRC_VALID */
>>>
>>> here it would be nice if we could just rely on zero meaning not set,
>>> but I suspect not everyone will be happy about this, as technically
>>> crc 0 is a valid crc :(
>>>
>>
>> Right, I think we're stuck with the flags unfortunately.
> 
> yep. one extra reason why I like the idea of this being string offset,
> but whatever, small thing
> 
> 
>> Thanks for the review (and apologies for the belated response!)
>>
>> Alan
>>
>>>
>>>>  };
>>>>
>>>> +/* required minimum BTF header length */
>>>> +#define BTF_HEADER_MIN_LEN     (sizeof(struct btf_header) - 16)
>>>> +
>>>>  /* Max # of type identifier */
>>>>  #define BTF_MAX_TYPE   0x000fffff
>>>>  /* Max offset into the string section */
>>>> --
>>>> 2.39.3
>>>>
> 




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux