Re: [PATCH v2 bpf-next 1/2] libbpf: Support triple-underscore flavors for kfunc relocation

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 8/16/23 1:37 PM, David Vernet wrote:
> On Wed, Aug 16, 2023 at 09:58:12AM -0700, Dave Marchevsky wrote:
>> The function signature of kfuncs can change at any time due to their
>> intentional lack of stability guarantees. As kfuncs become more widely
>> used, BPF program writers will need facilities to support calling
>> different versions of a kfunc from a single BPF object. Consider this
>> simplified example based on a real scenario we ran into at Meta:
>>
>>   /* initial kfunc signature */
>>   int some_kfunc(void *ptr)
>>
>>   /* Oops, we need to add some flag to modify behavior. No problem,
>>     change the kfunc. flags = 0 retains original behavior */
>>   int some_kfunc(void *ptr, long flags)
>>
>> If the initial version of the kfunc is deployed on some portion of the
>> fleet and the new version on the rest, a fleetwide service that uses
>> some_kfunc will currently need to load different BPF programs depending
>> on which some_kfunc is available.
>>
>> Luckily CO-RE provides a facility to solve a very similar problem,
>> struct definition changes, by allowing program writers to declare
>> my_struct___old and my_struct___new, with ___suffix being considered a
>> 'flavor' of the non-suffixed name and being ignored by
>> bpf_core_type_exists and similar calls.
>>
>> This patch extends the 'flavor' facility to the kfunc extern
>> relocation process. BPF program writers can now declare
>>
>>   extern int some_kfunc___old(void *ptr)
>>   extern int some_kfunc___new(void *ptr, int flags)
>>
>> then test which version of the kfunc exists with bpf_ksym_exists.
>> Relocation and verifier's dead code elimination will work in concert as
>> expected, allowing this pattern:
>>
>>   if (bpf_ksym_exists(some_kfunc___old))
>>     some_kfunc___old(ptr);
>>   else
>>     some_kfunc___new(ptr, 0);
>>
>> Changelog:
>>
>> v1 -> v2: https://lore.kernel.org/bpf/20230811201346.3240403-1-davemarchevsky@xxxxxx/
>>   * No need to check obj->externs[i].essent_name before zfree (Jiri)
>>   * Use strndup instead of replicating same functionality (Jiri)
>>   * Properly handle memory allocation falure (Stanislav)
>>
>> Signed-off-by: Dave Marchevsky <davemarchevsky@xxxxxx>
>> ---
>>  tools/lib/bpf/libbpf.c | 20 +++++++++++++++++++-
>>  1 file changed, 19 insertions(+), 1 deletion(-)
>>
>> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
>> index b14a4376a86e..8899abc04b8c 100644
>> --- a/tools/lib/bpf/libbpf.c
>> +++ b/tools/lib/bpf/libbpf.c
>> @@ -550,6 +550,7 @@ struct extern_desc {
>>  	int btf_id;
>>  	int sec_btf_id;
>>  	const char *name;
>> +	char *essent_name;
>>  	bool is_set;
>>  	bool is_weak;
>>  	union {
>> @@ -3770,6 +3771,7 @@ static int bpf_object__collect_externs(struct bpf_object *obj)
>>  	struct extern_desc *ext;
>>  	int i, n, off, dummy_var_btf_id;
>>  	const char *ext_name, *sec_name;
>> +	size_t ext_essent_len;
>>  	Elf_Scn *scn;
>>  	Elf64_Shdr *sh;
>>  
>> @@ -3819,6 +3821,14 @@ static int bpf_object__collect_externs(struct bpf_object *obj)
>>  		ext->sym_idx = i;
>>  		ext->is_weak = ELF64_ST_BIND(sym->st_info) == STB_WEAK;
>>  
>> +		ext_essent_len = bpf_core_essential_name_len(ext->name);
>> +		ext->essent_name = NULL;
>> +		if (ext_essent_len != strlen(ext->name)) {
>> +			ext->essent_name = strndup(ext->name, ext_essent_len);
>> +			if (!ext->essent_name)
>> +				return -ENOMEM;
>> +		}
>> +
>>  		ext->sec_btf_id = find_extern_sec_btf_id(obj->btf, ext->btf_id);
>>  		if (ext->sec_btf_id <= 0) {
>>  			pr_warn("failed to find BTF for extern '%s' [%d] section: %d\n",
>> @@ -7624,7 +7634,8 @@ static int bpf_object__resolve_ksym_func_btf_id(struct bpf_object *obj,
>>  
>>  	local_func_proto_id = ext->ksym.type_id;
>>  
>> -	kfunc_id = find_ksym_btf_id(obj, ext->name, BTF_KIND_FUNC, &kern_btf, &mod_btf);
>> +	kfunc_id = find_ksym_btf_id(obj, ext->essent_name ?: ext->name, BTF_KIND_FUNC, &kern_btf,
>> +				    &mod_btf);
>>  	if (kfunc_id < 0) {
>>  		if (kfunc_id == -ESRCH && ext->is_weak)
>>  			return 0;
>> @@ -7642,6 +7653,9 @@ static int bpf_object__resolve_ksym_func_btf_id(struct bpf_object *obj,
>>  		pr_warn("extern (func ksym) '%s': func_proto [%d] incompatible with %s [%d]\n",
>>  			ext->name, local_func_proto_id,
> 
> Should we do ext->essent_name ?: ext->name here or in the below pr's as
> well? Hmm, maybe it would be more clear to keep the full name.
> 

Yeah, I agree that the full name should be used in this warning for clarity.
So won't change.

>>  			mod_btf ? mod_btf->name : "vmlinux", kfunc_proto_id);
>> +
>> +		if (ext->is_weak)
>> +			return 0;
> 
> Could you clarify why we want this check? Don't we want to fail if the
> prototype of the actual (essent) symbol we resolve to doesn't match
> what's in the BPF prog? If we do want to keep this, should we do the
> check above the pr_warn()?
> 

Actually this if-and-return was initially above the pr_warn while I was
developing the patch. I moved it down here to confirm via './test_progs -vv'
that the pseudo-failure cases in the selftests were going down the codepaths
I expected, and left it b/c better to err on the side of too much logging
when doing this ___flavor trickery.

In re: "clarify why we want this check?" and subsequent question, IIUC, with an
extern decl like

  struct task_struct *bpf_task_acquire___one(struct task_struct *task) __ksym __weak;

if we removed __weak from the declaration, symbol resolution would happen during
compilation + linking, at which point there would be no opportunity to do
our ___flavor trickery. But __weak is already used to express "if this kfunc
doesn't exist at all, it's not a problem, don't fail loading the program". So
as of this version of the code, it's not possible to express "one of
bpf_task_acquire___{one,two,three} must resolve, otherwise fail to
load" - that check would have to be done at runtime like

  if (!(bpf_ksym_exists(bpf_task_acquire___one) ||
        bpf_ksym_exists(bpf_task_acquire___two) ||
        bpf_ksym_exists(bpf_task_acquire___three)) {
    /* communicate failure to userspace runner via global var or something */
    return 0;
  }

Maybe something like BTF tags could be used to group a set of __weak
kfunc declarations together such that one (probably _only_ one) of them
must resolve at load time. This would obviate the need for such a runtime
check without causing compile+link step to fail. But also seems overly
complex for now.

Feels useful to have "incompatible resolution" log message even if it doesn't
stop loading process. But because __weak ties ___flavor trickery to "not a 
problem if kfunc doesn't exist at all", probably more accurate to make the
pr_warn a pr_debug if ___flavor AND ext->is_weak. Adding the logic to do that
felt like it would raise more questions than answers to a future reader of the
code, so I didn't add it. Now that I'm writing this out, I think it's better
to add it along with a comment.


>>  		return -EINVAL;
>>  	}
>>  
>> @@ -8370,6 +8384,10 @@ void bpf_object__close(struct bpf_object *obj)
>>  
>>  	zfree(&obj->btf_custom_path);
>>  	zfree(&obj->kconfig);
>> +
>> +	for (i = 0; i < obj->nr_extern; i++)
>> +		zfree(&obj->externs[i].essent_name);
>> +
>>  	zfree(&obj->externs);
>>  	obj->nr_extern = 0;
>>  
>> -- 
>> 2.34.1
>>
>>




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux