Re: [PATCH v2 bpf-next 5/5] bpf: trampoline: support FTRACE_OPS_FL_SHARE_IPMODIFY

> On Jul 6, 2022, at 12:38 PM, Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
> 
> On Thu, 2 Jun 2022 12:37:06 -0700
> Song Liu <song@xxxxxxxxxx> wrote:
> 
> 
>> --- a/kernel/bpf/trampoline.c
>> +++ b/kernel/bpf/trampoline.c
>> @@ -27,6 +27,44 @@ static struct hlist_head trampoline_table[TRAMPOLINE_TABLE_SIZE];
>> /* serializes access to trampoline_table */
>> static DEFINE_MUTEX(trampoline_mutex);
>> 
>> +#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
>> +static int bpf_trampoline_update(struct bpf_trampoline *tr, bool lock_direct_mutex);
>> +
>> +static int bpf_tramp_ftrace_ops_func(struct ftrace_ops *ops, enum ftrace_ops_cmd cmd)
>> +{
>> +	struct bpf_trampoline *tr = ops->private;
>> +	int ret;
>> +
>> +	/*
>> +	 * The normal locking order is
>> +	 *    tr->mutex => direct_mutex (ftrace.c) => ftrace_lock (ftrace.c)
>> +	 *
>> +	 * This is called from prepare_direct_functions_for_ipmodify, with
>> +	 * direct_mutex locked. Use mutex_trylock() to avoid dead lock.
>> +	 * Also, bpf_trampoline_update here should not lock direct_mutex.
>> +	 */
>> +	if (!mutex_trylock(&tr->mutex))
> 
> Can you comment here that returning -EAGAIN will not cause this to repeat?
> That is, that the next try will not return -EAGAIN?

Hmm, that is not the guarantee here. The conflict is a real race condition: an
IPMODIFY function (i.e. livepatch) is being registered at the same time that
something else, for example bpftrace, is updating the BPF trampoline.

The -EAGAIN will propagate to the user of the IPMODIFY function (i.e. livepatch),
which needs to retry. In the case of livepatch, the retry is initiated from
user space.
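To make the ordering concrete, here is a minimal user-space sketch of the
pattern (plain pthreads, all names made up, nothing copied from the patch):
the callback runs with the outer lock already held, so it can only trylock
the inner one and has to hand -EAGAIN back to the outermost caller, which is
where the retry belongs.

/*
 * Normal order is tr_mutex -> direct_mutex, so a callback invoked with
 * direct_mutex held must not block on tr_mutex; it trylocks and bails
 * out with -EAGAIN instead.
 */
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t tr_mutex = PTHREAD_MUTEX_INITIALIZER;     /* like tr->mutex */
static pthread_mutex_t direct_mutex = PTHREAD_MUTEX_INITIALIZER; /* like ftrace's direct_mutex */

/* Stands in for bpf_tramp_ftrace_ops_func(); called with direct_mutex held. */
static int ops_callback(void)
{
	if (pthread_mutex_trylock(&tr_mutex) != 0)
		return -EAGAIN;		/* someone holds tr_mutex; caller must retry */
	/* ... update state protected by tr_mutex ... */
	pthread_mutex_unlock(&tr_mutex);
	return 0;
}

/* Stands in for the IPMODIFY registration path: takes direct_mutex, calls back. */
static int register_ipmodify_user(void)
{
	int ret;

	pthread_mutex_lock(&direct_mutex);
	ret = ops_callback();
	pthread_mutex_unlock(&direct_mutex);
	return ret;
}

int main(void)
{
	int ret;

	/* The retry sits in the outermost caller; for livepatch that is user space. */
	while ((ret = register_ipmodify_user()) == -EAGAIN)
		usleep(1000);
	printf("registered: %d\n", ret);
	return 0;
}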

> 
>> +		return -EAGAIN;
>> +
>> +	switch (cmd) {
>> +	case FTRACE_OPS_CMD_ENABLE_SHARE_IPMODIFY:
>> +		tr->indirect_call = true;
>> +		ret = bpf_trampoline_update(tr, false /* lock_direct_mutex */);
>> +		break;
>> +	case FTRACE_OPS_CMD_DISABLE_SHARE_IPMODIFY:
>> +		tr->indirect_call = false;
>> +		tr->fops->flags &= ~FTRACE_OPS_FL_SHARE_IPMODIFY;
>> +		ret = bpf_trampoline_update(tr, false /* lock_direct_mutex */);
>> +		break;
>> +	default:
>> +		ret = -EINVAL;
>> +		break;
>> +	};
>> +	mutex_unlock(&tr->mutex);
>> +	return ret;
>> +}
>> +#endif
>> +
>> 
> 
> 
>> @@ -330,7 +387,7 @@ static struct bpf_tramp_image *bpf_tramp_image_alloc(u64 key, u32 idx)
>> 	return ERR_PTR(err);
>> }
>> 
>> -static int bpf_trampoline_update(struct bpf_trampoline *tr)
>> +static int bpf_trampoline_update(struct bpf_trampoline *tr, bool lock_direct_mutex)
>> {
>> 	struct bpf_tramp_image *im;
>> 	struct bpf_tramp_links *tlinks;
>> @@ -363,20 +420,45 @@ static int bpf_trampoline_update(struct bpf_trampoline *tr)
>> 	if (ip_arg)
>> 		flags |= BPF_TRAMP_F_IP_ARG;
>> 
>> +#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
>> +again:
>> +	if (tr->indirect_call)
>> +		flags |= BPF_TRAMP_F_ORIG_STACK;
>> +#endif
>> +
>> 	err = arch_prepare_bpf_trampoline(im, im->image, im->image + PAGE_SIZE,
>> 					  &tr->func.model, flags, tlinks,
>> 					  tr->func.addr);
>> 	if (err < 0)
>> 		goto out;
>> 
>> +#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
>> +	if (tr->indirect_call)
>> +		tr->fops->flags |= FTRACE_OPS_FL_SHARE_IPMODIFY;
>> +#endif
>> +
>> 	WARN_ON(tr->cur_image && tr->selector == 0);
>> 	WARN_ON(!tr->cur_image && tr->selector);
>> 	if (tr->cur_image)
>> 		/* progs already running at this address */
>> -		err = modify_fentry(tr, tr->cur_image->image, im->image);
>> +		err = modify_fentry(tr, tr->cur_image->image, im->image, lock_direct_mutex);
>> 	else
>> 		/* first time registering */
>> 		err = register_fentry(tr, im->image);
>> +
>> +#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
>> +	if (err == -EAGAIN) {
>> +		if (WARN_ON_ONCE(tr->indirect_call))
>> +			goto out;
>> +		/* should only retry on the first register */
>> +		if (WARN_ON_ONCE(tr->cur_image))
>> +			goto out;
>> +		tr->indirect_call = true;
>> +		tr->fops->func = NULL;
>> +		tr->fops->trampoline = 0;
>> +		goto again;
> 
> I'm assuming that the above will prevent a return of -EAGAIN again, because if
> it can happen again, then this could turn into a deadlock.
> 
> Can you please comment that?

This is a different case. This -EAGAIN happens when the livepatch came first
and the BPF trampoline is registered later. With tr->indirect_call enabled,
register_fentry() should not return -EAGAIN again.

I will add more comments for both cases.
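For the retry in bpf_trampoline_update(), something along these lines above
the err == -EAGAIN block (just a draft of the wording, not the final comment):

	/*
	 * register_fentry() returns -EAGAIN when an ftrace_ops with
	 * IPMODIFY (e.g. livepatch) is already attached to this function.
	 * Retry exactly once with tr->indirect_call set: the rebuilt
	 * trampoline uses BPF_TRAMP_F_ORIG_STACK and sets
	 * FTRACE_OPS_FL_SHARE_IPMODIFY, so the second register_fentry()
	 * can share the function with the IPMODIFY user and will not
	 * return -EAGAIN again, i.e. the "goto again" cannot loop forever.
	 */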

Thanks,
Song



