> On Jul 15, 2022, at 12:59 PM, Steven Rostedt <rostedt@xxxxxxxxxxx> wrote: > > On Fri, 15 Jul 2022 19:49:00 +0000 > Song Liu <songliubraving@xxxxxx> wrote: > >>> >>> What about if we release the lock when doing the callback? >> >> We can probably unlock ftrace_lock here. But we may break locking order >> with direct mutex (see below). > > You're talking about the multi registering case, right? We are using the *_ftrace_direct_multi() API here, to be able to specify ops_func. The direct single API just uses the shared direct_ops. > >> >>> >>> Then we just need to make sure things are the same after reacquiring the >>> lock, and if they are different, we release the lock again and do the >>> callback with the new update. Wash, rinse, repeat, until the state is the >>> same before and after the callback with locks acquired? >> >> Personally, I would like to avoid wash-rinse-repeat here. > > But it's common to do. Keeps your hair cleaner that way ;-) > >> >>> >>> This is a common way to handle callbacks that need to do something that >>> takes the lock held before doing a callback. >>> >>> The reason I say this, is because the more we can keep the accounting >>> inside of ftrace the better. >>> >>> Wouldn't this need to be done anyway if BPF was first and live kernel >>> patching needed the update? An -EAGAIN would not suffice. >> >> prepare_direct_functions_for_ipmodify handles BPF-first-livepatch-later >> case. The benefit of prepare_direct_functions_for_ipmodify() is that it >> holds direct_mutex before ftrace_lock, and keeps holding it if necessary. >> This is enough to make sure we don't need the wash-rinse-repeat. >> >> OTOH, if we wait until __ftrace_hash_update_ipmodify(), we already hold >> ftrace_lock, but not direct_mutex. To make changes to bpf trampoline, we >> have to unlock ftrace_lock and lock direct_mutex to avoid deadlock. >> However, this means we will need the wash-rinse-repeat. What do you think about the prepare_direct_functions_for_ipmodify() approach? If this is not ideal, maybe we can simplify it so that it only holds direct_mutex (when necessary). The benefit is that we are sure direct_mutex is already held in __ftrace_hash_update_ipmodify(). However, I think it is not safe to unlock ftrace_lock in __ftrace_hash_update_ipmodify(). We can get parallel do_for_each_ftrace_rec(), which is dangerous, no? >> >> >> For livepatch-first-BPF-later case, we can probably handle this in >> __ftrace_hash_update_ipmodify(), since we hold both direct_mutex and >> ftrace_lock. We can unlock ftrace_lock and update the BPF trampoline. >> It is safe against changes to direct ops, because we are still holding >> direct_mutex. But, is this safe against another IPMODIFY ops? I am not >> sure yet... Also, this is pretty weird because, we are updating a >> direct trampoline before we finish registering it for the first time. >> IOW, we are calling modify_ftrace_direct_multi_nolock for the same >> trampoline before register_ftrace_direct_multi() returns. >> >> The approach in v2 propagates the -EAGAIN to BPF side, so these are two >> independent calls of register_ftrace_direct_multi(). This does require >> some protocol between ftrace core and its user, but I still think this >> is a cleaner approach. > > The issue I have with this approach is it couples BPF and ftrace a bit too > much. > > But there is a way with my approach you can still do your approach. That > is, have ops_func() return zero if everything is fine, and otherwise returns > a negative value. Then have the register function fail and return whatever > value that gets returned by the ops_func() > > Then have the bpf ops_func() check (does this direct caller handle > IPMODIFY? if yes, return 0, else return -EAGAIN). Then the registering of > ftrace fails with your -EAGAIN, and then you can change the direct > trampoline to handle IPMODIFY and try again. This time when ops_func() is > called, it sees that the direct trampoline can handle the IPMODIFY and > returns 0. > > Basically, it's a way to still implement my suggestion, but let BPF decide > to use -EAGAIN to try again. And then BPF and ftrace don't need to have > these special flags to change the behavior of each other. I like this one. So there is no protocol about the return value here. Thanks, Song