On Mon, Dec 11, 2023 at 11:08:59AM -0800, Alexei Starovoitov wrote: > On Mon, Dec 11, 2023 at 9:34 AM Anton Protopopov <aspsk@xxxxxxxxxxxxx> wrote: > > > > > > This looks for me that two bits aren't enough, and the third is > > required, as the second bit seems to be overloaded: > > > > * bit 1 indicates that this is a "JA_MAYBE" > > * bit 2 indicates a jump or nop (i.e., the current state) > > > > However, we also need another bit which indicates what to do with the > > instruction when we issue [an abstract] command > > > > flip_branch_on_or_off(branch, 0/1) > > > > Without this information (and in the absense of external meta-data on > > how to patch the branch) we can't determine what a given (BPF, not > > jitted) program currently does. For example, if we issue > > > > flip_branch_on_or_off(branch, 0) > > > > then we can't reflect this in the xlated program by setting the second > > bit to jmp/off. Again, JITted program is fine, but it will be > > desynchronized from xlated in term of logic (some instructions will be > > mapped as NOP -> x86_JUMP, others as NOP -> x86_NOP). > > Not following the need for the 3rd bit. > The 2nd bit is not only the initial state, but the current state too. > > when user space does static_branch_enable it will set the 2nd bit to 1 > (if it wasn't set) and will text_poke_bp the code. > xlated will be always in sync with JITed. > No ambiguity. Ok, from BPF arch perspective this can work with two bits (not for practical purposes though, IMO, see my next e-mail). On the lowest level we have this magic jump instruction JA[SRC_REG=1] +OFF # jits to a NOP JA[SRC_REG=3] +OFF # jits to a JUMP Then we have a primitive kfunc static_branch_set(prog, branch, bool on) Which sets the second bit and pokes jitted code. (Maybe it doesn't set the actual bit in xlated due to read-only memory, as you've mentioned below.) You're right that this is unambiguous. > An annoying part is that bpf insn stream is read only, so we cannot > really write that 2nd bit. We can virtually write it. > Seeing nop or jmp in JITed code would mean the bit is 0 or 1. > So xlated dump will report it. If we can poke jitted x86 code, then we might poke prog->insnsi in the same way as well? Besides, on architectures where static keys may be of interest we don't use interpreter, so maybe this is ok to poke insnsi (and make it rw) after all? > Separately the kernel doesn't have static_branch_switch/flip command. > It's only enable or disable. I think it's better to keep it the same way.