On Thu, Nov 03, 2022 at 08:25:28PM -0700, Stanislav Fomichev wrote:
> +	/* if (r5 == NULL) return; */
> +	BPF_JMP_IMM(BPF_JNE, BPF_REG_5, 0, S16_MAX),

The S16_MAX jump crashes my system, and I do not see such jumps used very
often in the in-tree BPF code; setting a fixed jump length worked for me.

Also, I think BPF_JEQ is the correct condition in this case, not BPF_JNE.

But the main reason for my reply is that I have implemented the RX hash hint
for ice both as unrolled BPF code and with BPF_EMIT_CALL [0]. Both
bpf_xdp_metadata_rx_hash() and bpf_xdp_metadata_rx_hash_supported() are
implemented in those two ways.

RX hash is the easiest hint to read, so the performance difference should be
more visible than when reading the timestamp.

Counting packets in an rxdrop XDP program on a single queue gave me the
following numbers:

- unrolled: 41264360 pps
- BPF_EMIT_CALL: 40370651 pps

So, reading a single hint in an unrolled way instead of calling 2 driver
functions in a row gives us a 2.2% performance boost.

Surely, the difference will increase if we read more than a single hint.
Therefore, it would be great to implement at least some simple hint
functions as unrolled.

[0] https://github.com/walking-machine/linux/tree/ice-kfunc-hints-clean