On 26/03/2019 18:44, Edward Cree wrote:
On 26/03/2019 18:05, Jiong Wang wrote:
eBPF ISA specification requires high 32-bit cleared when low 32-bit
sub-register is written. This applies to destination register of ALU32 etc.
JIT back-ends must guarantee this semantic when doing code-gen.
x86-64 and arm64 ISA has the same semantic, so the corresponding JIT
back-end doesn't need to do extra work. However, 32-bit arches (arm, nfp
etc.) and some other 64-bit arches (powerpc, sparc etc), need explicit zero
extension sequence to meet such semantic.
This is important, because for code the following:
u64_value = (u64) u32_value
... other uses of u64_value
compiler could exploit the semantic described above and save those zero
extensions for extending u32_value to u64_value. Hardware, runtime, or BPF
JIT back-ends, are responsible for guaranteeing this. Some benchmarks show
~40% sub-register writes out of total insns, meaning ~40% extra code-gen (
could go up to more for some arches which requires two shifts for zero
extension) because JIT back-end needs to do extra code-gen for all such
instructions.
However this is not always necessary in case u32_value is never cast into
a u64, which is quite normal in real life program. So, it would be really
good if we could identify those places where such type cast happened, and
only do zero extensions for them, not for the others. This could save a lot
of BPF code-gen.
Algo:
- Record indices of instructions that do sub-register def (write). And
these indices need to stay with function state so path pruning and bpf
to bpf function call could be handled properly.
These indices are kept up to date while doing insn walk.
- A full register read on an active sub-register def marks the def insn as
needing zero extension on dst register.
- A new sub-register write overrides the old one.
A new full register write makes the register free of zero extension on
dst register.
- When propagating register read64 during path pruning, it also marks def
insns whose defs are hanging active sub-register, if there is any read64
from shown from the equal state.
Reviewed-by: Jakub Kicinski <jakub.kicinski@xxxxxxxxxxxxx>
Signed-off-by: Jiong Wang <jiong.wang@xxxxxxxxxxxxx>
---
include/linux/bpf_verifier.h | 4 +++
kernel/bpf/verifier.c | 85 +++++++++++++++++++++++++++++++++++++++++---
2 files changed, 84 insertions(+), 5 deletions(-)
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 27761ab..0ae9a3f 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -181,6 +181,9 @@ struct bpf_func_state {
*/
u32 subprogno;
+ /* tracks subreg definition. */
Ideally this comment should mention that the stored value is the insn_idx
of the writing insn. Perhaps also that this is safe because patching
(bpf_patch_insn_data()) only happens after main verification completes.
OK, will add more comments for this.
Regards,
Jiong