On Sat, Nov 28, 2020 at 5:56 PM Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote:
>
> On Fri, Nov 20, 2020 at 06:46:10PM -0800, Andrii Nakryiko wrote:
> >
> > @@ -52,12 +53,19 @@ struct bpf_reg_state {
> >                */
> >               struct bpf_map *map_ptr;
> >
> > -             u32 btf_id; /* for PTR_TO_BTF_ID */
> > +             /* for PTR_TO_BTF_ID */
> > +             struct {
> > +                     struct btf *btf;
> > +                     u32 btf_id;
> > +             };
>
> bpf_reg_state is the main structure contributing to the verifier memory consumption.
> Is it possible to do the tracking without growing it?

The only way to keep this at 8 bytes in the existing union is to use
an ID for the BTF, but that has tons of problems: we'd need to do a
lookup all the time, plus there is now a possibility of that BTF
instance going away (e.g., if the kernel module is unloaded), etc. Pain.

But I just looked at bpf_reg_state with pahole, and there are two
4-byte holes: one before this union and one after ref_obj_id. So if I
move "off" to before the union, the overall size of the struct won't
change, even if the union itself grows to 16 bytes. And it won't break
the states_equal() logic, from what I can see.

So with that, bpf_reg_state BEFORE:

struct bpf_reg_state {
        enum bpf_reg_type          type;                 /*     0     4 */

        /* XXX 4 bytes hole, try to pack */

        union {
                int                range;                /*     8     4 */
                struct bpf_map *   map_ptr;              /*     8     8 */
                u32                btf_id;               /*     8     4 */
                u32                mem_size;             /*     8     4 */
                long unsigned int  raw;                  /*     8     8 */
        };                                               /*     8     8 */
        s32                        off;                  /*    16     4 */
        u32                        id;                   /*    20     4 */
        u32                        ref_obj_id;           /*    24     4 */

        /* XXX 4 bytes hole, try to pack */

        struct tnum                var_off;              /*    32    16 */
        s64                        smin_value;           /*    48     8 */
        s64                        smax_value;           /*    56     8 */
        /* --- cacheline 1 boundary (64 bytes) --- */
        u64                        umin_value;           /*    64     8 */
        u64                        umax_value;           /*    72     8 */
        s32                        s32_min_value;        /*    80     4 */
        s32                        s32_max_value;        /*    84     4 */
        u32                        u32_min_value;        /*    88     4 */
        u32                        u32_max_value;        /*    92     4 */
        struct bpf_reg_state *     parent;               /*    96     8 */
        u32                        frameno;              /*   104     4 */
        s32                        subreg_def;           /*   108     4 */
        enum bpf_reg_liveness      live;                 /*   112     4 */
        bool                       precise;              /*   116     1 */

        /* size: 120, cachelines: 2, members: 19 */
        /* sum members: 109, holes: 2, sum holes: 8 */
        /* padding: 3 */
        /* last cacheline: 56 bytes */
};

And with BTF pointer AFTER:

struct bpf_reg_state {
        enum bpf_reg_type          type;                 /*     0     4 */
        s32                        off;                  /*     4     4 */
        union {
                int                range;                /*     8     4 */
                struct bpf_map *   map_ptr;              /*     8     8 */
                struct {
                        struct btf * btf;                /*     8     8 */
                        u32        btf_id;               /*    16     4 */
                };                                       /*     8    16 */
                u32                mem_size;             /*     8     4 */
                struct {
                        long unsigned int raw1;          /*     8     8 */
                        long unsigned int raw2;          /*    16     8 */
                } raw;                                   /*     8    16 */
        };                                               /*     8    16 */
        u32                        id;                   /*    24     4 */
        u32                        ref_obj_id;           /*    28     4 */
        struct tnum                var_off;              /*    32    16 */
        s64                        smin_value;           /*    48     8 */
        s64                        smax_value;           /*    56     8 */
        /* --- cacheline 1 boundary (64 bytes) --- */
        u64                        umin_value;           /*    64     8 */
        u64                        umax_value;           /*    72     8 */
        s32                        s32_min_value;        /*    80     4 */
        s32                        s32_max_value;        /*    84     4 */
        u32                        u32_min_value;        /*    88     4 */
        u32                        u32_max_value;        /*    92     4 */
        struct bpf_reg_state *     parent;               /*    96     8 */
        u32                        frameno;              /*   104     4 */
        s32                        subreg_def;           /*   108     4 */
        enum bpf_reg_liveness      live;                 /*   112     4 */
        bool                       precise;              /*   116     1 */

        /* size: 120, cachelines: 2, members: 19 */
        /* padding: 3 */
        /* last cacheline: 56 bytes */
};

No more holes,
but the same overall size. Does that work?

> >
> >               u32 mem_size; /* for PTR_TO_MEM | PTR_TO_MEM_OR_NULL */
> >
> >               /* Max size from any of the above. */
> > -             unsigned long raw;
> > +             struct {
> > +                     unsigned long raw1;
> > +                     unsigned long raw2;
> > +             } raw;
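
For anyone who wants to check the hole-packing argument without building
a kernel and running pahole, here is a rough userspace sketch. It is not
the kernel definition: it models only the first 48 bytes of the struct
(up to and including var_off), the *_model names and report_layout() are
made up for illustration, the pointers are plain void *, and it assumes
an LP64 target where pointers and unsigned long are 8 bytes with 8-byte
alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for struct tnum (two u64 fields, 16 bytes). */
struct tnum_model {
	uint64_t value;
	uint64_t mask;
};

/* Head of the layout BEFORE the change: 8-byte union, "off" after it. */
struct reg_before_model {
	int32_t type;			/* stand-in for enum bpf_reg_type */
	/* 4-byte hole: the union needs 8-byte alignment */
	union {
		int range;
		void *map_ptr;		/* stand-in for struct bpf_map * */
		uint32_t btf_id;
		uint32_t mem_size;
		unsigned long raw;
	};
	int32_t off;
	uint32_t id;
	uint32_t ref_obj_id;
	/* 4-byte hole: var_off needs 8-byte alignment */
	struct tnum_model var_off;
};

/* Head of the layout AFTER the change: "off" moved up, 16-byte union. */
struct reg_after_model {
	int32_t type;
	int32_t off;			/* fills the former first hole */
	union {
		int range;
		void *map_ptr;
		struct {
			void *btf;	/* stand-in for struct btf * */
			uint32_t btf_id;
		};
		uint32_t mem_size;
		struct {
			unsigned long raw1;
			unsigned long raw2;
		} raw;
	};
	uint32_t id;
	uint32_t ref_obj_id;		/* now ends exactly at offset 32 */
	struct tnum_model var_off;
};

/* Print the offsets that matter for the comparison. */
static void report_layout(void)
{
	printf("before: off@%zu id@%zu var_off@%zu size=%zu\n",
	       offsetof(struct reg_before_model, off),
	       offsetof(struct reg_before_model, id),
	       offsetof(struct reg_before_model, var_off),
	       sizeof(struct reg_before_model));
	printf("after:  off@%zu id@%zu var_off@%zu size=%zu\n",
	       offsetof(struct reg_after_model, off),
	       offsetof(struct reg_after_model, id),
	       offsetof(struct reg_after_model, var_off),
	       sizeof(struct reg_after_model));
}
```

Calling report_layout() from a small main() should show var_off at the
same offset in both models, which is why the fields that follow it are
unaffected by the reordering. pahole on the real vmlinux remains the
authoritative check.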