On Thu, Oct 10, 2019 at 07:38:47PM -0700, Andrii Nakryiko wrote:
> Existing BPF_CORE_READ() macro generates slightly suboptimal code. If
> there are intermediate pointers to be read, initial source pointer is
> going to be assigned into a temporary variable and then temporary
> variable is going to be uniformly used as a "source" pointer for all
> intermediate pointer reads. Schematically (ignoring all the type casts),
> BPF_CORE_READ(s, a, b, c) is expanded into:
> ({
> 	const void *__t = src;
> 	bpf_probe_read(&__t, sizeof(*__t), &__t->a);
> 	bpf_probe_read(&__t, sizeof(*__t), &__t->b);
>
> 	typeof(s->a->b->c) __r;
> 	bpf_probe_read(&__r, sizeof(*__r), &__t->c);
> })
>
> This initial `__t = src` makes calls more uniform, but causes slightly
> less optimal register usage sometimes when compiled with Clang. This can
> cascade into, e.g., more register spills.
>
> This patch fixes this issue by generating more optimal sequence:
> ({
> 	const void *__t;
> 	bpf_probe_read(&__t, sizeof(*__t), &src->a); /* <-- src here */
> 	bpf_probe_read(&__t, sizeof(*__t), &__t->b);
>
> 	typeof(s->a->b->c) __r;
> 	bpf_probe_read(&__r, sizeof(*__r), &__t->c);
> })
>
> Fixes: 7db3822ab991 ("libbpf: Add BPF_CORE_READ/BPF_CORE_READ_INTO helpers")
> Signed-off-by: Andrii Nakryiko <andriin@xxxxxx>

Applied, thanks!
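
For anyone reading along, the two schematic expansions quoted above can be tried out in plain userspace C. This is only a rough sketch, not the actual macro from the patch: probe_read_stub() is a hypothetical memcpy-based stand-in for bpf_probe_read(), and struct S / struct A / struct B are made-up types modelling the s->a->b->c chain. The type casts that the commit message leaves out are written explicitly here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical three-level chain standing in for s->a->b->c. */
struct B { int c; };
struct A { struct B *b; };
struct S { struct A *a; };

/* Counts stub invocations so both variants can be compared. */
static int probe_read_stub_calls;

/*
 * Userspace stand-in for bpf_probe_read() (an assumption for this
 * sketch): copies size bytes from src to dst and returns 0.
 */
static int probe_read_stub(void *dst, unsigned int size, const void *src)
{
	assert(dst != NULL && src != NULL);
	probe_read_stub_calls++;
	memcpy(dst, src, size);
	return 0;
}

/* Old shape: the source pointer is first copied into the temporary. */
int read_c_old(const struct S *s)
{
	const void *t = s; /* the extra initial assignment */
	int r;

	probe_read_stub(&t, sizeof(t), &((const struct S *)t)->a);
	probe_read_stub(&t, sizeof(t), &((const struct A *)t)->b);
	probe_read_stub(&r, sizeof(r), &((const struct B *)t)->c);
	return r;
}

/* New shape: the first read uses the source pointer directly. */
int read_c_new(const struct S *s)
{
	const void *t;
	int r;

	probe_read_stub(&t, sizeof(t), &s->a); /* <-- s here */
	probe_read_stub(&t, sizeof(t), &((const struct A *)t)->b);
	probe_read_stub(&r, sizeof(r), &((const struct B *)t)->c);
	return r;
}
```

Both functions issue the same three reads and return the same value; the only difference is whether the temporary is seeded with the source pointer before the first read. Whether that changes register allocation depends on the compiler and target, so this sketch does not demonstrate the spill behaviour itself.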