On Thu, 2023-11-09 at 11:54 -0800, Yonghong Song wrote:
> On 11/9/23 3:47 AM, Eduard Zingerman wrote:
> > On Wed, 2023-11-08 at 21:30 -0800, Yonghong Song wrote:
> > > With the latest llvm18 (main branch of llvm-project repo), when building bpf selftests,
> > >    [~/work/bpf-next (master)]$ make -C tools/testing/selftests/bpf LLVM=1 -j
> > >
> > > The following compilation error happens:
> > >    fatal error: error in backend: Branch target out of insn range
> > >    ...
> > >    Stack dump:
> > >    0. Program arguments: clang -g -Wall -Werror -D__TARGET_ARCH_x86 -mlittle-endian
> > >      -I/home/yhs/work/bpf-next/tools/testing/selftests/bpf/tools/include
> > >      -I/home/yhs/work/bpf-next/tools/testing/selftests/bpf -I/home/yhs/work/bpf-next/tools/include/uapi
> > >      -I/home/yhs/work/bpf-next/tools/testing/selftests/usr/include -idirafter
> > >      /home/yhs/work/llvm-project/llvm/build.18/install/lib/clang/18/include -idirafter /usr/local/include
> > >      -idirafter /usr/include -Wno-compare-distinct-pointer-types -DENABLE_ATOMICS_TESTS -O2 --target=bpf
> > >      -c progs/pyperf180.c -mcpu=v3 -o /home/yhs/work/bpf-next/tools/testing/selftests/bpf/pyperf180.bpf.o
> > >    1. <eof> parser at end of file
> > >    2. Code generation
> > >    ...
> > >
> > > The compilation failure only happens with cpu=v2 and cpu=v3. cpu=v4 is okay
> > > since cpu=v4 supports 32-bit branch target offsets.
> > >
> > > The above failure is due to upstream llvm patch [1], where some inlining behavior
> > > was changed in llvm18.
> > >
> > > Previously, all 180 loop iterations were fully unrolled. To work around the issue,
> > > the full unroll count is now changed to 90 for llvm18 and later. This reduces
> > > some otherwise long branch target distances and fixes the compilation failure.
> > >
> > > [1] https://github.com/llvm/llvm-project/commit/1a2e77cf9e11dbf56b5720c607313a566eebb16e
> > >
> > > Signed-off-by: Yonghong Song <yonghong.song@xxxxxxxxx>
> >
> > Can confirm, the issue is present on clang main w/o this patch and
> > disappears after this patch.
> >
> > Yonghong, is there a way to keep the original UNROLL_COUNT if cpu=v4 is used?
>
> I thought about this but was a little bit lazy, so I did not give it enough thought.
> But since you mentioned this, I think having llvm add a macro to indicate the cpu
> version is a good idea. This will give bpf developers some flexibility to
> add new features (for a new cpu variant) or work around bugs (for a particular cpu
> variant, without impacting others if they are fine), etc.
>
> So here is the llvm patch: https://github.com/llvm/llvm-project/pull/71856

Thank you, tried it locally, works as expected.
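
For readers following along, the selection discussed above could be sketched roughly as below. This is only an illustration, not the code from either patch: EXAMPLE_BPF_CPU_VERSION is a hypothetical stand-in for whatever macro the llvm patch actually defines, and pick_unroll_count() is a made-up helper that mirrors the same decision at run time so it can be checked on a host compiler. Only __clang_major__ is a real clang predefined macro here.

```c
#include <assert.h>

/*
 * EXAMPLE_BPF_CPU_VERSION is a HYPOTHETICAL macro name, assumed to carry
 * the -mcpu version (1..4). Check the llvm patch for the real name.
 */
#if defined(EXAMPLE_BPF_CPU_VERSION) && EXAMPLE_BPF_CPU_VERSION >= 4
/* cpu=v4 supports 32-bit branch target offsets: keep the full unroll. */
#define UNROLL_COUNT 180
#elif defined(__clang_major__) && __clang_major__ >= 18
/* llvm18 and later with an older cpu version: halve the unroll count. */
#define UNROLL_COUNT 90
#else
/* Older or non-clang compilers: original behavior. */
#define UNROLL_COUNT 180
#endif

/* Compile-time sanity check on the value selected above. */
static_assert(UNROLL_COUNT == 90 || UNROLL_COUNT == 180,
	      "unexpected UNROLL_COUNT");

/* Hypothetical run-time mirror of the preprocessor logic, for testing. */
static inline int pick_unroll_count(int clang_major, int cpu_version)
{
	if (cpu_version >= 4)
		return 180;
	return clang_major >= 18 ? 90 : 180;
}
```

With such a macro, cpu=v4 builds would keep the original count of 180, while cpu=v2/v3 builds on llvm18 and later would fall back to 90.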