On Tue, Dec 1, 2020 at 9:53 PM John Fastabend <john.fastabend@xxxxxxxxx> wrote: > > Yonghong Song wrote: > > > > > > [...] > > > > Great, this means that all existing valid uses of > > > __sync_fetch_and_add() will generate BPF_XADD instructions and will > > > work on old kernels, right? > > > > That is correct. > > > > > > > > If that's the case, do we still need cpu=v4? The new instructions are > > > *only* going to be generated if the user uses previously unsupported > > > __sync_fetch_xxx() intrinsics. So, in effect, the user consciously > > > opts into using new BPF instructions. cpu=v4 seems like an unnecessary > > > tautology then? > > > > This is a very good question. Essentially this boils to when users can > > use the new functionality including meaningful return value of > > __sync_fetch_and_add(). > > (1). user can write a small bpf program to test the feature. If user > > gets a failed compilation (fatal error), it won't be supported. > > Otherwise, it is supported. > > (2). compiler provides some way to tell user it is safe to use, e.g., > > -mcpu=v4, or some clang macro suggested by Brendan earlier. > > > > I guess since kernel already did a lot of feature discovery. Option (1) > > is probably fine. > > For option (2) we can use BTF with kernel version check. If kernel is > greater than kernel this lands in we use the the new instructions if > not we use a slower locked version. That should work for all cases > unless someone backports patches into an older case. Two different things: Clang support detection and kernel support detection. You are talking about kernel detection, I and Yonghong were talking about Clang detection and explicit cpu=v4 opt-in. For kernel detection, if there is an enum value or type that gets added along the feature, then with CO-RE built-ins it's easy to detect and kernel dead code elimination will make sure that unsupported instructions won't trip up the BPF verifier. Still need Clang support to compile the program in the first place, though. If there is no such BTF-based way to check, it is still possible to try to load a trivial BPF program with __sync_fech_and_xxx() to do feature detection and then use .rodata to turn off code paths relying on a new instruction set. > > At least thats what I'll probably end up wrapping in a helper function.