Alexei Starovoitov wrote: > On Tue, Jan 21, 2020 at 10:15:42AM -0800, John Fastabend wrote: > > Alexei Starovoitov wrote: > > > On Mon, Jan 20, 2020 at 11:36:43PM -0800, John Fastabend wrote: > > > > > > > > > + > > > > > + t1 = btf_type_skip_modifiers(btf1, t1->type, NULL); > > > > > + t2 = btf_type_skip_modifiers(btf2, t2->type, NULL); > > > > > > > > Is it really best to skip modifiers? I would expect that if the > > > > signature is different including modifiers then we should just reject it. > > > > OTOH its not really C code here either so modifiers may not have the same > > > > meaning. With just integers and struct it may be ok but if we add pointers > > > > to ints then what would we expect from a const int*? > > > > > > > > So whats the reasoning for skipping modifiers? Is it purely an argument > > > > that its not required for safety so solve it elsewhere? In that case then > > > > checking names of functions is also equally not required. > > > > > > Function names are not checked by the kernel. It's purely libbpf and bpf_prog.c > > > convention. The kernel operates on prog_fd+btf_id only. The names of function > > > arguments are not compared either. > > > > Sorry mistyped names of struct is what I meant, but that is probably nice to > > have per comment. > > I think comments in the kernel code won't be sufficient :) I'm planing to write > a proper doc describing global functions and program extensions, since these > concepts are coupled. > > > > > > > The code has to skip modifiers. Otherwise the type comparison algorithm will be > > > quite complex, since typedef is such modifier. Like 'u32' in original program > > > and 'u32' in extension program would have to be recursively checked. > > > > > > Another reason to skip modifiers is 'volatile' modifier. I suspect we would > > > have to use it from time to time in original placeholder functions. Yet new > > > replacement function will be written without volatile. The placeholder may need > > > volatile to make sure compiler doesn't optimize things away. I found cases > > > where 'noinline' in placeholder was not enough. clang would still inline the > > > body of the function and remove call instruction. So far I've been using > > > volatile as a workaround. May be we will introduce new function attribute to > > > clang. > > > > Yes, we have various similar issue and have in the past used volatile to work > > around them but volatile's inside loops tends to break loop optimizations and > > cause clang warnings/errors. Another common one is verifier failing to track > > when scalars move around in registers. As an example the following is valid > > C for a bounded additon to array pointer but not tractable for the verifier > > at the moment. (made example at some point I'll dig up a collection of > > real-world examples) > > > > r1 = *(u64 *)(r10 - 8) > > r6 = r1 > > if r6 < %[const] goto %l[err] > > r3 += r1 > > r2 = %[copy_size] > > r1 = r7 > > call 4 > > Is it a sign compare in the above? The verifier currently struggles with sign > compares. The issue you're describing could also be related to instruction > combining optimization. Yonghong is working on undoing parts of instcombine here: > https://reviews.llvm.org/D72787 > Dealing with 32-bit sign compares probably will be done as a verifier > improvement. It's too hacky to do it in the llvm. hmm I think it was an unsigned compare but I'll try to come up with some simple examples of real code this week and post them. To illustrate some of the verifier complaints we've been working around. > > > compiler barriers help but not always and also breaks loop optimization > > passes. But, thats a different discussion I only mention it because > > either verifier has to track above logic better or new attributes in clang > > could be used for these things. But the new attributes don't usually work > > well when mixed with optimization passes that we would actually like to > > keep. > > yep. unfortunately llvm folks didn't like the idea of disabling individual > optimization passes or parts of passes based on the backend flag. Hence BPF > backend has to normalize IR before generating asm. Since all of the issues > related to "llvm optimizing too much and verifier is too dumb" are related to > missed tracking of registers, I'm thinking that we may try virtual register > approach. Instead of doing register allocation and spill/fill in the llvm the > backend may generate as many virtual registers as necessary and final BPF > assembly will look like SSA except phisical registers will still be used to > receive and pass arguments while all other math, branches will operate on > virtual registers. Then the kernel will do register alloc after verification. I > hope that should help the verifier get smarter at the expense of less efficient > code and more spill/fill. But that is the last resort. Interesting. Or maybe follow the parent chain and push information down the chain when we see register moves and further bounds refining. > > > > > > > Having said that I share your concern regarding skipping 'const'. For 'const > > > int arg' it's totally ok to skip it, since it's meaningless from safety pov, > > > but for 'const int *arg' and 'const struct foo *arg' I'm planning to preserve > > > it. It will be preserved at the verifier bpf_reg_state level though. Just > > > checking that 'const' is present in extension prog's BTF doesn't help safety. > > > I'm planing to make the verifier enforce that bpf prog cannot write into > > > argument which type is pointer to const struct. That part is still wip. It will > > > be implemented for global functions first and then for extension programs. > > > Currently the verifier rejects any pointer to struct (other than context), so > > > no backward compatibility issues. > > > > Ah ok this will be great. In that case const will be more general then > > merely functions and should be applicable generally at least as an end > > goal IMO. There will be a slight annoyance where old extensions may not > > run on new kernels though. I will argue such extensions are broken though. > > 'old extensions' won't run? I don't see it. Strict 'const' enforcement > for pointers to struct won't break things. Currently pointers to struct > are not supported. Only a pointer to one specific struct which is context. > That's different. I don't think there will be any compatibility issues. Right its not an issue. > > > For this patch then, > > > > Acked-by: John Fastabend <john.fastabend@xxxxxxxxx> > > Thanks for the review!