On Wed, Oct 26, 2022 at 6:14 PM Martin KaFai Lau <martin.lau@xxxxxxxxx> wrote:
>
> The cgroup-bpf {get,set}sockopt prog is useful to change the optname behavior.
> The bpf prog usually just handles a few specific optnames and ignores most
> others. For the optnames that it ignores, it usually does not need to change
> the optlen. The exception is when optlen > PAGE_SIZE (or optval_end - optval).
> The bpf prog needs to set the optlen to 0 for this case or else the kernel will
> return -EFAULT to the userspace. It is usually not what the bpf prog wants
> because the bpf prog only expects error returning to userspace when it has
> explicitly 'return 0;' or used bpf_set_retval(). If a bpf prog always changes
> optlen for optnames that it does not care to 0, it may risk if the latter bpf
> prog in the same cgroup may want to change/look-at it.
>
> Would like to explore if there is an easier way for the bpf prog to handle it.
> eg. does it make sense to track if the bpf prog has changed the ctx->optlen
> before returning -EFAULT to the user space when ctx.optlen > max_optlen?

Good point on chaining being broken because of this requirement :-/

With tracking, we need to be careful, because the following situation
might be problematic:

Suppose the setsockopt optval is larger than 4k: the program can rewrite
some bytes in the first 4k, leave optlen untouched, and expect this to
work. Currently, optlen=0 explicitly means "ignore whatever is in the bpf
buffer and use the original one".

If we can have tracking that catches situations like this, we should be
able to drop that optlen=0 requirement. IIRC, that's the only tricky part.
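
To make the problematic case concrete, here is a rough user-space model in
plain C. It is not kernel code, and every name in it (current_rule,
tracked_rule, buf_written, the enum values) is made up for illustration.
current_rule only covers the optlen=0 and optlen > max_optlen cases
discussed in this thread. tracked_rule assumes we could somehow know
whether the prog touched optlen or wrote into the buffer; how to detect the
buffer write is exactly the open question. In this sketch a write with an
untouched oversized optlen still gets -EFAULT, which is only one possible
choice.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical names throughout: a user-space model, not kernel code. */
enum sockopt_action {
	USE_ORIGINAL_BUF, /* ignore the bpf buffer, keep the user's optval */
	USE_BPF_BUF,      /* hand the (possibly rewritten) bpf buffer on */
};

/* Today's rule as described in this thread; other cases are not modelled. */
static int current_rule(int ctx_optlen, int max_optlen,
			enum sockopt_action *action)
{
	if (ctx_optlen == 0) {
		*action = USE_ORIGINAL_BUF;
		return 0;
	}
	if (ctx_optlen > max_optlen)
		return -EFAULT;
	*action = USE_BPF_BUF;
	return 0;
}

/*
 * Possible rule with tracking (assumption): an oversized optlen is only
 * passed through when the prog changed neither optlen nor the buffer.
 * Comparing against orig_optlen is a stand-in for real "was it changed"
 * tracking, and buf_written is assumed to be known somehow.
 */
static int tracked_rule(int orig_optlen, int ctx_optlen, int max_optlen,
			bool buf_written, enum sockopt_action *action)
{
	if (ctx_optlen > max_optlen &&
	    ctx_optlen == orig_optlen && !buf_written) {
		*action = USE_ORIGINAL_BUF;
		return 0;
	}
	return current_rule(ctx_optlen, max_optlen, action);
}

void run_examples(void)
{
	enum sockopt_action a = USE_BPF_BUF;
	const int max = 4096; /* stand-in for PAGE_SIZE */

	/* 8k optval, prog ignores it: -EFAULT today, pass-through if tracked */
	assert(current_rule(8192, max, &a) == -EFAULT);
	assert(tracked_rule(8192, 8192, max, false, &a) == 0);
	assert(a == USE_ORIGINAL_BUF);

	/* 8k optval, prog rewrote a byte in the first 4k, optlen untouched */
	assert(tracked_rule(8192, 8192, max, true, &a) == -EFAULT);

	/* an explicit optlen=0 keeps its current meaning */
	assert(tracked_rule(8192, 0, max, true, &a) == 0);
	assert(a == USE_ORIGINAL_BUF);
}
```

The second case in run_examples() is the one I'm worried about: if the
tracking only looks at optlen, that write would be silently dropped.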