On Tue, Oct 26, 2021 at 8:44 AM Stanislav Fomichev <sdf@xxxxxxxxxx> wrote: > > On Mon, Oct 25, 2021 at 5:06 PM YiFei Zhu <zhuyifei@xxxxxxxxxx> wrote: > > > > On Wed, Oct 20, 2021 at 4:28 PM Andrii Nakryiko > > <andrii.nakryiko@xxxxxxxxx> wrote: > > > > > > it's subjective, but "bpf_export_errno" name is quite confusing. What > > > are we "exporting" and where? > > > > > > I actually like Song's proposal for two helpers, > > > bpf_set_err()/bpf_get_err(). It makes the semantics less confusing. I > > > honestly don't remember the requirement to have one combined helper > > > from the BPF office hour discussion, but if there was a good reason > > > for that, please remind us. > > > > > > > + * Description > > > > + * If *errno_val* is positive, set the syscall's return error code; > > > > > > This inversion of error code is also confusing. If we are to return > > > -EXXX, bpf_set_err(EXXX) is quite confusing. > > > > > > > + * if *errno_val* is zero, retrieve the previously set code. > > > > > > Also, are there use cases where zero is the valid "error" (or lack of > > > it, rather). I.e., wouldn't there be cases where you want to clear a > > > previous error? We might have discussed this, sorry if I forgot. > > > > Hmm, originally I thought it's best to assume the underlying > > assumption is that filters may set policies and it would violate it if > > policies become ignored; however one could argue that debugging would > > be a use case for an error-clearing filter. > > > > Let's say we do bpf_set_err()/bpf_get_err(), with the ability to clear > > errors. I'm having trouble thinking of the best way to have it > > interact with the getsockopt "retval" in its context: > > * Let's say the kernel initially sets an error code in the retval. I > > think it would be a surprising behavior if only "retval" but not > > bpf_get_err() shows the error. Therefore we'd need to initialize "err" > > with the "retval" if retval is an error. > > * If we initialize "err" with the "retval", then for a prog to clear > > the error they'd need to clear it twice, once with bpf_set_err(0) with > > and another with ctx->retval = 0. This will immediately break backward > > compatibility. Therefore, we'd need to mirror the setting of > > ctx->retval = 0 to bpf_set_err(0) > > * In that case, what to do if a user uses ctx->retval as a way to pass > > data between filters? I mean, whether ctx->retval is set to 0 or the > > original is only checked after all filters are run. It could be any > > value while the filters are running. > > * A second issue, if we have first a legacy filter that returns 0 to > > set EPERM, and then there's another filter that does a ctx->retval = > > 0. The original behavior would be that the syscall fails with EPERM, > > but if we mirror ctx->retval = 0 to bpf_set_err(0), then that EPERM > > would be cleared. > > > > One of the reasons I liked "export" is that it's slightly clearer that > > this value is strictly from the BPF's side and has nothing to do with > > what the kernel sets (as in the getsockopt case). But yeah I agree > > it's not an ideal name. > > For getsockopt, maybe the best way to go is to point ctx->retval to > run_ctx.errno_val? (i.e., bpf_set_err would be equivalent to doing > ctx->retval = x;). We can leave ctx->retval as a backwards-compatible > legacy way of doing things. For new programs, bpf_set_err would work > universally, regardless of attach type. Any cons here? Is it a concern that AFAICT getsockopt retval may be a positive number whereas the err here must be non-negative? Also the fourth point still stands. If any getsockopt returns 0, original behavior is return -EPERM whereas new behavior, clearing retval will clear -EPERM. YiFei Zhu > > > But either way, if bpf_set_err() accepted <= 0 and used that as error > > > value as-is (> 0 should be rejected, probably) that would make for > > > straightforward logic. Then for getting the current error we can have > > > a well-paired bpf_get_err()? > > > > > > > > > BTW, "errno" is very strongly associated with user-space errno, do we > > > want to have this naming association (this is the reason I used "err" > > > terminology above). > > > > Ack. > > > > YiFei Zhu