Re: [PATCH v2 1/2] livepatch: Remove duplicate warning about missing reliable stacktrace support

Josh Poimboeuf <jpoimboe@xxxxxxxxxx> · Tue, 7 May 2019 16:24:25 -0500

On Tue, May 07, 2019 at 04:28:47PM +0200, Petr Mladek wrote:
> > > > > Also this check is effectively the same as the klp_have_reliable_stack()
> > > > > check which is done in kernel/livepatch/core.c.  So I think it would be
> > > > > clearer and more consistent if the same check is done here:
> > > > > 
> > > > > 	if (!klp_have_reliable_stack())
> > > > > 		return -ENOSYS;
> > > 
> > > Huh, it smells with over engineering to me.
> > 
> > How so?  It makes the code more readable and the generated code should
> > be much better because it becomes a build-time check.
> 
> save_stack_trace_tsk_reliable() returns various error codes.
> We catch a specific one because otherwise the message below
> might be misleading.
> 
> I do not see why we should prevent this error by calling
> a custom hack: klp_have_reliable_stack()?

I wouldn't call it a hack.  It's a simple build-time check.

> Regarding reliability. If anyone changes semantic of
> save_stack_trace_tsk_reliable() error codes, they would likely
> check if all users (one at the moment) handle it correctly.
> 
> On the other hand, the dependency between the -ENOSYS
> return value and klp_have_reliable_stack() is far from
> obvious.

I don't follow your point.

> If we want to discuss and fix this to the death. We should change
> the return value from -ENOSYS to -EOPNOTSUPP. The reason
> is the same as in the commit 375bfca3459db1c5596
> ("livepatch: core: Return EOPNOTSUPP instead of ENOSYS").
> 
> Note that EOPNOTSUPP is the same errno as ENOTSUP, see
> man 3 errno.

Sure, but that's a separate issue.

> > But I think Miroslav's suggestion to revert 1d98a69e5cef would be even
> > better.
> 
> AFAIK, Miroslav wanted to point out that your opinion was inconsistent.

How is my opinion inconsistent?

Is there something wrong with the original approach, which was reverted
with

  1d98a69e5cef ("livepatch: Remove reliable stacktrace check in klp_try_switch_task()")

?

As I mentioned, that has some advantages:

- The generated code is simpler/faster because it uses a build-time
  check.

- The code is more readable IMO.  Especially if the check is done higher
  up the call stack by reverting 1d98a69e5cef.  That way the arch
  support short-circuiting is done in the logical place, before doing
  any more unnecessary work.  It's faster, but also, more importantly,
  it's less surprising.

- It's also more consistent with other code, since the intent of this
  check is the same as the klp_have_reliable_stack() use in
  klp_enabled_patch().

If you disagree with those, please explain why.

> > > > > 	ret = save_stack_trace_tsk_reliable(task, &trace);
> > > > > 
> > > > > 	[ no need to check ret for ENOSYS here ]
> > > > > 
> > > > > Then, IMO, no comment is needed.
> > > > 
> > > > BTW, if you agree with this approach then we can leave the
> > > > WARN_ON_ONCE() in save_stack_trace_tsk_reliable() after all.
> > > 
> > > I really like the removal of the WARN_ON_ONCE(). I consider
> > > it an old fashioned way used when people are lazy to handle
> > > errors. It might make sense when the backtrace helps to locate
> > > the context but the context is well known here. Finally,
> > > WARN() should be used with care. It might cause reboot
> > > with panic_on_warn.
> > 
> > The warning makes the function consistent with the other weak functions
> > in stacktrace.c and clarifies that it should never be called unless an
> > arch has misconfigured something.  And if we aren't even checking the
> > specific ENOSYS error as I proposed then this warning would make the
> > error more obvious.
> 
> I consider both WARN() and error value as superfluous. I like the
> error value because it allows users to handle the situation as
> they need it.

... but if there are no users of the weak function then the point is
moot.

> Best Regards,
> Petr
> 
> PS: This is my last mail in the thread this week. I will eventually
> return to it with a clear head next week. It is all nitpicking from
> my POV and I have better things to do.

I don't think it's helpful to characterize it as nitpicking.  The
details of the code are important.

-- 
Josh