Re: [PATCH 0/4] selftests/nolibc: add user-space 'efault' handler

Willy Tarreau <w@xxxxxx> · Sun, 4 Jun 2023 21:14:29 +0200

On Sun, Jun 04, 2023 at 09:07:25PM +0200, Thomas Weißschuh wrote:
> On 2023-06-04 13:05:18+0200, Willy Tarreau wrote:
> > Hi Zhangjin,
> > 
> > On Tue, May 30, 2023 at 06:47:38PM +0800, Zhangjin Wu wrote:
> > > Hi, Willy, Thomas
> > > 
> > > This is not really meant for merging; it is only demo code to test
> > > whether it is possible to resume with the next test after a bad
> > > pointer access in user-space [1].
> > > 
> > > Besides, a new 'run' command is added to the 'NOLIBC_TEST' environment
> > > variable or arguments to control the number of iterations. This may
> > > be used to test for reentrancy issues, but no failures have been
> > > found so far ;-)
> > 
> > Since the tests we're running are essentially API tests, I'm having
> > a hard time seeing in which case it can be useful to repeat the tests.
> > I'm not necessarily against doing it, I'm used to repeating tests for
> > example in anything sensitive to timing or race conditions, it's just
> > that here I'm not seeing the benefit. And the fact you found no failure
> > is rather satisfying because the opposite would have surprised me.
> > 
> > Regarding the efault handler, I don't think it's a good idea until we
> > have signal+longjmp support in nolibc. Because running different tests
> > with different libcs kind of defeats the purpose of the test in the
> > first place. The reason why I wanted nolibc-test to be portable to at
> > least one other libc is to help the developer figure out whether a
> > failure is in the nolibc syscall they're implementing or in the test
> > itself. If we start to say that some parts cannot be tested the same
> > way, the benefit disappears.
> > 
> > I mentioned previously that I'm not particularly impatient to work on
> > signals and longjmp. But in parallel I understand how this can make the
> > life of some developers easier and even allow widening the spectrum of
> > some tests. Thus, maybe in the end it could be beneficial to make progress
> > on this front and support these. We should make sure that this doesn't
> > inflate the code base however. I guess I'd be fine with ignoring libc-
> > based restarts on EINTR, alt stacks and so on and keeping this minimal
> > (i.e. catch a segfault/bus error/sigill in a test program, or a Ctrl-C
> > in a tiny shell).
> > 
> > Just let us know if you think that's something you could be interested
> > in exploring. There might be differences between architectures, I have
> > not checked.
> 
> If the goal is to handle hard errors like segfaults more gracefully,
> would it not be easier to run each testcase in a subprocess?
> 
> Then we can just check if the child exited successfully.
> 
> It should also be completely architecture agnostic.

Could be, indeed. It would complicate strace debugging a bit, but yeah,
that might be something to think about.

Willy