On Thu, 2023-08-31 at 18:08 +0200, Florian Westphal wrote: > Thomas Haller <thaller@xxxxxxxxxx> wrote: > > Allow to opt-out from the have-real-root check via > > > > NFT_TEST_ROOTLESS=1 ./run-tests.sh > > I don't like this. But its a step in the right direction. > > To me run-tests.sh has following issues/pain points: > - test duration is huge (>10m with debug kernels) > - all tests run in same netns > - tries to unloads kernel modules after each test > > The need for uid 0 wasn't big on my problem list so far because > I mostly run the tests in a VM. But I agree its an issue for > auto-build systems / CI and the like. It also means there is not simple "run-tests" script without setting up a VM/container. I personally avoid to run `sudo some/script` from the internet (although, I execute build scripts from the internet are normal user). Running a substantial subset of tests as non-root, seems essential to me. It may not be essential to everybody, but to some. > > > For that to be useful, we must also unshare the PID and user > > namespace > > and map the root user inside that namespace. > > Are you sure PIDNS unshare is needed for this? Probably not, but does it hurt? I think it should be unshare -f -p -U -n --mount-proc --map-root-user \ bash -xc 'whoami; ip link; sleep 1 & ps aux' > > > Test that don't work without real root should check for > > [ "$NFT_TEST_HAVE_REALROOT" != 1 ] and skip gracefully. > > Thats fine, see my recent RFC to add such environment > variables to check if a particular feature is supported or not. > > What I don't like here is the NFT_TEST_ROOTLESS environment > variable to alter behaviour of run-tests.sh behavior, but see below. If you don't run with real-root, then you are maybe not testing the full thing and miss something. Hence, you have to opt-in to the new rootless functionality with NFT_TEST_ROOTLESS=1. The check is to preserve the previous behavior, which failed without real-root. > > -if [ "$(id -u)" != "0" ] ; then > > +if [ "$NFT_TEST_HAVE_REALROOT" = "" ] ; then > > + # The caller can set NFT_TEST_HAVE_REALROOT to indicate us > > whether we > > + # have real root. They usually don't need, and we detect it > > now based > > + # on `id -u`. Note that we may unshare below, so the check > > inside the > > + # new namespace won't be conclusive. We thus only detect > > once and export > > + # the result. > > + export NFT_TEST_HAVE_REALROOT="$(test "$(id -u)" = "0" && > > echo 1 || echo 0)" > > +fi > > + > > Why not get rid of the check? Just auto-switch to unpriv userns and > error out if that fails. You could just print a warning/notice here > and > then try userns mode. And/or print a notice at the together with the > test summary. Which check? About NFT_TEST_HAVE_REALROOT? We need that, to detect real-root before unshare. Since we already need it internally, it's also documented so that the user could override it. For example, if you develop inside a podman run --privileged -ti fedora:latest then `id -u` would wrongly indicate that the test has real-root. You can override that with `export NFT_TEST_HAVE_REALROOT=0` to run in rootless mode. > > > +if [ "$NFT_TEST_NO_UNSHARE" = 1 ]; then > > + # The user opts-out from unshare. Proceed without. > > Whats the use case? If there is a good one, then i'd prefer a > command > line switch rather than environment. I think command line switches are inferior. They are cumbersome to implement (in the script, the order of options surprisingly matters). They are also more limited in their usage. The script should integrate with a `make check-xyz` step or be called from others scripts, where the command line is not directly editable. Also, the name NFT_TEST_NO_UNSHARE is something unique that you can grep for (unlike the "-g" command line switch). The variable NFT_TEST_NO_UNSHARE is there, because the previous code also expected a failure to `unshare`, then printed a warning and proceeded without separate namespace. I think that fallback is undesirable, I would not want to run the test accidentally in the main namespace. Now, by default it always tries to unshare, and aborts on failure. The variable is there to opt-out from that. > > I think long term all of the following would be good to have: > > 1. run each test in its own netns > 2. get rid of the forced 'nft flush ruleset' and the rmmod calls > 3. Explore parallelisation of tests to reduce total test time > 4. Add a SKIP return value, that tells that the test did not run > (or some other means that allows run-tests.sh to figure out that > a particular test did not run because its known to not work on > current configuration). > > This would avoid false-positive 'all tests passed' when in reality > some test had to 'exit 0' because of a missing feature or lack of > real > root. > > Alternatively we could just make these tests fail and leave it to the > user to figure it out, the normal expectation is for all tests to > pass, > its mostly when run-tests.sh is run on older kernel releases when it > starts acting up. > These are very good points. Parallel execution maybe happen by hooking the tests up as individual make targets. I'd like to integrate tests more into `make check-*`. I'd also like to add unit tests (that is, both statically and dynamically link with libnftables and to be able to unit test code specifically). As a developer, I'd like to type `make check` to run the the unit tests and have a clear make target for the integration tests. IMO "[PATCH nft 0/6] no recursive make" should be done to make that nicer. But I guess that will be controversial. Recursive make seems very popular, all around. Thomas