On Fri, 2023-11-17 at 17:57 +0100, Phil Sutter wrote: > On Fri, Nov 17, 2023 at 05:36:23PM +0100, Pablo Neira Ayuso wrote: > > On Fri, Nov 17, 2023 at 05:16:02PM +0100, Thomas Haller wrote: > > > On Fri, 2023-11-17 at 00:00 +0100, Florian Westphal wrote: > > > > Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx> wrote: > > > > > Hi Thomas, > > > > > > > > > > On Wed, Nov 15, 2023 at 01:36:40PM +0100, Thomas Haller > > > > > wrote: > > > > > > On Wed, 2023-11-15 at 13:30 +0100, Pablo Neira Ayuso wrote: > > > > > [...] > > > > > > > I see _lots_ of DUMP FAIL with kernel 5.4 > > > > > > > > > > > > Hi, > > > > > > > > > > > > Could you provide more details? > > > > > > > > > > > > For example, > > > > > > > > > > > > make -j && ./tests/shell/run-tests.sh > > > > > > tests/shell/testcases/include/0007glob_double_0 -x > > > > > > grep ^ -a -R /tmp/nft-test.latest.*/ > > > > > > > > > > # cat [...]/ruleset-diff.json > > > > > --- testcases/include/dumps/0007glob_double_0.json-nft 2023- > > > > > 11-15 > > > > > 13:27:20.272084254 +0100 > > > > > +++ /tmp/nft-test.20231116-170617.584.lrZzMy/test-testcases- > > > > > include-0007glob_double_0.1/ruleset-after.json 2023-11- > > > > > 16 > > > > > 17:06:18.332535411 +0100 > > > > > @@ -1 +1 @@ > > > > > -{"nftables": [{"metainfo": {"version": "VERSION", > > > > > "release_name": > > > > > "RELEASE_NAME", "json_schema_version": 1}}, {"table": > > > > > {"family": > > > > > "ip", "name": "x", "handle": 1}}, {"table": {"family": "ip", > > > > > "name": "y", "handle": 2}}]} > > > > > +{"nftables": [{"metainfo": {"version": "VERSION", > > > > > "release_name": > > > > > "RELEASE_NAME", "json_schema_version": 1}}, {"table": > > > > > {"family": > > > > > "ip", "name": "x", "handle": 158}}, {"table": {"family": > > > > > "ip", > > > > > "name": "y", "handle": 159}}]} > > > > > > > > > > It seems that handles are a problem in this diff. > > > > > > > > Are you running tests with -s option? > > > > > > > > In that case, modules are removed after each test. > > > > > > > > I suspect its because we can then hit -EAGAIN mid-transaction > > > > because module is missing (again), then replay logic does its > > > > thing. > > > > > > > > But the handle generator isn't transaction aware, > > > > so it has advanced vs. the aborted partial transaction. > > > > > > > I'm not sure what to do here. > > > > > > a combination of: > > > > > > a) make an effort, that kernel behavior is consistent and > > > reproducible. > > > Stable output seems important to me, and the automatic loading of > > > a > > > kernel module should not make a difference. This is IMO a bug. > > > > This is not a bug in the kernel. The kernel guarantees that the > > handle > > is unique, but the handle allocation strategy is up to the kernel. > > Userspace cannot forecast what handle will get, such thing might > > lead > > to easy to break assumptions from userspace. > > > > > b) let `nft -j list ruleset` honor (the lack of) `--handle` > > > option and > > > not print those handles. That bugfix would change behavior, so > > > maybe > > > instead add a "--no-handle" option for `nft -j` dumps. > > > > > > Will honoring -a/--handle break firewalld? I think it is the main > > user > > of the JSON API. That might help disentangle if this makes sense or > > not and what the chances of breaking third party applications are. > > > > I'd prefer not to see a --no-handle that will only work for JSON > > and > > that is only useful for this test infrastructure (noone else asked > > for > > this). > > > > > c) sanitize the output with the sed command (my other mail). > > > > > > This also means, that the .json-nft dumps won't work, if you run > > > without `unshare`. IMO, the mode without unshare should not be > > > supported anymore. But if it's deemed important, then it requires > > > b) or > > > c) or detect the case and skip the diffs with .json-nft. > > What is the problem without unshare? Looking at your patch, it seems > possible to drop the handle attributes in json-sanitize-ruleset.sh. Yes, (b) would suffice. I said "or" :) No further problem, but without-unshare seems not a useful thing to support. The test-run takes significantly longer, interferes with the caller's netns and requires CAP_NET_ADMIN. > > > a) is no-go (kernel update to make test infrastructure or to allow > > userspace application to make fragile assumptions on how handles > > are > > allocated is not correct). > > > > b) needs to evaluated, you maintain firewalld, let us know. > > Given the inherent importance of the handle value for ruleset > manipulations, I assume *any* application will need to be updated to > pass --handle (or the libnftables-equivalent) to remain functional. Right. So a "--no-handle" / NFT_CTX_OUTPUT_NO_HANDLE flag for JSON output? Thomas