On Fri, Nov 17, 2023 at 06:23:35PM +0100, Thomas Haller wrote:
> On Fri, 2023-11-17 at 18:11 +0100, Phil Sutter wrote:
> > On Fri, Nov 17, 2023 at 06:06:16PM +0100, Thomas Haller wrote:
> > > On Fri, 2023-11-17 at 17:57 +0100, Phil Sutter wrote:
> > > > On Fri, Nov 17, 2023 at 05:36:23PM +0100, Pablo Neira Ayuso
> > > > wrote:
> > > > > On Fri, Nov 17, 2023 at 05:16:02PM +0100, Thomas Haller wrote:
> > > > > > On Fri, 2023-11-17 at 00:00 +0100, Florian Westphal wrote:
> > > > > > > Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx> wrote:
> > > > > > > > Hi Thomas,
> > > > > > > >
> > > > > > > > On Wed, Nov 15, 2023 at 01:36:40PM +0100, Thomas Haller
> > > > > > > > wrote:
> > > > > > > > > On Wed, 2023-11-15 at 13:30 +0100, Pablo Neira Ayuso
> > > > > > > > > wrote:
> > > > > > > > [...]
> > > > > > > > > > I see _lots_ of DUMP FAIL with kernel 5.4
> > > > > > > > >
> > > > > > > > > Hi,
> > > > > > > > >
> > > > > > > > > Could you provide more details?
> > > > > > > > >
> > > > > > > > > For example,
> > > > > > > > >
> > > > > > > > > make -j && ./tests/shell/run-tests.sh
> > > > > > > > > tests/shell/testcases/include/0007glob_double_0 -x
> > > > > > > > > grep ^ -a -R /tmp/nft-test.latest.*/
> > > > > > > >
> > > > > > > > # cat [...]/ruleset-diff.json
> > > > > > > > --- testcases/include/dumps/0007glob_double_0.json-nft
> > > > > > > > 2023-
> > > > > > > > 11-15
> > > > > > > > 13:27:20.272084254 +0100
> > > > > > > > +++ /tmp/nft-test.20231116-170617.584.lrZzMy/test-
> > > > > > > > testcases-
> > > > > > > > include-0007glob_double_0.1/ruleset-after.json 2023-
> > > > > > > > 11-
> > > > > > > > 16
> > > > > > > > 17:06:18.332535411 +0100
> > > > > > > > @@ -1 +1 @@
> > > > > > > > -{"nftables": [{"metainfo": {"version": "VERSION",
> > > > > > > > "release_name":
> > > > > > > > "RELEASE_NAME", "json_schema_version": 1}}, {"table":
> > > > > > > > {"family":
> > > > > > > > "ip", "name": "x", "handle": 1}}, {"table": {"family":
> > > > > > > > "ip",
> > > > > > > > "name": "y", "handle": 2}}]}
> > > > > > > > +{"nftables": [{"metainfo": {"version": "VERSION",
> > > > > > > > "release_name":
> > > > > > > > "RELEASE_NAME", "json_schema_version": 1}}, {"table":
> > > > > > > > {"family":
> > > > > > > > "ip", "name": "x", "handle": 158}}, {"table": {"family":
> > > > > > > > "ip",
> > > > > > > > "name": "y", "handle": 159}}]}
> > > > > > > >
> > > > > > > > It seems that handles are a problem in this diff.
> > > > > > >
> > > > > > > Are you running tests with -s option?
> > > > > > >
> > > > > > > In that case, modules are removed after each test.
> > > > > > >
> > > > > > > I suspect its because we can then hit -EAGAIN mid-
> > > > > > > transaction
> > > > > > > because module is missing (again), then replay logic does
> > > > > > > its
> > > > > > > thing.
> > > > > > >
> > > > > > > But the handle generator isn't transaction aware,
> > > > > > > so it has advanced vs. the aborted partial transaction.
> > > > > >
> > > > > > > I'm not sure what to do here.
> > > > > >
> > > > > > a combination of:
> > > > > >
> > > > > > a) make an effort, that kernel behavior is consistent and
> > > > > > reproducible.
> > > > > > Stable output seems important to me, and the automatic
> > > > > > loading of
> > > > > > a
> > > > > > kernel module should not make a difference. This is IMO a
> > > > > > bug.
> > > > >
> > > > > This is not a bug in the kernel. The kernel guarantees that the
> > > > > handle
> > > > > is unique, but the handle allocation strategy is up to the
> > > > > kernel.
> > > > > Userspace cannot forecast what handle will get, such thing
> > > > > might
> > > > > lead
> > > > > to easy to break assumptions from userspace.
> > > > >
> > > > > > b) let `nft -j list ruleset` honor (the lack of) `--handle`
> > > > > > option and
> > > > > > not print those handles. That bugfix would change behavior,
> > > > > > so
> > > > > > maybe
> > > > > > instead add a "--no-handle" option for `nft -j` dumps.
> > > > > >
> > > > >
> > > > > Will honoring -a/--handle break firewalld? I think it is the
> > > > > main
> > > > > user
> > > > > of the JSON API. That might help disentangle if this makes
> > > > > sense or
> > > > > not and what the chances of breaking third party applications
> > > > > are.
> > > > >
> > > > > I'd prefer not to see a --no-handle that will only work for
> > > > > JSON
> > > > > and
> > > > > that is only useful for this test infrastructure (noone else
> > > > > asked
> > > > > for
> > > > > this).
> > > > >
> > > > > > c) sanitize the output with the sed command (my other mail).
> > > > > >
> > > > > > This also means, that the .json-nft dumps won't work, if you
> > > > > > run
> > > > > > without `unshare`. IMO, the mode without unshare should not
> > > > > > be
> > > > > > supported anymore. But if it's deemed important, then it
> > > > > > requires
> > > > > > b) or
> > > > > > c) or detect the case and skip the diffs with .json-nft.
> > > >
> > > > What is the problem without unshare? Looking at your patch, it
> > > > seems
> > > > possible to drop the handle attributes in json-sanitize-
> > > > ruleset.sh.
> > >
> > > Yes, (b) would suffice. I said "or" :)
> > >
> > > No further problem, but without-unshare seems not a useful thing to
> > > support. The test-run takes significantly longer, interferes with
> > > the
> > > caller's netns and requires CAP_NET_ADMIN.
> >
> > No, I was wondering why with option (c) "This also means, that the
> > .json-nft dumps won't work, if you run without `unshare`."
> >
> > Because I vote for that option. ;)
>
> Yes, sorry. I got confused with my own numbering :)
>
> I meant also c)
>
> > > > >
> > > > > a) is no-go (kernel update to make test infrastructure or to
> > > > > allow
> > > > > userspace application to make fragile assumptions on how
> > > > > handles
> > > > > are
> > > > > allocated is not correct).
> > > > >
> > > > > b) needs to evaluated, you maintain firewalld, let us know.
> > > >
> > > > Given the inherent importance of the handle value for ruleset
> > > > manipulations, I assume *any* application will need to be updated
> > > > to
> > > > pass --handle (or the libnftables-equivalent) to remain
> > > > functional.
> > >
> > > Right. So a "--no-handle" / NFT_CTX_OUTPUT_NO_HANDLE flag for JSON
> > > output?
> >
> > Should not be needed. IIUC, the test infrastructure you're about to
> > introduce sanitizes the JSON output already anyway, right?
>
> Right. c) alone may very well suffice.
>
> I just sent a patch to that amount.
>
>
> I still think that `nft -j` ignoring the lack of "--no-handle" /
> NFT_CTX_OUTPUT_NO_HANDLE is a bug. At the very last a documentation
> bug.

It is per design. Same with --numeric. JSON formatting is meant for
programmatic consumption, no point in increasing readability.

I don't see a reason why one would not want the handle attribute
included in dumps apart from your use-case and there is a solution at
hand. See for instance how nft-test.py strips the handle attribute when
comparing JSON output against the record or creating *.json.got files
for missing records.

Cheers, Phil
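
For illustration only, here is a minimal Python sketch of the kind of
handle stripping described above. It is not the code from nft-test.py,
and the helper name strip_handles() and the variable names are
hypothetical. The two JSON documents are the "-" and "+" lines from the
ruleset-diff.json quoted earlier in this thread, with the wrapped lines
joined.

```python
import json


def strip_handles(node):
    """Return a copy of a decoded JSON value without any "handle" keys.

    Hypothetical helper: it removes every key named "handle" at any
    nesting depth, which is enough for comparing ruleset dumps.
    """
    if isinstance(node, dict):
        return {key: strip_handles(value)
                for key, value in node.items() if key != "handle"}
    if isinstance(node, list):
        return [strip_handles(item) for item in node]
    return node


# Recorded dump (the "-" line of the quoted diff).
expected = json.loads(
    '{"nftables": [{"metainfo": {"version": "VERSION", '
    '"release_name": "RELEASE_NAME", "json_schema_version": 1}}, '
    '{"table": {"family": "ip", "name": "x", "handle": 1}}, '
    '{"table": {"family": "ip", "name": "y", "handle": 2}}]}'
)

# Dump produced by the test run (the "+" line of the quoted diff).
actual = json.loads(
    '{"nftables": [{"metainfo": {"version": "VERSION", '
    '"release_name": "RELEASE_NAME", "json_schema_version": 1}}, '
    '{"table": {"family": "ip", "name": "x", "handle": 158}}, '
    '{"table": {"family": "ip", "name": "y", "handle": 159}}]}'
)

# The raw dumps differ only in the kernel-assigned handle values.
raw_equal = expected == actual
# After stripping the handles, the two rulesets compare equal.
sanitized_equal = strip_handles(expected) == strip_handles(actual)
```

With the handles removed, the comparison no longer depends on how the
kernel allocated them, so the same recorded dump should match whether or
not earlier (aborted) transactions advanced the handle counter.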