On Fri, Nov 17, 2023 at 05:16:02PM +0100, Thomas Haller wrote: > On Fri, 2023-11-17 at 00:00 +0100, Florian Westphal wrote: > > Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx> wrote: > > > Hi Thomas, > > > > > > On Wed, Nov 15, 2023 at 01:36:40PM +0100, Thomas Haller wrote: > > > > On Wed, 2023-11-15 at 13:30 +0100, Pablo Neira Ayuso wrote: > > > [...] > > > > > I see _lots_ of DUMP FAIL with kernel 5.4 > > > > > > > > Hi, > > > > > > > > Could you provide more details? > > > > > > > > For example, > > > > > > > > make -j && ./tests/shell/run-tests.sh > > > > tests/shell/testcases/include/0007glob_double_0 -x > > > > grep ^ -a -R /tmp/nft-test.latest.*/ > > > > > > # cat [...]/ruleset-diff.json > > > --- testcases/include/dumps/0007glob_double_0.json-nft 2023-11-15 > > > 13:27:20.272084254 +0100 > > > +++ /tmp/nft-test.20231116-170617.584.lrZzMy/test-testcases- > > > include-0007glob_double_0.1/ruleset-after.json 2023-11-16 > > > 17:06:18.332535411 +0100 > > > @@ -1 +1 @@ > > > -{"nftables": [{"metainfo": {"version": "VERSION", "release_name": > > > "RELEASE_NAME", "json_schema_version": 1}}, {"table": {"family": > > > "ip", "name": "x", "handle": 1}}, {"table": {"family": "ip", > > > "name": "y", "handle": 2}}]} > > > +{"nftables": [{"metainfo": {"version": "VERSION", "release_name": > > > "RELEASE_NAME", "json_schema_version": 1}}, {"table": {"family": > > > "ip", "name": "x", "handle": 158}}, {"table": {"family": "ip", > > > "name": "y", "handle": 159}}]} > > > > > > It seems that handles are a problem in this diff. > > > > Are you running tests with -s option? > > > > In that case, modules are removed after each test. > > > > I suspect its because we can then hit -EAGAIN mid-transaction > > because module is missing (again), then replay logic does its > > thing. > > > > But the handle generator isn't transaction aware, > > so it has advanced vs. the aborted partial transaction. > > > I'm not sure what to do here. > > a combination of: > > a) make an effort, that kernel behavior is consistent and reproducible. > Stable output seems important to me, and the automatic loading of a > kernel module should not make a difference. This is IMO a bug. This is not a bug in the kernel. The kernel guarantees that the handle is unique, but the handle allocation strategy is up to the kernel. Userspace cannot forecast what handle will get, such thing might lead to easy to break assumptions from userspace. > b) let `nft -j list ruleset` honor (the lack of) `--handle` option and > not print those handles. That bugfix would change behavior, so maybe > instead add a "--no-handle" option for `nft -j` dumps. Will honoring -a/--handle break firewalld? I think it is the main user of the JSON API. That might help disentangle if this makes sense or not and what the chances of breaking third party applications are. I'd prefer not to see a --no-handle that will only work for JSON and that is only useful for this test infrastructure (noone else asked for this). > c) sanitize the output with the sed command (my other mail). > > This also means, that the .json-nft dumps won't work, if you run > without `unshare`. IMO, the mode without unshare should not be > supported anymore. But if it's deemed important, then it requires b) or > c) or detect the case and skip the diffs with .json-nft. a) is no-go (kernel update to make test infrastructure or to allow userspace application to make fragile assumptions on how handles are allocated is not correct). b) needs to evaluated, you maintain firewalld, let us know.