On Wed, Nov 13, 2024 at 12:01:07PM +0100, Pablo Neira Ayuso wrote:
> Hi Phil,
>
> On Tue, Nov 12, 2024 at 09:52:35PM +0100, Phil Sutter wrote:
> > Hi Pablo,
> >
> > On Thu, Oct 31, 2024 at 11:04:11PM +0100, Pablo Neira Ayuso wrote:
> > > Side note: While profiling, I can still see lots of json objects;
> > > this results in memory consumption that is 5 times that of the
> > > native representation. Error reporting is also lagging behind; it
> > > should be possible to add a json_t pointer to struct location to
> > > relate expressions and json objects.
> >
> > I can't quite reproduce this. When restoring a ruleset with ~12.7k
> > elements in individual standard syntax commands, valgrind prints:
> >
> > | HEAP SUMMARY:
> > |     in use at exit: 59,802 bytes in 582 blocks
> > |   total heap usage: 954,970 allocs,
> > |                     954,388 frees,
> > |                     18,300,874 bytes allocated
> >
> > Repeating the same in JSON syntax, I get:
> >
> > | HEAP SUMMARY:
> > |     in use at exit: 61,592 bytes in 647 blocks
> > |   total heap usage: 1,200,164 allocs,
> > |                     1,199,517 frees,
> > |                     38,612,257 bytes allocated
> >
> > So this is 38MB vs. 18MB? At least far from the mentioned 5 times.
> > Would you mind sharing how you got to that number?
> >
> > Please kindly find my reproducers attached for reference.
>
> I am using valgrind --tool=massif to measure memory consumption in
> userspace.
>
> I used these two files:
>
> - set-init.json-nft, to create the table and set.
> - set-65535.nft-json, to create a small set with 64K elements.
>
> Then I run:
>
>   valgrind --tool=massif nft -f set-65535.nft-json
>
> There is a tool to render the output:
>
>   ms_print massif.out.XYZ

Thanks! I see it now.

Interestingly, I had tried feeding the ruleset in on stdin, and that
makes standard syntax use more memory as well. With the rulesets being
read from a file, standard syntax indeed requires just 7MB while JSON
uses 35MB.

> At "peak time" in heap memory consumption, I can see 60% is consumed
> in json objects.

The problem with jansson in that regard is that it parses the whole
input recursively. In theory it would be possible to parse just the
outer object and to defer parsing of array elements until they are
accessed.

Interestingly, I managed to reduce memory consumption by 30% by
inserting a json_decref() call here:

| @@ -3496,6 +3498,7 @@ static struct cmd *json_parse_cmd_add_element(struct json_ctx *ctx,
|  	h.set.name = xstrdup(h.set.name);
|  
|  	expr = json_parse_set_expr(ctx, "elem", tmp);
| +	json_decref(tmp);
|  	if (!expr) {
|  		json_error(ctx, "Invalid set.");
|  		handle_free(&h);

This does not fix a memleak, though: 'tmp' is assigned by a call to
json_unpack(... "s:o" ...) and thus does not have its reference count
incremented. So AIUI, the extra json_decref() frees parts of the JSON
object tree prematurely, and later accesses become use-after-free:
--echo mode, for instance, aborts with a "corrupted double-linked
list" error.

> I am looking at the commands and expressions to reduce memory
> consumption there. The result of that work will also help json
> support.

Cheers, Phil
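
P.S.: For reference, here is a minimal standalone sketch (plain
jansson, not nft code) of the reference-counting semantics described
above. The lowercase "o" conversion in json_unpack() returns a
borrowed reference, so following it with json_decref() steals the
reference held by the tree itself; the uppercase "O" conversion takes
an owned reference, which a later json_decref() balances:

| /* build with: gcc demo.c -ljansson */
| #include <jansson.h>
|
| int main(void)
| {
| 	json_error_t err;
| 	json_t *root = json_loads("{\"elem\": [1, 2, 3]}", 0, &err);
| 	json_t *tmp;
|
| 	if (!root)
| 		return 1;
|
| 	/* "s:o" borrows: the only reference to the array is the one
| 	 * held by root, so a json_decref(tmp) here would free the
| 	 * array behind root's back and turn any later access, and the
| 	 * final json_decref(root), into use-after-free. That matches
| 	 * the "corrupted double-linked list" abort in --echo mode. */
| 	json_unpack(root, "{s:o}", "elem", &tmp);
|
| 	/* "s:O" increments the refcount, so this pair is balanced: */
| 	json_unpack(root, "{s:O}", "elem", &tmp);
| 	json_decref(tmp);
|
| 	json_decref(root);
| 	return 0;
| }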
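
P.P.S.: On the idea of parsing array elements only when they are
accessed: jansson offers no such lazy mode, it always builds the full
tree in one pass. Just to illustrate the potential savings, and under
the (loud) assumption of a restructured input carrying one element per
line, which is not something nft accepts today, peak memory stays at a
single parsed element when each one is loaded and freed individually.
process_element() below is a hypothetical stand-in for building a
struct expr:

| #include <jansson.h>
| #include <stdio.h>
| #include <stdlib.h>
|
| /* hypothetical stand-in for turning one element into a struct expr */
| static void process_element(json_t *elem)
| {
| 	char *s = json_dumps(elem, JSON_COMPACT | JSON_ENCODE_ANY);
|
| 	printf("element: %s\n", s ? s : "(unprintable)");
| 	free(s);
| }
|
| int main(void)
| {
| 	char line[4096];
|
| 	while (fgets(line, sizeof(line), stdin)) {
| 		json_error_t err;
| 		json_t *elem = json_loads(line, JSON_DECODE_ANY, &err);
|
| 		if (!elem) {
| 			fprintf(stderr, "parse error: %s\n", err.text);
| 			return 1;
| 		}
| 		process_element(elem);
| 		json_decref(elem);	/* gone before the next line is parsed */
| 	}
| 	return 0;
| }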