Hi Stefano, A bit of late feedback on your new datastructure. Did you tests with 8-bits grouping instead of 4-bits? Assuming a concatenation of 12 bytes (each field 4 bytes, hence 3 fields): * Using 4-bits groups: the number of buckets is 2^4 = 16 multiplied by the bucket word (assuming one long word, 8 bytes, 64 pipapo rules) is 16 * 8 = 128 bytes per group-row in the looking table. Then, the number of group-rows is 8 given that 32 bits, then 32 / 4 = 8 group-rows. 8 * 128 bytes = 1024 bytes per lookup table. Assuming 3 fields, then this is 1024 * 3 = 3072 bytes. * Using 8-bits groups: 2^8 = 256, then 256 * 8 = 2048 bytes per group-row. Then, 32 / 8 = 4 group-rows in total. 4 * 2048 bytes = 8192 bytes per lookup table. Therefore, 3 * 8192 = 24576 bytes. Still rather small. This is missing the mapping table that links the lookup tables in the memory counting. And I'm assuming that the number of pipapo rules in the lookup table fits into 64-bits bucket long word. Anyway, my understanding is that the more bits you use for grouping, the larger the lookup table becomes. Still, both look very small in terms of memory consumption for these days. I'm just telling this because the C implementation can probably get better numbers at the cost of consuming more memory? Probably do this at some point? BTW, with a few more knobs it should be possible to integrate better this datastructure into the transaction infrastructure, this can be done incrementally. More questions below. On Wed, Jan 22, 2020 at 12:17:55AM +0100, Stefano Brivio wrote: [...] > diff --git a/net/netfilter/nft_set_pipapo.c b/net/netfilter/nft_set_pipapo.c > new file mode 100644 > index 000000000000..f0cb1e13af50 > --- /dev/null > +++ b/net/netfilter/nft_set_pipapo.c [...] > + * > + * rule indices in current field: 0 1 2 > + * map to rules in next field: 0 1 1 > + * > + * the new result bitmap will be 0x02: rule 1 was set, and rule 1 will be > + * set. > + * > + * We can now extend this example to cover the second iteration of the step > + * above (lookup and AND bitmap): assuming the port field is > + * 2048 < 0 0 5 0 >, with starting result bitmap 0x2, and lookup table > + * for "port" field from pre-computation example: > + * > + * :: > + * > + * bucket > + * group 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 > + * 0 0,1 > + * 1 0,1 > + * 2 0 1 > + * 3 0,1 > + * > + * operations are: 0x2 & 0x3 [bucket 0] & 0x3 [bucket 0] & 0x2 [bucket 5] > + * & 0x3 [bucket 0], resulting bitmap is 0x2. > + * > + * - if this is the last field in the set, look up the value from the mapping > + * array corresponding to the final result bitmap > + * > + * Example: 0x2 resulting bitmap from 192.168.1.5:2048, mapping array for > + * last field from insertion example: > + * > + * :: > + * > + * rule indices in last field: 0 1 > + * map to elements: 0x42 0x66 Should this be instead? rule indices in last field: 0 1 map to elements: 0x66 0x42 If the resulting bitmap is 0x2, then this is actually pointing to rule index 1 in this lookup table, that is the 2048. > + * the matching element is at 0x42. Hence, the matching 0x42 element. Otherwise, I don't understand how to interpret the "result bitmap". I thought this contains the matching pipapo rule index that is expressed as a bitmask. [...] > +static int pipapo_refill(unsigned long *map, int len, int rules, > + unsigned long *dst, union nft_pipapo_map_bucket *mt, > + bool match_only) > +{ > + unsigned long bitset; > + int k, ret = -1; > + > + for (k = 0; k < len; k++) { > + bitset = map[k]; > + while (bitset) { > + unsigned long t = bitset & -bitset; > + int r = __builtin_ctzl(bitset); > + int i = k * BITS_PER_LONG + r; > + > + if (unlikely(i >= rules)) { > + map[k] = 0; > + return -1; > + } > + > + if (unlikely(match_only)) { Not a big issue, but this branch holds true for the last field, my understanding for unlikely() is that it should be used for paths where 99.99...% is not going to happen. Not a big issue, just that when reading the code I got confused this is actually likely to happen. > + bitmap_clear(map, i, 1); > + return i; > + } > + > + ret = 0; > + > + bitmap_set(dst, mt[i].to, mt[i].n); > + > + bitset ^= t; > + } > + map[k] = 0; > + } > + > + return ret; > +}>