Hello all,

I have just found out about this qp-trie work, and I'm pleased to hear that it is looking promising for you! I have a few very broad observations:

The "q" in qp-trie doesn't have to stand for "quadbit". There's a tradeoff between branch factor, maximum key length, and size of branch node. The greater the branch factor, the fewer indirections needed to traverse the tree; but if you go too wide then prefetching is less effective and branch nodes get bigger.

I found that 5 bits was the sweet spot (32-wide bitmap, 30ish bit key length) - indexing 5-bit mouthfuls out of the key is HORRID but it was measurably faster than 4 bits. 6 bits (64 bits of bitmap) grew nodes from 16 bytes to 24 bytes, and it ended up slower.

Your interior nodes are much bigger than mine, so you might find the tradeoff is different. I encourage you to try it out.

I saw there has been some discussion about locking and RCU.

My current project is integrating a qp-trie into BIND, with the aim of replacing its old red-black tree for searching DNS records. It's based on a concurrent qp-trie that I prototyped in NSD (a smaller and simpler DNS server than BIND).

My strategy is based on a custom allocator for interior nodes. This has two main effects:

  * Node references are now 32-bit indexes into the allocator's pool, instead of 64-bit pointers; nodes are 12 bytes instead of 16 bytes.

  * The allocator supports copy-on-write and safe memory reclamation with a fairly small overhead: 3 x 32-bit counters per memory chunk (each chunk is roughly page sized).

I wrote some notes when the design was new, but things have changed since then.
https://dotat.at/@/2021-06-23-page-based-gc-for-qp-trie-rcu.html

For memory reclamation the interior nodes get moved / compacted. It's a kind of garbage collector, but easy-mode because the per-chunk counters accurately indicate when compaction is worthwhile. I've written some notes on my several failed GC experiments; the last / current attempt seems (by and large) good enough.
https://dotat.at/@/2022-06-22-compact-qp.html

For exterior / leaf nodes, I'm using atomic refcounts to know when they can be reclaimed. The caller is responsible for COWing its leaves when necessary.

Updates to the tree are transactional in style, and do not block readers: a single writer gets the write mutex, makes whatever changes it needs (copying as required), then commits by flipping the tree's root. After a commit it can free unused chunks. (Compaction can be part of an update transaction or a transaction of its own.)

I'm currently using a reader-writer lock for the tree root, but I designed it with liburcu in mind, while trying to keep things simple. This strategy is very heavily biased in favour of readers, which suits DNS servers. I don't know enough about BPF to have any idea what kind of update traffic you need to support.

At the moment I am reworking and simplifying my transaction and reclamation code and it's all very broken. I guess this isn't the best possible time to compare notes on qp-trie variants, but I'm happy to hear from others who have code and ideas to share.

-- 
Tony Finch  <dot@xxxxxxxx>  https://dotat.at/
Mull of Kintyre to Ardnamurchan Point: East or southeast 4 to 6, increasing 6 to gale 8 for a time. Smooth or slight in eastern shelter, otherwise slight or moderate. Rain or showers. Good, occasionally poor.
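
To make the 12-byte interior node and the bitmap-based child lookup concrete, here is a minimal C sketch, assuming 5-bit nibbles (so a 32-wide bitmap) and 32-bit "twig" indexes into a node pool as described in the message. The struct layout and names (qp_node, twigs, and so on) are illustrative guesses, not BIND's or NSD's actual definitions.

    #include <stdint.h>

    struct qp_node {
        uint32_t bitmap; /* which of the 32 possible nibble values have a child */
        uint32_t offset; /* position of this node's nibble within the key */
        uint32_t twigs;  /* pool index of the first child node */
    };

    /* Number of children is the popcount of the bitmap. */
    static inline unsigned
    qp_twig_count(const struct qp_node *n)
    {
        return (unsigned)__builtin_popcount(n->bitmap);
    }

    /*
     * Find the child for a given nibble: count the bitmap bits set
     * below it, then add that count to the index of the first child.
     */
    static inline uint32_t
    qp_twig_index(const struct qp_node *n, unsigned nibble)
    {
        uint32_t below = n->bitmap & (((uint32_t)1 << nibble) - 1);
        return n->twigs + (uint32_t)__builtin_popcount(below);
    }

Using a 32-bit pool index instead of a 64-bit pointer is what shrinks the node from 16 bytes to 12 in this sketch.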
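
The "indexing 5-bit mouthfuls is HORRID" remark is about fields that straddle byte boundaries. A simple (untuned) way to pull the i'th 5-bit nibble out of a key, assuming the key buffer is padded with at least one extra zero byte so the two-byte read is always in bounds:

    #include <stdint.h>

    static inline unsigned
    key_nibble5(const uint8_t *key, uint32_t i)
    {
        uint32_t bit   = i * 5;    /* absolute bit position of the field */
        uint32_t byte  = bit / 8;
        uint32_t shift = bit % 8;  /* 0..7: where the field starts in key[byte] */

        /* Read two bytes big-endian so a straddling field is covered. */
        unsigned word = ((unsigned)key[byte] << 8) | key[byte + 1];
        return (word >> (11 - shift)) & 0x1f;
    }

With 4-bit nibbles the extract is just a shift and mask within a single byte, which is why it is so much more pleasant to write, even though 5 bits measured faster in the tests described above.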
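
A minimal sketch of the atomic refcount idea for leaf nodes, using C11 atomics; the qp_leaf type and field names are hypothetical, and the real code presumably ties this into the caller-driven COW rules mentioned above.

    #include <stdatomic.h>
    #include <stdlib.h>

    struct qp_leaf {
        atomic_uint refs;
        /* key, value, ... */
    };

    static inline void
    leaf_ref(struct qp_leaf *leaf)
    {
        atomic_fetch_add_explicit(&leaf->refs, 1, memory_order_relaxed);
    }

    static inline void
    leaf_unref(struct qp_leaf *leaf)
    {
        /* The thread that drops the last reference frees the leaf. */
        if (atomic_fetch_sub_explicit(&leaf->refs, 1, memory_order_acq_rel) == 1)
            free(leaf);
    }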
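
The single-writer transaction shape described above (take the write lock, copy as required, commit by flipping the root) could look roughly like this. For brevity the sketch publishes through a bare atomic root rather than the reader-writer lock or liburcu scheme the message mentions, cow_apply_changes and struct qp_change are made-up placeholders for whatever tree surgery a transaction performs, and safe reclamation of the old chunks is deliberately left out.

    #include <pthread.h>
    #include <stdatomic.h>

    struct qp_node;    /* interior node, as sketched earlier */
    struct qp_change;  /* hypothetical description of one update */

    /* hypothetical: build a modified copy of the affected path */
    struct qp_node *cow_apply_changes(struct qp_node *root,
                                      const struct qp_change *changes);

    struct qp_trie {
        _Atomic(struct qp_node *) root;  /* readers load this without blocking */
        pthread_mutex_t write_lock;      /* serialises writers */
    };

    /* Readers take a snapshot of the root and walk it. */
    static struct qp_node *
    qp_read_root(struct qp_trie *t)
    {
        return atomic_load_explicit(&t->root, memory_order_acquire);
    }

    /* One writer at a time: copy what must change, then publish. */
    static void
    qp_update(struct qp_trie *t, const struct qp_change *changes)
    {
        pthread_mutex_lock(&t->write_lock);

        struct qp_node *old_root =
            atomic_load_explicit(&t->root, memory_order_relaxed);
        struct qp_node *new_root = cow_apply_changes(old_root, changes);

        /* Commit: readers switch to the new tree at this single store. */
        atomic_store_explicit(&t->root, new_root, memory_order_release);

        pthread_mutex_unlock(&t->write_lock);
        /* After the commit, chunks no reader can still reach may be freed. */
    }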