On Mon, 16 May 2022 20:16:53 +0200
Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx> wrote:

> Hi Stefano,
>
> On Thu, May 12, 2022 at 08:34:21PM +0200, Stefano Brivio wrote:
> > In the overlap detection performed as part of the insertion operation,
> > we have to skip nodes that are not active in the current generation or
> > expired. This was done by adding several conditions in overlap decision
> > clauses, which made it hard to check for correctness, and debug the
> > actual issue this patch is fixing.
> >
> > Consolidate these conditions into a single clause, so that we can
> > proceed with debugging and fixing the following issue.
> >
> > Case b3. described in comments covers the insertion of a start
> > element after an existing end, as a leaf. If all the descendants of
> > a given element are inactive, functionally, for the purposes of
> > overlap detection, we still have to treat this as a leaf, but we don't.
> >
> > If, in the same transaction, we remove a start element to the right,
> > remove an end element to the left, and add a start element to the right
> > of an existing, active end element, we don't hit case b3. For example:
> >
> > - existing intervals: 1-2, 3-5, 6-7
> > - transaction: delete 3-5, insert 4-5
> >
> > node traversal might happen as follows:
> > - 2 (active end)
> > - 5 (inactive end)
> > - 3 (inactive start)
> >
> > when we add 4 as start element, we should simply consider 2 as
> > preceding end, and consider it as a leaf, because interval 3-5 has been
> > deleted. If we don't, we'll report an overlap because we forgot about
> > this.
> >
> > Add an additional flag which is set as we find an active end, and reset
> > it if we find an active start (to the left). If we finish the traversal
> > with this flag set, it means we need to functionally consider the
> > previous active end as a leaf, and allow insertion instead of reporting
> > overlap.
>
> I can still trigger EEXIST with deletion of existing interval. It
> became harder to reproduce after this patch.
>
> After hitting EEXIST, if I do:
>
>         echo "flush ruleset" > test.nft
>         nft list ruleset >> test.nft
>
> to dump the existing ruleset, then I run the delete element command
> again to remove the interval and it works. Before this patch I could
> reproduce it by reloading an existing ruleset dump.
>
> I'm running the script that I'm attaching manually, just one manual
> invocation after another.

Ouch, sorry for that. It looks like there's another case where inactive
elements still affect overlap detection in an unexpected way... at
least with the structure of this patch it should be easier to find, I'm
looking into that now.

-- 
Stefano
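
For readers following the thread, the flag logic described in the quoted
commit message can be modelled roughly as below. This is a simplified
sketch in Python, not the kernel's rbtree code: the Node type, the
preceding_end_is_leaf() helper and the skip_inactive switch are all
hypothetical names introduced for illustration, and applying "to the
left" to both starts and ends is an assumption of this model.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for a set element visited during insertion."""
    key: int       # interval boundary value
    is_end: bool   # True for an end element, False for a start element
    active: bool   # False if deleted in this transaction or expired


def preceding_end_is_leaf(visited: Iterable[Node], new_start: int,
                          skip_inactive: bool = True) -> bool:
    """Return True if the traversal finishes with the flag set.

    A set flag means the previous active end should be treated as a leaf,
    so the new start element may be inserted without reporting an overlap.
    """
    flag = False
    for node in visited:
        # Single consolidated clause: inactive nodes never affect the decision.
        if skip_inactive and not node.active:
            continue
        # Assumption of this model: only elements to the left matter.
        if node.key >= new_start:
            continue
        # Set the flag on an end, reset it on a start.
        flag = node.is_end
    return flag


# Example from the commit message: intervals 1-2, 3-5, 6-7, with 3-5
# deleted in the same transaction that inserts start element 4.
traversal = [
    Node(2, is_end=True, active=True),
    Node(5, is_end=True, active=False),
    Node(3, is_end=False, active=False),
]
```

In this model, preceding_end_is_leaf(traversal, 4) leaves the flag set,
so the insertion is allowed. Passing skip_inactive=False lets the deleted
start element 3 reset the flag, which corresponds to the spurious overlap
report the commit message describes.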