...
>
> > So the only time I've even seen __vma_adjust() fail is with a fault
> > injector failing mas_preallocate() allocations. If it's safe to not
> > unwind, I'm happy to drop both unwinds but I was concerned in the path
> > of a vma_merge() calling __vma_adjust() and failing out on allocations
> > then OOM recovering, leaving a VMA with a 1/2 merged vma with anon
> > incorrectly set.. which is an even more unlikely scenario.
>
> It's not half-merged, it is correctly set up (just like if a write fault
> had occurred somewhere in that extent before the merge), so no need to
> unwind.

I'll drop the incorrect unwinding then.

> ...
>
> > Right, the __split_vma() never adjusts anything but one side of the
> > 'vma' VMA by inserting the 'insert' VMA. This will result in two writes
> > to the tree - but one will exactly fit in an existing range which will
> > be placed without an allocation via the mas_wr_slot_store() function in
> > the maple tree. Exact fits are nice - they are fast.
>
> I'll have to come back and think about this again later on: "Exact fits
> are nice" may answer my concern in the end, but I still have the worry
> that the first store destroys the prealloc, when it might be the second
> store which needs the prealloc.
>
> ...
>
> > > > > Do you have the patch
> > > > > "maple_tree-Fix-stale-data-copy-in-mas_wr_node_store.patch"? It sounds
> > > > > like your issue fits this fix exactly. I was seeing the same issue with
> > > > > gcc 9.3.1 20200408 and this bug doesn't happen for me now. The logs
> > > > > you sent also fit the situation. I went through the same exercise
> > > > > (exorcism?) of debugging the various additions and removals of the VMA
> > > > > only to find the issue in the tree itself. The fix also modified the
> > > > > test code to detect the issue - which was actually hit but not detected
> > > > > in the existing test cases from a live capture of VMA activities. It is
> > > > > difficult to spot in the tree dump as well.
> > > > > I am sure I sent this to
> > > > > Andrew as it is included in v11 and did not show up in his diff, but I
> > > > > cannot find it on lore, perhaps I forgot to CC you? I've attached it
> > > > > here for you in case you missed it.
> > > >
> > > > Thanks! No, I never received that patch, nor can I see it on lore
> > > > or marc.info; but I (still) haven't looked at v11, and don't know
> > > > about Andrew's diff. Anyway, sounds exciting, I'm eager to stop
> > > > writing this mail and get to testing with that in - but please
> > > > let me know whether it's the mas_dead_leaves() or the __vma_adjust()
> > > > mods you attached previously, which you want me to leave out.
>
> The overnight test run ended in an unexpected way, but I believe we can
> count it as a success - a big success for your stale data copy fix.
>
> (If only that fix had got through the mail system on Friday,
> my report on Sunday would have been much more optimistic.)
>
> I said before that I expected the test run to hit the swapops.h
> migration entry !PageLocked BUG, but it did not. It ran for
> nearly 7 hours, and then one of its builds terminated with
>
> {standard input}: Assembler messages:
> {standard input}: Error: open CFI at the end of file;
>  missing .cfi_endproc directive
> gcc: fatal error: Killed signal terminated program cc1
> compilation terminated.
>
> which I've never seen before. Usually I'd put something like that down
> to an error in swap, or a TLB flushing error (but I include Nadav's fix
> in my kernel, and wouldn't get very far without it): neither related to
> the maple tree patchset.
>
> But on this occasion, my guess is that it's actually an example of what
> the swapops.h migration entry !PageLocked BUG is trying to alert us to.
>
> Imagine when such a "stale" migration entry is found, but the page it
> points to (now reused for something else) just happens to be PageLocked
> at that instant.
> Then the BUG won't fire, and we proceed to use the
> page as if it's ours, but it's not. I think that's what happened.
>
> I must get on with the day: more testing, and thinking.

I think this is the same issue seen here:
https://lore.kernel.org/linux-mm/YsQt3IHbJnAhsSWl@xxxxxxxxxxxxxxxxxxxx/

Note that on 20220616, the maple tree was in the next.

I suspect I am doing something wrong in do_brk_munmap(). I am using a
false VMA to munmap a partial vma by setting it up like the part of the
VMA that would have been split, inserted into the tree, then removed and
freed. I must be missing something necessary for this to function
correctly.

Thanks,
Liam
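
P.S. To make the stale migration entry scenario quoted above concrete,
here is a minimal userspace sketch in C. It is only a toy model under
stated assumptions: struct toy_page, struct toy_migration_entry and
toy_use_entry() are hypothetical names invented for illustration, not
kernel types and not the actual swapops.h check. It shows why a
locked-page assertion can only catch a stale entry when the reused page
happens to be unlocked at that instant.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a page frame; NOT the kernel's struct page. */
struct toy_page {
	bool locked;	/* models the PageLocked state */
	int owner;	/* id of the mapping that currently owns the frame */
};

/* Toy stand-in for a migration entry left behind in a page table. */
struct toy_migration_entry {
	struct toy_page *page;
	int owner;	/* id of the mapping that installed the entry */
};

enum toy_result {
	TOY_OK,			/* entry is valid and the page is locked */
	TOY_BUG_FIRED,		/* the !locked check caught the problem */
	TOY_SILENT_MISUSE,	/* stale entry slipped past the check */
};

/*
 * Model of "assert the page is locked, then use it".  The lock check is
 * the only guard; ownership is compared afterwards purely so the model
 * can report whether a stale entry got through undetected.
 */
static enum toy_result toy_use_entry(const struct toy_migration_entry *e)
{
	assert(e != NULL && e->page != NULL);

	if (!e->page->locked)
		return TOY_BUG_FIRED;
	if (e->page->owner != e->owner)
		return TOY_SILENT_MISUSE;
	return TOY_OK;
}
```

With a stale entry (owner mismatch), the model returns TOY_BUG_FIRED
while the page is unlocked and TOY_SILENT_MISUSE once some other user
has locked it, which is the window described above.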