Nadav Amit <namit@xxxxxxxxxx> writes: >> On Aug 23, 2021, at 1:05 AM, Huang, Ying <ying.huang@xxxxxxxxx> wrote: >> >> Hi, Nadav, >> >> Nadav Amit <nadav.amit@xxxxxxxxx> writes: >> >>> From: Nadav Amit <namit@xxxxxxxxxx> >>> >>> flush_tlb_batched_pending() appears to have a theoretical race: >>> tlb_flush_batched is being cleared after the TLB flush, and if in >>> between another core calls set_tlb_ubc_flush_pending() and sets the >>> pending TLB flush indication, this indication might be lost. Holding the >>> page-table lock when SPLIT_LOCK is set cannot eliminate this race. >> >> Recently, when I read the corresponding code, I find the exact same race >> too. Do you still think the race is possible at least in theory? If >> so, why hasn't your fix been merged? > > I think the race is possible. It didn’t get merged, IIRC, due to some > addressable criticism and lack of enthusiasm from other people, and > my laziness/busy-ness. Got it! Thanks your information! >>> The current batched TLB invalidation scheme therefore does not seem >>> viable or easily repairable. >> >> I have some idea to fix this without too much code. If necessary, I >> will send it out. > > Arguably, it would be preferable to have a small back-portable fix for > this issue specifically. Just try to ensure that you do not introduce > performance overheads. Any solution should be clear about its impact > on additional TLB flushes on the worst-case scenario and the number > of additional atomic operations that would be required. Sure. Will do that. Best Regards, Huang, Ying