On Thu, Sep 2, 2021, at 3:53 PM, Andrew Morton wrote:
> On Thu, 2 Sep 2021 15:50:03 -0700 Linus Torvalds
> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> > On Thu, Sep 2, 2021 at 3:29 PM Andy Lutomirski <luto@xxxxxxxxxx> wrote:
> > >
> > > This pile is:
> > >
> > > Nacked-by: Andy Lutomirski <luto@xxxxxxxxxx>
> >
> > Can you specify exactly the range you want me to drop?
> >
> > I assume it's the four patches 117-120, ie
> >
> >   lazy tlb: introduce lazy mm refcount helper functions
> >   lazy tlb: allow lazy tlb mm refcounting to be configurable
> >   lazy tlb: shoot lazies, a non-refcounting lazy tlb option
> >   powerpc/64s: enable MMU_LAZY_TLB_SHOOTDOWN
> >
> > but I just want to double-check before I do surgery on that series.
>
> Yes, those 4.
>
> Sorry, I missed that email thread...
>

Indeed.

If anyone cares, my WIP series is here:

https://git.kernel.org/pub/scm/linux/kernel/git/luto/linux.git/log/?h=sched/lazymm

It has known bugs and is definitely not ready. The major known problem
is that the kthread and execve paths are still not cleaned up. (We have
several different code paths that change current->mm, and they're not
all quite consistent with each other.) kthread_use_mm() follows its own
little set of refcounting rules that is not consistent with the
scheduler's expectations, but I think it's just close enough that
current kernels will not erroneously free an in-use mm or permanently
leak a reference. I have half-written code to consolidate all the ->mm
assignments into a single function, but it's not done.

The CPU offline issue fixed in that series seems to me like it should
also affect Nick's series, but I haven't dug in. I don't immediately
see why Nick's series would be able to get away without the same
rearrangement I needed. (You can't *shoot* a lazy TLB entry out from
under an offlined CPU -- you need to actually get rid of the reference
and account for it correctly.
Perhaps it's all okay in Nick's series because mmdrop() becomes a no-op
and the stale logical reference doesn't actually exist.)

--Andy