Re: [patch 119/212] lazy tlb: shoot lazies, a non-refcounting lazy tlb option

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Excerpts from Andy Lutomirski's message of September 3, 2021 3:11 pm:
> On 9/2/21 5:46 PM, Nicholas Piggin wrote:
>> Excerpts from Andrew Morton's message of September 3, 2021 8:53 am:
>>> On Thu, 2 Sep 2021 15:50:03 -0700 Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>>>
>>>> On Thu, Sep 2, 2021 at 3:29 PM Andy Lutomirski <luto@xxxxxxxxxx> wrote:
>>>>>
>>>>> This pile is:
>>>>>
>>>>> Nacked-by: Andy Lutomirski <luto@xxxxxxxxxx>
>>>>
>>>> Can you specify exactly the range you want me to drop?
>>>>
>>>> I assume it's the four patches 117-120, ie
>>>>
>>>>   lazy tlb: introduce lazy mm refcount helper functions
>>>>   lazy tlb: allow lazy tlb mm refcounting to be configurable
>>>>   lazy tlb: shoot lazies, a non-refcounting lazy tlb option
>>>>   powerpc/64s: enable MMU_LAZY_TLB_SHOOTDOWN
>>>>
>>>> but I just want to double-check before I do surgery on that series.
>>>
>>> Yes, those 4.
>>>
>>> Sorry, I missed that email thread...
>>>
>> 
>> That's not reasonable. Andy has had complete misunderstandings about the
>> series which seems to stem from x86's horrible hacks that have gone in
>> has confused him.
> 
> The horrible hacks in question are almost exclusively in core code.

No, they're in x86.

> Here's a brief summary of the situation.
> 
> There's a messy interaction between mmget()/mmdrop() and membarrier.
> membarrier currently depends on some mmget() and mmdrop() calls to be
> full barriers.

Membarrier has had (and is improving but still has) some complexity, 
which is caused by interaction with _existing_ lazy-mm code in the tree. 
The complexity is not with lazy-mm itself, and my series does not add
more to the membarrier interaction. So I don't accept the criticism
that it has to do with membarrier complexity.

> You make membarrier keep working by putting an ifdef'd
> smp_mb() in the core scheduler.

Sure, it's well commented and replaces the smp_mb provided by atomic 
operation that membarrier relied on to an explicit one. That's not a
horrible hack.

> I clean up the code to make it work
> independently of smp_mb() and therefore save the cost of the
> unconditional barrier for non-membarrier-using programs.

Great. Nothing to do with this series though which is not changing 
membarrier ordering.

I can certainly help you rebase it on top of these patches if you need.

> 
> Your series adds an option MMU_LAZY_TLB_REFCOUNT=n for architectures to
> opt out of lazy TLB refcounting.  This is simply wrong.  Right now, the
> core scheduler provides current->active_mm and guarantees that
> current->active_mm always points to a live (possibly mm_users == 0 but
> definitely not freed) mm_struct.  With MMU_LAZY_TLB_REFCOUNT=n,
> current->active_mm still exists, is still updated, but may point to
> freed memory.

Wrong. It does nothing of the sort. I told you this in the previous 
discussion, you obviously ignored me. You are just wrong, and you can't
actually point to where this happens.

This criticism is invalid too.

> I consider this unacceptable.  A comment says "This can
> be disabled if the architecture ensures no CPUs are using an mm as a
> "lazy tlb" beyond its final refcount" -- that's nice, but saying "well,
> if you this, you have to make sure you don't accidentally dereference
> that juicy dangling pointer we give you" is, in my book, a poor
> justification.

It's not a justification, it's a recipe for other archs which might want 
ot implement it!

> 
> I have no particular objection to the actual shoot lazies part, except
> insofar as I think we can do even better (e.g. my patch).  But 90% of
> the complexity of my WIP series is cleaning up the mess so that we can
> have a maintainable lazy mm mechanism instead of expanding the current
> hard-to-maintain part into three separate possible modes.

What actually happened is that when you ran out of (incorrect) technical 
disputes like this confusion about the active_mm thing, you then started 
to demand that I massively rework core code that you don't understand so 
that it matches the horrible mess that x86 has got itself into. I can
bring up quotes from previous threads.

> 
> Maybe I'm holding my own patches to an excessively high standard.
> 
> 
> 
>> 
>> My series doesn't affect x86 at all and it's no reason why Andy's series
>> to improve x86 can't be merged later. But that half finished series he 
>> keeps threatening with has been sitting there for almost a year now and 
>> it's gone nowhere, while there have been no unresolved technical 
>> objections to mine, it works, it's simple and small.
> 
> My series barely touches x86.

I'm talking about your previous insistence that my patch series removed 
"active_mm" from core code, because it doesn't match x86 internals, and 
similar such stupidity.

> The only "hack" is that x86 may have a
> CPU that has ->mm == NULL, ->active_mm != NULL, CR3 pointing to the init
> pgd, and mm_cpumask clear.  I don't see why this is a problem other than
> being somewhat unusual.  But x86 bare metal, like every architecture
> that can only flush the TLB using an IPI, can very efficiently shoot
> lazies, since it shoots the lazies anyway when tearing down pagetables,
> but actually enabling the config option with this series applied will
> result in ->active_mm pointing to freed memory.  Ick.
> 
>> 
>> I've kept trying to offer to help Andy with reviewing his stuff or fix 
>> the horrible x86 hacks, but nothing.
> 
> I haven't finished it yet.  Sorry.
> 

No need to be sorry about that, it will be trivial to rebase on top of 
my series, I've even done a quick attempt. No problem at all.

Thanks,
Nick




[Index of Archives]     [Kernel Archive]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]

  Powered by Linux