Excerpts from Andy Lutomirski's message of December 29, 2020 10:36 am: > On Mon, Dec 28, 2020 at 4:11 PM Nicholas Piggin <npiggin@xxxxxxxxx> wrote: >> >> Excerpts from Andy Lutomirski's message of December 28, 2020 4:28 am: >> > The old sync_core_before_usermode() comments said that a non-icache-syncing >> > return-to-usermode instruction is x86-specific and that all other >> > architectures automatically notice cross-modified code on return to >> > userspace. Based on my general understanding of how CPUs work and based on >> > my atttempt to read the ARM manual, this is not true at all. In fact, x86 >> > seems to be a bit of an anomaly in the other direction: x86's IRET is >> > unusually heavyweight for a return-to-usermode instruction. >> >> "sync_core_before_usermode" as I've said says nothing to arch, or to the >> scheduler, or to membarrier. > > Agreed. My patch tries to fix this. I agree that the name is bad and > could be improved further. We should define what > membarrier(...SYNC_CORE) actually does and have arch hooks to make it > happen. > >> > So let's drop any pretense that we can have a generic way implementation >> > behind membarrier's SYNC_CORE flush and require all architectures that opt >> > in to supply their own. This means x86, arm64, and powerpc for now. Let's >> > also rename the function from sync_core_before_usermode() to >> > membarrier_sync_core_before_usermode() because the precise flushing details >> > may very well be specific to membarrier, and even the concept of >> > "sync_core" in the kernel is mostly an x86-ism. >> >> The concept of "sync_core" (x86: serializing instruction, powerpc: context >> synchronizing instruction, etc) is not an x86-ism at all. x86 just wanted >> to add a serializing instruction to generic code so it grew this nasty API, >> but the concept applies broadly. > > I mean that the mapping from the name "sync_core" to its semantics is > x86 only. The string "sync_core" appears in the kernel only in > arch/x86, membarrier code, membarrier docs, and a single SGI driver > that is x86-only. Sure, the idea of serializing things is fairly > generic, but exactly what operations serialize what, when things need > serialization, etc is quite architecture specific. Okay, well yes it's x86 only in name, I was more talking about the concept. > Heck, on 486 you serialize the instruction stream with JMP. x86-specific aside, I did think the semantics of a "serializing instruction" was reasonably well architected in x86. Sure it could do other things as well, but if you executed a serializing instruction, then you had a decent set of guarantees (e.g., what you might want for code modification). > >> > +static inline void membarrier_sync_core_before_usermode(void) >> > +{ >> > + /* >> > + * XXX: I know basically nothing about powerpc cache management. >> > + * Is this correct? >> > + */ >> > + isync(); >> >> This is not about memory ordering or cache management, it's about >> pipeline management. Powerpc's return to user mode serializes the >> CPU (aka the hardware thread, _not_ the core; another wrongness of >> the name, but AFAIKS the HW thread is what is required for >> membarrier). So this is wrong, powerpc needs nothing here. > > Fair enough. I'm happy to defer to you on the powerpc details. In > any case, this just illustrates that we need feedback from a person > who knows more about ARM64 than I do. > Thanks, Nick