On Mon, 2019-05-20 at 15:48 -0700, David Miller wrote:
> From: "Edgecombe, Rick P" <rick.p.edgecombe@xxxxxxxxx>
> Date: Mon, 20 May 2019 22:17:49 +0000
>
> > Thanks for testing. So I guess that suggests it's the TLB flush causing
> > the problem on sparc and not any lazy purge deadlock. I had sent Meelis
> > another test patch that just flushed the entire 0 to ULONG_MAX range to
> > try to always get the "flush all" logic and apparently it didn't
> > boot mostly either. It also showed that it's not getting stuck anywhere
> > in the vm_remove_alias() function. Something just hangs later.
>
> I wonder if an address is making it to the TLB flush routines which is
> not page aligned.

I think vmalloc should force PAGE_SIZE alignment, but will double check
nothing got screwed up.

> Or a TLB flush is being done before the callsites
> are patched properly for the given cpu type.

Any idea how I could log when this is done? It looks like it's done
really early in boot assembly. This behavior shouldn't happen until
modules or BPF are being freed.
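
To test the alignment theory, a check along these lines could sit in front of the flush call and log any range that looks wrong. This is only a userspace sketch of the idea, not kernel code: the sketch_* names are made up for illustration, and the 8 KiB page size is an assumption for sparc64 that should be swapped for the real PAGE_SIZE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Assumption: 8 KiB base pages (as on sparc64). Adjust for other setups. */
#define SKETCH_PAGE_SHIFT 13
#define SKETCH_PAGE_SIZE  ((uintptr_t)1 << SKETCH_PAGE_SHIFT)

/* Hypothetical helper: true if addr sits on a page boundary. */
static bool sketch_page_aligned(uintptr_t addr)
{
	return (addr & (SKETCH_PAGE_SIZE - 1)) == 0;
}

/*
 * Hypothetical helper: validate a [start, end) range before it would be
 * handed to a TLB flush. Returns true if both ends are page aligned and
 * the range is non-empty; otherwise logs the range and returns false.
 */
static bool sketch_check_flush_range(uintptr_t start, uintptr_t end)
{
	bool ok = sketch_page_aligned(start) &&
		  sketch_page_aligned(end) &&
		  start < end;

	if (!ok)
		fprintf(stderr, "suspicious flush range: %#lx-%#lx\n",
			(unsigned long)start, (unsigned long)end);
	return ok;
}
```

In the kernel the equivalent would presumably be a one-off warning next to the flush, so a bad range shows up in the log with a backtrace instead of a silent hang.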