On Fri, Jan 17, 2025 at 04:25:45PM +0100, Valentin Schneider wrote:
> On 14/01/25 19:16, Jann Horn wrote:
> > On Tue, Jan 14, 2025 at 6:51 PM Valentin Schneider <vschneid@xxxxxxxxxx> wrote:
> >> vunmap()'s issued from housekeeping CPUs are a relatively common source of
> >> interference for isolated NOHZ_FULL CPUs, as they are hit by the
> >> flush_tlb_kernel_range() IPIs.
> >>
> >> Given that CPUs executing in userspace do not access data in the vmalloc
> >> range, these IPIs could be deferred until their next kernel entry.
> >>
> >> Deferral vs early entry danger zone
> >> ===================================
> >>
> >> This requires a guarantee that nothing in the vmalloc range can be vunmap'd
> >> and then accessed in early entry code.
> >
> > In other words, it needs a guarantee that no vmalloc allocations that
> > have been created in the vmalloc region while the CPU was idle can
> > then be accessed during early entry, right?
>
> I'm not sure if that would be a problem (not an mm expert, please do
> correct me) - looking at vmap_pages_range(), flush_cache_vmap() isn't
> deferred anyway.
>
> So after vmapping something, I wouldn't expect isolated CPUs to have
> invalid TLB entries for the newly vmapped page.
>
> However, upon vunmap'ing something, the TLB flush is deferred, and thus
> stale TLB entries can and will remain on isolated CPUs, up until they
> execute the deferred flush themselves (IOW for the entire duration of the
> "danger zone").
>
> Does that make sense?
>
Probably I am missing something and need to have a look at your patches,
but how do you guarantee that no one maps the same area whose TLB flush
you have deferred?

As noted by Jann, we already defer TLB flushing by holding back freed
areas until a certain threshold is reached, and we only do a flush once
we cross it.

--
Uladzislau Rezki