On Thu, Oct 28, 2021 at 03:56:47PM +0200, Sebastian Andrzej Siewior wrote:
> On 2021-10-28 13:52:24 [+0100], Mel Gorman wrote:
> > > Yes that was my question. So if you have "always", do mlock_all() in the
> > > application and then have other threads that same application doing
> > > malloc/ free of memory that the RT thread is not touching then bad
> > > things can still happen, right?
> > > My understanding is that all threads can be blocked in a page fault if
> > > there is some THP operation going on.
> > >
> >
> > Hmm, it could happen if all the memory used by the RT thread was not
> > hugepage-aligned and potentially khugepaged could interfere. khugepaged
> > can be disabled if tuned properly but the alignment requirement would be
> > tricky. Probably safer to just disable it like it has been historically.
> > For consistently, force NUMA_BALANCING to be disabled too because it
> > introduces non-deterministic latencies even if memory regions are locked
> > and bound.
>
> Okay. I don't mind disabling it or keeping it enabled under some
> restrictions. I just need it to document it so people are aware why it
> is disabled so if they want to enable they know what the areas that need
> attention.
>
> THP disable due to alignment issues and potential defragmentation by
> khugepaged. Understood. Workaround: Use hugepages.
>
> NUMA_BALANCING. It looks like it replaces the physical page while
> keeping the virtual address. This kind of page migration does not look
> good if it happens for everyone since it involves mmap_lock.
> Let me write that up and post properly.
>

In case it helps:

TRANSPARENT_HUGEPAGE: There are potential non-deterministic delays to an
RT thread if a critical memory region is not THP-aligned and a non-RT
buffer is located in the same hugepage-aligned region. It's also possible
for an unrelated thread to migrate pages belonging to an RT task,
incurring unexpected page faults due to memory defragmentation, even if
khugepaged is disabled.
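
To make the alignment requirement concrete, here is a rough userspace
sketch of a buffer that occupies whole hugepage-aligned regions, so that
no non-RT allocation can share a region with it. The helper names
(hpage_round_up, rt_buffer_alloc, rt_buffer_lock) are made up for
illustration, and the 2 MiB HPAGE_SIZE is an assumption that holds for
x86-64 defaults only. It does nothing about migration caused by
defragmentation, so it is not a substitute for disabling THP.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>

/* Assumption: 2 MiB PMD-sized huge pages (x86-64 default). On a real
 * system, read /sys/kernel/mm/transparent_hugepage/hpage_pmd_size. */
#define HPAGE_SIZE (2UL * 1024 * 1024)

/* Round len up to the next multiple of HPAGE_SIZE. */
static size_t hpage_round_up(size_t len)
{
    return (len + HPAGE_SIZE - 1) & ~(HPAGE_SIZE - 1);
}

/* Hypothetical helper: allocate a buffer that starts on a hugepage
 * boundary and spans whole hugepage-sized regions. The rounded size
 * is returned through *alloc_len. Release with free(). */
static void *rt_buffer_alloc(size_t len, size_t *alloc_len)
{
    void *buf = NULL;
    size_t rounded;

    assert(len > 0);
    rounded = hpage_round_up(len);

    if (posix_memalign(&buf, HPAGE_SIZE, rounded) != 0)
        return NULL;

    /* Best effort: ask the kernel not to back this range with THP.
     * This can fail (e.g. kernel built without THP), so the result
     * is deliberately ignored in this sketch. */
    (void)madvise(buf, rounded, MADV_NOHUGEPAGE);

    *alloc_len = rounded;
    return buf;
}

/* Hypothetical helper: pin the buffer in RAM. Returns 0 on success,
 * -1 on failure (commonly because RLIMIT_MEMLOCK is too low). */
static int rt_buffer_lock(void *buf, size_t alloc_len)
{
    return mlock(buf, alloc_len);
}
```

An RT thread would call rt_buffer_alloc() and rt_buffer_lock() once at
startup, before entering its time-critical loop, and call munlock()
before free() at teardown.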

NUMA_BALANCING: There is a non-deterministic delay to mark PTEs PROT_NONE
to gather NUMA fault samples, additional page faults on regions even if
they are mlocked, and non-deterministic delays when migrating pages.

-- 
Mel Gorman
SUSE Labs