On Fri 19-01-24 10:03:05, Lance Yang wrote:
> Hey Michal,
>
> Thanks for taking the time to review!
>
> On Thu, Jan 18, 2024 at 9:40 PM Michal Hocko <mhocko@xxxxxxxx> wrote:
> >
> > On Thu 18-01-24 20:03:46, Lance Yang wrote:
> > [...]
> >
> > before we discuss the semantic, let's focus on the usecase.
> >
> > > Use Cases
> > >
> > > An immediate user of this new functionality is the Go runtime heap allocator
> > > that manages memory in hugepage-sized chunks. In the past, whether it was a
> > > newly allocated chunk through mmap() or a reused chunk released by
> > > madvise(MADV_DONTNEED), the allocator attempted to eagerly back memory with
> > > huge pages using madvise(MADV_HUGEPAGE)[2] and madvise(MADV_COLLAPSE)[3]
> > > respectively. However, both approaches resulted in performance issues; for
> > > both scenarios, there could be entries into direct reclaim and/or compaction,
> > > leading to unpredictable stalls[4]. Now, the allocator can confidently use
> > > process_madvise(MADV_F_COLLAPSE_LIGHT) to attempt the allocation of huge pages.
> >
> > IIUC the primary reason is the cost of the huge page allocation which
> > can be really high if the memory is heavily fragmented and it is called
> > synchronously from the process directly, correct? Can that be worked
>
> Yes, that's correct.
>
> > around by process_madvise and performing the operation from a different
> > context? Are there any other reasons to have a different mode?
>
> In latency-sensitive scenarios, some applications aim to enhance performance
> by utilizing huge pages as much as possible. At the same time, in case of
> allocation failure, they prefer a quick return without triggering direct memory
> reclamation and compaction.

Could you elaborate some more on why?

> > I mean I can think of a more relaxed (opportunistic) MADV_COLLAPSE -
> > e.g. non blocking one to make sure that the caller doesn't really block
> > on resource contention (be it locks or memory availability) because that
> > matches our non-blocking interface in other areas but having a LIGHT
> > operation sounds really vague and the exact semantic would be
> > implementation specific and might change over time. Non-blocking has a
> > clear semantic but it is not really clear whether that is what you
> > really need/want.
>
> Could you provide me with some suggestions regarding the naming of a
> more relaxed (opportunistic) MADV_COLLAPSE?

Naming is not all that important at this stage (it could be
MADV_COLLAPSE_NOBLOCK for example). The primary question is whether
non-blocking in general is the desired behavior or the implementation
should try but not too hard.
-- 
Michal Hocko
SUSE Labs