> I didn't follow this thread. However, since you mentioned MADV_FREE will
> cause many page faults, I'm jumping in here.
> One of the benefits of MADV_FREE in the current implementation is that it
> avoids page faults as well as zeroing.
> Why did you see many page faults?

I think I just misunderstood why it was still so much slower than not
using purging at all.

>> I get ~20k requests/s with jemalloc on the ebizzy benchmark with this
>> dual-core Ivy Bridge laptop. It jumps to ~60k requests/s with MADV_FREE
>> IIRC, but disabling purging via MALLOC_CONF=lg_dirty_mult:-1 leads to
>> 3.5 *million* requests/s. It has a similar impact with TCMalloc.
>
> When I tested MADV_FREE with ebizzy, I saw a similar result: two or three
> times faster than MADV_DONTNEED. But it's not free. It incurs the cost of
> MADV_FREE itself (ie, enumerating all of the page table entries in the
> range, clearing the dirty bits and flushing the TLB). Of course, it takes
> mmap_sem as a read-side lock. If you see a great improvement when you
> disable purging, I guess it's mainly caused by not taking mmap_sem, so
> some threads can allocate while other threads handle page faults. The
> reason I think so is that I saw a similar result when I implemented the
> vrange syscall, which holds the mmap_sem read-side lock for a very short
> time (ie, marking the vma as volatile, O(1)), while MADV_FREE holds the
> lock while enumerating all of the pages in the range, O(N).

It stops doing mmap after getting warmed up since it never unmaps, so I
don't think mmap_sem is a contention issue. It could just be caused by
the cost of the system call itself and the TLB flush. I found perf to be
fairly useless in identifying where the time was being spent.

It might be much more important to purge very large ranges in one go
with MADV_FREE. It's a different direction than the current compromises
forced by MADV_DONTNEED.
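
To illustrate what I mean by purging a large range in one go, here's a
rough sketch (my own example, not code from either allocator; MADV_FREE
needs a kernel with the patch applied, and the "no refault" behavior
assumes hardware-managed dirty bits, as on x86):

/* Dirty a large anonymous mapping, then purge it with a single
 * madvise() call so the page table walk and TLB flush are paid once
 * for the whole range rather than once per chunk. */
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

#define CHUNK (2UL * 1024 * 1024)   /* hypothetical purge granularity */

int main(void)
{
    size_t len = 64 * CHUNK;
    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }

    memset(p, 1, len);              /* dirty every page */

#ifdef MADV_FREE
    /* Lazy: pages stay mapped until reclaim; re-touching before reclaim
     * takes no fault and sees the old contents. */
    if (madvise(p, len, MADV_FREE))
        perror("madvise(MADV_FREE)");
#else
    /* Eager: every subsequent touch faults in a fresh zeroed page. */
    if (madvise(p, len, MADV_DONTNEED))
        perror("madvise(MADV_DONTNEED)");
#endif

    memset(p, 2, len);              /* re-touch the purged range */

    munmap(p, len);
    return 0;
}

Batching 64 chunks into one call like this should cost far less than 64
separate calls, since the syscall and TLB flush overhead is amortized
across the whole range.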
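For reproducing the numbers above, something like this can confirm the
jemalloc knob actually took effect (assuming a jemalloc build from this
era that exposes opt.lg_dirty_mult through the documented mallctl
interface; link with -ljemalloc):

/* Read back jemalloc's opt.lg_dirty_mult to verify the MALLOC_CONF
 * setting was picked up. */
#include <stdio.h>
#include <jemalloc/jemalloc.h>

int main(void)
{
    ssize_t lg_dirty_mult;
    size_t sz = sizeof(lg_dirty_mult);

    if (mallctl("opt.lg_dirty_mult", &lg_dirty_mult, &sz, NULL, 0) != 0) {
        fprintf(stderr, "mallctl failed (option absent in this build?)\n");
        return 1;
    }
    /* -1 means purging is disabled, matching lg_dirty_mult:-1. */
    printf("opt.lg_dirty_mult = %zd\n", lg_dirty_mult);
    return 0;
}

Run as MALLOC_CONF=lg_dirty_mult:-1 ./a.out and it should print -1,
meaning dirty page purging is disabled entirely.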