On Wed, Jul 17, 2013 at 12:50:25PM -0700, Davidlohr Bueso wrote:
> From: Davidlohr Bueso <davidlohr.bueso@xxxxxx>
>
> - Cleaned up and forward ported to Linus' latest.
> - Cache aligned mutexes.
> - Keep non SMP systems using a single mutex.
>
> It was found that this mutex can become quite contended
> during the early phases of large databases which make use of huge pages - for instance
> startup and initial runs. One clear example is a 1.5Gb Oracle database, where lockstat
> reports that this mutex can be one of the top 5 most contended locks in the kernel during
> the first few minutes:
>
> hugetlb_instantiation_mutex:      10678     10678
>              ---------------------------
>              hugetlb_instantiation_mutex    10678  [<ffffffff8115e14e>] hugetlb_fault+0x9e/0x340
>              ---------------------------
>              hugetlb_instantiation_mutex    10678  [<ffffffff8115e14e>] hugetlb_fault+0x9e/0x340
>
> contentions:          10678
> acquisitions:         99476
> waittime-total:  76888911.01 us

Hello,

I have a question :)

So, each contention waits about 7.2 ms on average in your result
(76888911.01 us / 10678 contentions).
Do you map this area with VM_NORESERVE?
If we map with VM_RESERVE, then on a page fault we just dequeue a huge page
from a queue, clear the page, and map it into the page table. So I guess
it shouldn't take that long. I'm wondering why it does.

And do you use 16KB-size hugepages?
If so, region handling could take some time. If you access the area in
random order, the number of regions can be more than 90000. I guess this
could be one reason for the long wait time.

Thanks.
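
For reference, the arithmetic behind the two numbers discussed above can be checked with a small script. This is only a sketch: the lockstat figures are copied from the quoted report, while the reading of "1.5Gb" as 1.5 GiB, the 16 KiB huge page size, and the helper names `mean_wait_ms` and `huge_page_count` are assumptions made for illustration, not something the patch or the report states.

```python
# Figures copied from the quoted lockstat output.
CONTENTIONS = 10678
ACQUISITIONS = 99476
WAITTIME_TOTAL_US = 76888911.01

# Assumptions (not stated in the report): "1.5Gb" is read as 1.5 GiB,
# and the huge page size is the 16 KiB asked about in the reply.
MAPPING_BYTES = 3 * 2**29      # 1.5 GiB
HUGE_PAGE_BYTES = 16 * 2**10   # 16 KiB


def mean_wait_ms(waittime_total_us: float, contentions: int) -> float:
    """Mean wait per contended acquisition, in milliseconds."""
    return waittime_total_us / contentions / 1000.0


def huge_page_count(mapping_bytes: int, huge_page_bytes: int) -> int:
    """Number of huge pages needed to back a mapping, rounded up."""
    return -(-mapping_bytes // huge_page_bytes)


if __name__ == "__main__":
    wait = mean_wait_ms(WAITTIME_TOTAL_US, CONTENTIONS)
    pages = huge_page_count(MAPPING_BYTES, HUGE_PAGE_BYTES)
    print(f"mean wait per contention: {wait:.2f} ms")
    print(f"contended share of acquisitions: {CONTENTIONS / ACQUISITIONS:.1%}")
    print(f"16 KiB huge pages in 1.5 GiB: {pages}")
```

Under those assumptions the page count comes out just above 90000, which is the upper bound on the number of regions if every page were faulted in a non-contiguous order. Whether the workload actually behaves that way is exactly the open question.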