Re: [RFC 2/2] khugepaged: use upgrade_read() to optimize collapse_huge_page

On Thu, 17 Oct 2024 14:20:12 +0100, willy@xxxxxxxxxxxxx wrote:

> On Thu, Oct 17, 2024 at 02:18:41PM +0800, lizhe.67@xxxxxxxxxxxxx wrote:
> > On Wed, 16 Oct 2024 12:53:15 +0100, willy@xxxxxxxxxxxxx wrote:
> > 
> > >On Wed, Oct 16, 2024 at 12:36:00PM +0800, lizhe.67@xxxxxxxxxxxxx wrote:
> > >> From: Li Zhe <lizhe.67@xxxxxxxxxxxxx>
> > >> 
> > >> In function collapse_huge_page(), we drop mmap read lock and get
> > >> mmap write lock to prevent most accesses to pagetables. There is
> > >> a small time window to allow other tasks to acquire the mmap lock.
> > >> With the use of upgrade_read(), we don't need to check vma and pmd
> > >> again in most cases.
> > >
> > >This is clearly a performance optimisation.  So you must have some
> > >numbers that justify this, please include them.
> > 
> > Yes, I will add the relevant data to v2 patch.
> 
> How about telling us all now so we know whether to continue discussing
> this?

In my test environment, collapse_huge_page() got only about 0.87% faster
(see the averages below). I used ftrace to measure the execution time of
collapse_huge_page(). The test result, test code and test command are as
follows.

(1) Test result:

			average execution time of collapse_huge_page()
before this patch:	1611.06283 us
after this patch:	1597.01474 us

(2) Test code:

#include <stdio.h>
#include <unistd.h>
#include <sys/mman.h>

#ifndef MADV_COLLAPSE
#define MADV_COLLAPSE 25	/* value from the Linux uapi headers */
#endif

#define MMAP_SIZE (2ul*1024*1024)
#define ALIGN(x, a)  (((x) + ((a)-1)) & ~((a)-1))

int main(void)
{
	int num = 100;
	size_t page_sz = getpagesize();

	while (num--) {
		size_t index;
		unsigned char *p_map;
		unsigned char *p_map_real;
		int ret;

		/* Map twice the size so a 2M-aligned region fits inside. */
		p_map = mmap(0, 2 * MMAP_SIZE, PROT_READ | PROT_WRITE,
			     MAP_PRIVATE | MAP_ANON, -1, 0);
		if (p_map == MAP_FAILED) {
			printf("mmap fail\n");
			return -1;
		}
		p_map_real = (unsigned char *)ALIGN((unsigned long)p_map, MMAP_SIZE);
		printf("mmap get %p, align to %p\n", (void *)p_map, (void *)p_map_real);

		/* Touch every base page so the range is fully populated. */
		for (index = 0; index < MMAP_SIZE; index += page_sz)
			p_map_real[index] = 6;

		/* Request a synchronous collapse of the range into a huge page. */
		ret = madvise(p_map_real, MMAP_SIZE, MADV_COLLAPSE);
		printf("ret is %d\n", ret);
		munmap(p_map, 2 * MMAP_SIZE);
	}
	return 0;
}

(3) Test command:
echo never > /sys/kernel/mm/transparent_hugepage/enabled
gcc test.c -o test
trace-cmd record -p function_graph -g collapse_huge_page --max-graph-depth 1 ./test

The optimization of collapse_huge_page() seems insignificant in this test.
I am not sure whether the effect would be more obvious in other scenarios.



