From: jun qian <qianjun.kernel@xxxxxxxxx>

In our project, many business delays come from fork, so we started
looking for the reason why fork is time-consuming. I used ftrace with
function_graph to trace fork and found that vm_normal_page is called
tens of thousands of times, while each call takes only a few
nanoseconds. vm_normal_page is not an inline function, so I think that
making it inline may reduce the call overhead.

I did the following experiment. I wrote the C test code below (please
ignore the memory leak :)). Before fork, it mallocs 4G bytes in
4096-byte chunks, then measures the fork time.

#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * LEN was not defined in the posted code; this value is assumed from
 * the description: 4G bytes / 4096 bytes per malloc.
 */
#define LEN (1024ULL * 1024ULL)

int main()
{
	char *p;
	unsigned long long i=0;
	float time_use=0;
	struct timeval start;
	struct timeval end;

	/* touch one byte per allocation so each page is really mapped */
	for(i=0; i<LEN; i++) {
		p = (char *)malloc(4096);
		if (p == NULL) {
			printf("malloc failed!\n");
			return 0;
		}
		p[0] = 0x55;
	}

	gettimeofday(&start,NULL);
	fork();
	gettimeofday(&end,NULL);

	/* both parent and child print their own elapsed time */
	time_use=(end.tv_sec * 1000000 + end.tv_usec) -
		 (start.tv_sec * 1000000 + start.tv_usec);
	printf("time_use is %.10f us\n",time_use);
	return 0;
}

We need to compare the size of vmlinux and the fork time in the inline
and non-inline cases. vm_normal_page is also called from many
functions, so we need to compare the size of its callers as well. For
example, do_wp_page calls vm_normal_page, so I also measured its size.

                   inline           non-inline       diff
vmlinux size       9709248 bytes    9709824 bytes    -576 bytes
fork time          23475ns          24638ns          -4.7%
do_wp_page size    972              743              +229

According to the above test data, I think inlining vm_normal_page can
reduce fork execution time.

Signed-off-by: jun qian <qianjun.kernel@xxxxxxxxx>
---
 mm/memory.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/memory.c b/mm/memory.c
index 7d608765932b..a689bb5d3842 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -591,7 +591,7 @@ static void print_bad_pte(struct vm_area_struct *vma, unsigned long addr,
  * PFNMAP mappings in order to support COWable mappings.
  *
  */
-struct page *vm_normal_page(struct vm_area_struct *vma, unsigned long addr,
+inline struct page *vm_normal_page(struct vm_area_struct *vma, unsigned long addr,
 			    pte_t pte)
 {
 	unsigned long pfn = pte_pfn(pte);
-- 
2.18.2
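
A single gettimeofday() sample around one fork() can be noisy, so a
difference of a few percent may be easier to confirm by averaging several
runs. The following is only a rough sketch of such a harness, not part of
the patch and not the code used for the numbers above. It swaps
gettimeofday() for clock_gettime(CLOCK_MONOTONIC), and the helper names
(elapsed_ns, time_one_fork_ns, populate_pages, mean_fork_ns) are
hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Nanoseconds between two CLOCK_MONOTONIC samples. */
static int64_t elapsed_ns(const struct timespec *start,
			  const struct timespec *end)
{
	return (int64_t)(end->tv_sec - start->tv_sec) * 1000000000LL +
	       (end->tv_nsec - start->tv_nsec);
}

/*
 * Allocate npages chunks of 4096 bytes and touch one byte in each, as
 * the original test does. The memory is leaked on purpose so that it
 * stays mapped when fork() runs. Returns 0 on success, -1 on failure.
 */
static int populate_pages(size_t npages)
{
	size_t i;

	for (i = 0; i < npages; i++) {
		char *p = malloc(4096);

		if (p == NULL)
			return -1;
		p[0] = 0x55;
	}
	return 0;
}

/*
 * Time one fork() as seen by the parent. The child exits immediately
 * and is reaped after the second timestamp, so waitpid() is not part
 * of the measurement. Returns -1 if fork() fails.
 */
static int64_t time_one_fork_ns(void)
{
	struct timespec start, end;
	pid_t pid;

	clock_gettime(CLOCK_MONOTONIC, &start);
	pid = fork();
	if (pid == 0)
		_exit(0);
	clock_gettime(CLOCK_MONOTONIC, &end);

	if (pid < 0)
		return -1;
	waitpid(pid, NULL, 0);
	return elapsed_ns(&start, &end);
}

/* Mean parent-side fork() time over `runs` forks, or -1 on failure. */
static int64_t mean_fork_ns(int runs)
{
	int64_t total = 0;
	int i;

	assert(runs > 0);
	for (i = 0; i < runs; i++) {
		int64_t t = time_one_fork_ns();

		if (t < 0)
			return -1;
		total += t;
	}
	return total / runs;
}
```

Calling populate_pages(1024 * 1024) and then mean_fork_ns(10) from main()
would approximate the 4G test above. The run count is arbitrary.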