On Tue, Apr 27, 2010 at 7:37 AM, Mel Gorman <mel@xxxxxxxxx> wrote:
> From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx>
>
> At page migration, we replace the pte with a migration_entry, which has
> a similar format to a swap_entry, and replace it with the real pfn at the
> end of migration. But there is a race with fork()'s copy_page_range().
>
> Assume page migration on CPU A and fork on CPU B. On CPU A, a page of
> a process is under migration. On CPU B, a page's pte is under copy.
>
>     CPU A                           CPU B
>                                     do_fork()
>                                     copy_mm() (from process1 to process2)
>                                     insert new vma to mmap_list (if inode/anon_vma)
>     pte_lock(process1)
>     unmap a page
>     insert migration_entry
>     pte_unlock(process1)
>
>     migrate page copy
>                                     copy_page_range
>     remap new page by rmap_walk()
>     pte_lock(process2)
>     found no pte.
>     pte_unlock(process2)
>                                     pte_lock(process2)
>                                     pte_lock(process1)
>                                     copy migration entry to process2
>                                     pte_unlock(process1)
>                                     pte_unlock(process2)
>     pte_lock(process1)
>     replace migration entry
>     to new page's pte.
>     pte_unlock(process1)
>
> Then, some serialization is necessary. IIUC, this is a very rare event, but
> it is reproducible if a lot of migration is happening with the
> following program running in parallel.
>
> #include <stdio.h>
> #include <string.h>
> #include <stdlib.h>
> #include <unistd.h>     /* fork() */
> #include <sys/types.h>  /* pid_t */
> #include <sys/wait.h>   /* waitpid() */
> #include <sys/mman.h>
>
> #define SIZE (24*1048576UL)
> #define CHILDREN 100
> int main(void)
> {
>         int i = 0;
>         pid_t pids[CHILDREN];
>         /* fd is -1 for portability with MAP_ANONYMOUS */
>         char *buf = mmap(NULL, SIZE, PROT_READ|PROT_WRITE,
>                         MAP_PRIVATE|MAP_ANONYMOUS,
>                         -1, 0);
>         if (buf == MAP_FAILED) {
>                 perror("mmap");
>                 exit(EXIT_FAILURE);
>         }
>
>         while (++i) {
>                 int j = i % CHILDREN;
>
>                 if (j == 0) {
>                         printf("Waiting on children\n");
>                         /* keep dirtying the buffer while reaping children */
>                         for (j = 0; j < CHILDREN; j++) {
>                                 memset(buf, i, SIZE);
>                                 if (pids[j] != -1)
>                                         waitpid(pids[j], NULL, 0);
>                         }
>                         j = 0;
>                 }
>
>                 /* child touches every page of the inherited mapping, then exits */
>                 if ((pids[j] = fork()) == 0) {
>                         memset(buf, i, SIZE);
>                         exit(EXIT_SUCCESS);
>                 }
>         }
>
>         munmap(buf, SIZE);
>         return 0;
> }
>
> copy_page_range() can wait for the end of migration.
>
> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx>
> Signed-off-by: Mel Gorman <mel@xxxxxxxxx>

Reviewed-by: Minchan Kim <minchan.kim@xxxxxxxxx>

--
Kind regards,
Minchan Kim