On Wed, 23 May 2012 13:28:21 +0000 Nathan Zimmer <nzimmer@xxxxxxx> wrote:

>
> When tmpfs has the memory policy interleaved it always starts allocating at each file at node 0.
> When there are many small files the lower nodes fill up disproportionately.
> My proposed solution is to start a file at a randomly chosen node.
>
> ...
>
> --- a/include/linux/shmem_fs.h
> +++ b/include/linux/shmem_fs.h
> @@ -17,6 +17,7 @@ struct shmem_inode_info {
>  		char		*symlink;	/* unswappable short symlink */
>  	};
>  	struct shared_policy	policy;		/* NUMA memory alloc policy */
> +	int			node_offset;	/* bias for interleaved nodes */
>  	struct list_head	swaplist;	/* chain of maybes on swap */
>  	struct list_head	xattr_list;	/* list of shmem_xattr */
>  	struct inode		vfs_inode;
> diff --git a/mm/shmem.c b/mm/shmem.c
> index f99ff3e..58ef512 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -819,7 +819,7 @@ static struct page *shmem_alloc_page(gfp_t gfp,
>
>  	/* Create a pseudo vma that just contains the policy */
>  	pvma.vm_start = 0;
> -	pvma.vm_pgoff = index;
> +	pvma.vm_pgoff = index + info->node_offset;
>  	pvma.vm_ops = NULL;
>  	pvma.vm_policy = mpol_shared_policy_lookup(&info->policy, index);
>
> @@ -1153,6 +1153,7 @@ static struct inode *shmem_get_inode(struct super_block *sb, const struct inode
>  		inode->i_fop = &shmem_file_operations;
>  		mpol_shared_policy_init(&info->policy,
>  					 shmem_get_sbmpol(sbinfo));
> +		info->node_offset = node_random(&node_online_map);
>  		break;
>  	case S_IFDIR:
>  		inc_nlink(inode);

The patch seems a bit arbitrary and hacky.  It would have helped if you
had fully described how it works, and why this implementation was chosen.

- Why alter (actually, lie about!) the offset-into-file?  Could we
  have similarly perturbed the address arg to alloc_page_vma() to do
  the spreading?

- The patch is dependent upon MPOL_INTERLEAVE being in effect, isn't
  it?  How do we guarantee that it is in force here?

- We look up the policy via mpol_shared_policy_lookup() using the
  unperturbed index.  Why?
  Should we be using index+info->node_offset there?
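
For reference, the effect being discussed can be modelled in userspace. This is only a rough sketch, not kernel code: it assumes interleaving picks a node as (page offset % number of nodes), which simplifies what the mempolicy code really does, and the names interleave_node() and place_files() are made up for illustration. The per-file bias is taken as (file number % nodes), a deterministic stand-in for the node_random() call in the patch.

```c
#include <stddef.h>

/*
 * Hypothetical, simplified model of MPOL_INTERLEAVE node selection:
 * the node is derived purely from the page offset.  The real kernel
 * also consults the policy's nodemask; that is ignored here.
 */
static unsigned int interleave_node(unsigned long pgoff, unsigned int nr_nodes)
{
	return (unsigned int)(pgoff % nr_nodes);
}

/*
 * Place nr_files files of pages_per_file pages each and count how many
 * pages land on each node.  With use_bias == 0 every file starts at
 * node 0 (the behaviour described in the quoted changelog).  With
 * use_bias != 0 each file gets a per-file bias added to the offset,
 * modelling info->node_offset; the bias is f % nr_nodes, a
 * deterministic stand-in for a randomly chosen node.
 */
static void place_files(unsigned int nr_files, unsigned long pages_per_file,
			unsigned int nr_nodes, int use_bias,
			unsigned long *pages_on_node)
{
	unsigned int f, n;
	unsigned long index;

	for (n = 0; n < nr_nodes; n++)
		pages_on_node[n] = 0;

	for (f = 0; f < nr_files; f++) {
		unsigned long bias = use_bias ? f % nr_nodes : 0;

		for (index = 0; index < pages_per_file; index++)
			pages_on_node[interleave_node(index + bias, nr_nodes)]++;
	}
}
```

In this model, eight single-page files on four nodes all land on node 0 without the bias, and end up two per node with it. It says nothing about whether the bias belongs in vm_pgoff or in the address passed to alloc_page_vma(), nor about what happens when the policy is not MPOL_INTERLEAVE.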