> > By the way - I found another bug/issue in /proc/<pid>/smaps as well. The
> > kernel currently does not add swapped-out shm pages to the swap size value
> > correctly. The swap size value always stays zero for shm pages. I'm
> > currently preparing a small patch to fix that, which I will send to
> > linux-mm for review soon.
>
> I certainly wouldn't call smaps's present behaviour on it a bug: but given
> your justification above, I can see that it would be more useful to you,
> and probably to others, for it to be changed in the way that you suggest,
> to reveal the underlying swap.
>
> Hmm, I wonder what that patch is going to look like... :-)

I tried quite hard to implement rss/swap accounting for shm segments inside
smaps_pte_range(), which is a callback function of walk_page_range() in
show_smap(). Given that I'm no linux-mm expert I may have overlooked other
possibilities, but my experiments inside smaps_pte_range() were not very
successful:

From my tests, a swapped-out shm segment
- fails the "is_swap_pte()" test, and
- passes the "!pte_present()" test (since it is swapped out).

So it would be possible to add such swap accounting at this point, but how
can I then tell a) that this pte belongs to a shm segment, and b) that this
page/pte was really swapped out and not simply never written to at all?

As answers I found:
a) (vma->vm_flags & VM_MAYSHARE) is true for shm segments (is this check
   sufficient?)
b) no idea. If I simply add such pages to the mss.swap entry, all pages,
   including those which have never been touched at all, are suddenly
   counted as swapped out...?

Any hints here would be great...
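To make the problem concrete, here is roughly the kind of check I was
experimenting with inside smaps_pte_range() (a sketch only, not code I'm
proposing; smaps_account_shm() is a made-up helper name):

static void smaps_account_shm(struct mem_size_stats *mss,
			      struct vm_area_struct *vma, pte_t ptent)
{
	if (is_swap_pte(ptent)) {
		/* never reached for shm: after shmem swap-out the swap
		 * entry is kept in the shmem inode, the pte is cleared */
		mss->swap += PAGE_SIZE;
		return;
	}

	if (!pte_present(ptent) && (vma->vm_flags & VM_MAYSHARE)) {
		/* reached for swapped-out shm pages -- but equally for
		 * pages that were never faulted in at all, so this
		 * over-counts swap (question b) above) */
		mss->swap += PAGE_SIZE;
	}
}

The second branch is exactly where I'm stuck: at pte level a cleared pte
looks the same whether the page went to swap or was never touched.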
As an alternative solution, I created the following patch. It works nicely,
but it is just a fix-up of the mss.resident and mss.swap values after
walk_page_range() has been called. It is mostly a copy of the
shm_add_rss_swap() function from my previous patch
(http://marc.info/?l=linux-mm&m=128171161101817&w=2).

Do you think such a fix-up-afterwards approach is acceptable at all?
If yes, a new patch on top of my ipc/shm.c patch would be easy (and small).

Comments?
Helge

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index aea1d3f..a1ef7c9 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -9,6 +9,7 @@
 #include <linux/mempolicy.h>
 #include <linux/swap.h>
 #include <linux/swapops.h>
+#include <linux/shmem_fs.h>
 #include <asm/elf.h>
 #include <asm/uaccess.h>
 
@@ -392,6 +393,31 @@ static int show_smap(struct seq_file *m, void *v)
 	if (vma->vm_mm && !is_vm_hugetlb_page(vma))
 		walk_page_range(vma->vm_start, vma->vm_end, &smaps_walk);
 
+#ifdef CONFIG_SYSVIPC
+	/* sysvipc shm segments and hugepages are counted wrong in
+	 * walk_page_range(). Fix it up.
+	 */
+	if (vma->vm_file && (vma->vm_flags & VM_MAYSHARE)) {
+		struct inode *inode = vma->vm_file->f_path.dentry->d_inode;
+		if (is_file_hugepages(vma->vm_file)) {
+			struct address_space *mapping = inode->i_mapping;
+			struct hstate *h = hstate_file(vma->vm_file);
+			mss.resident = pages_per_huge_page(h) *
+					mapping->nrpages * PAGE_SIZE;
+		} else {
+#ifdef CONFIG_SHMEM
+			struct shmem_inode_info *info = SHMEM_I(inode);
+			spin_lock(&info->lock);
+			mss.resident = inode->i_mapping->nrpages * PAGE_SIZE;
+			mss.swap = info->swapped * PAGE_SIZE;
+			spin_unlock(&info->lock);
+#else
+			mss.resident = inode->i_mapping->nrpages * PAGE_SIZE;
+#endif
+		}
+	}
+#endif
+
 	show_map_vma(m, vma);
 
 	seq_printf(m,
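For testing I use a small userspace program along these lines (illustrative
only, error checking omitted): it creates a SysV shm segment, faults every
page in, and dumps the SYSV mapping from its own smaps, where Rss:/Swap:
should follow the shmem inode's nrpages/swapped counts with the fix-up
applied.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/ipc.h>
#include <sys/shm.h>

int main(void)
{
	size_t size = 16 * 1024 * 1024;		/* 16 MB segment */
	int id = shmget(IPC_PRIVATE, size, IPC_CREAT | 0600);
	char *p = shmat(id, NULL, 0);
	char cmd[64];

	memset(p, 0xaa, size);			/* fault all pages in */

	/* show the SYSV mapping of our own smaps */
	snprintf(cmd, sizeof(cmd),
		 "grep -A 10 SYSV /proc/%d/smaps", getpid());
	system(cmd);

	shmdt(p);
	shmctl(id, IPC_RMID, NULL);
	return 0;
}

After pushing the segment out to swap (e.g. under memory pressure), the
Swap: line of that mapping should show the swapped-out size instead of
staying at zero.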