Hi Konstantin,

This patch is already upstream, but I have another idea for implementing
a similar feature. So let me review this now, and I'll post patches to
complement it.

On Wed, Feb 26, 2014 at 11:57:23AM +0400, Konstantin Khlebnikov wrote:
> After this patch 'page-types' can walk on filesystem mappings and analyze
> populated page cache pages mostly without disturbing its state.
>
> It maps chunk of file, marks VMA as MADV_RANDOM to turn off readahead,
> pokes VMA via mincore() to determine cached pages, triggers page-fault
> only for them, and finally gathers information via pagemap/kpageflags.
> Before unmap it marks VMA as MADV_SEQUENTIAL for ignoring reference bits.

I think that with this patch page-types *does* disturb the page cache
(not only that of the target file), because it newly populates pages
that were not faulted in when page-types starts, which rotates the LRU
list and adds memory pressure. To minimize the measurement disturbance,
we need some help on the kernel side.

>
> usage: page-types -f <path>
>
> If <path> is directory it will analyse all files in all subdirectories.

I think -f was reserved for "Walk file address space", so doing a file
tree walk looks like overkill to me. You could add a "directory mode"
(-d) for this purpose, although it seems to me that we can/should do
this (for example) by combining page-types with the find command. I can
show you an example in my patch later.

> Symlinks are not followed as well as mount points. Hardlinks aren't handled,
> they'll be dumped as many times as they are found. Recursive walk brings all
> dentries into dcache and populates page cache of block-devices aka 'Buffers'.
>
> Probably it's worth to add ioctl for dumping file page cache as array of PFNs
> as a replacement for this hackish juggling with mmap/madvise/mincore/pagemap.
>
> Also recursive walk could be replaced with dumping cached inodes via some ioctl
> or debugfs interface followed by opening them via open_by_handle_at, this
> would fix hardlinks handling and unneeded population of dcache and buffers.
> This interface might be used as data source for constructing readahead plans
> and for background optimizations of actively used files.
>
> collateral changes:
> + fix 64-bit LFS: define _FILE_OFFSET_BITS instead of _LARGEFILE64_SOURCE
> + replace lseek + read with single pread

Good, thanks.

> + make show_page_range() reusable after flush
>
>
> usage example:
>
> ~/src/linux/tools/vm$ sudo ./page-types -L -f page-types
> foffset offset  flags
> page-types      Inode: 2229277  Size: 89065 (22 pages)
> Modify: Tue Feb 25 12:00:59 2014 (162 seconds ago)
> Access: Tue Feb 25 12:01:00 2014 (161 seconds ago)

I don't see why page-types needs to show this information. We have many
other tools to check file info, so this small program should focus on
page-related things.
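To make the residency probe from the quoted description concrete, here
is a rough sketch of only the mmap/madvise/mincore part. It is not the
code from the patch: the helper name count_resident_pages() is made up
for illustration, and it deliberately stops before the step that
touches pages, since that step is the one I am worried about.

```c
#define _DEFAULT_SOURCE   /* exposes mincore() and madvise() in glibc */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch): count how many pages in the
 * first `len` bytes of `fd` are reported as resident by mincore().
 * Returns the count, or -1 on error. It never dereferences the mapping,
 * so it does not fault any page in by itself.
 */
long count_resident_pages(int fd, size_t len)
{
    size_t page_size = (size_t)sysconf(_SC_PAGESIZE);
    size_t npages, i;
    unsigned char *vec;
    long resident = 0;
    void *map;

    assert(len > 0);
    npages = (len + page_size - 1) / page_size;

    map = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED)
        return -1;

    /* Advisory only: ask the kernel not to read ahead on this VMA. */
    (void)madvise(map, len, MADV_RANDOM);

    /* One status byte per page; bit 0 set means "resident". */
    vec = malloc(npages);
    if (!vec || mincore(map, len, vec) < 0) {
        free(vec);
        munmap(map, len);
        return -1;
    }

    for (i = 0; i < npages; i++)
        resident += vec[i] & 1;

    free(vec);
    munmap(map, len);
    return resident;
}
```

A caller would open the file read-only, fstat() it for the size, and
pass both in. Everything after this point in the quoted scheme (faulting
the resident pages and reading pagemap/kpageflags) is left out here.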
Thanks,
Naoya Horiguchi

>       0  3cbf3b  __RU_lA____M________________________
>       1  38946a  __RU_lA____M________________________
>       2  1a3cec  __RU_lA____M________________________
>       3  1a8321  __RU_lA____M________________________
>       4  3af7cc  __RU_lA____M________________________
>       5  1ed532  __RU_lA_____________________________
>       6  2e436a  __RU_lA_____________________________
>       7  29a35e  ___U_lA_____________________________
>       8  2de86e  ___U_lA_____________________________
>       9  3bdfb4  ___U_lA_____________________________
>      10  3cd8a3  ___U_lA_____________________________
>      11  2afa50  ___U_lA_____________________________
>      12  2534c2  ___U_lA_____________________________
>      13  1b7a40  ___U_lA_____________________________
>      14  17b0be  ___U_lA_____________________________
>      15  392b0c  ___U_lA_____________________________
>      16  3ba46a  __RU_lA_____________________________
>      17  397dc8  ___U_lA_____________________________
>      18  1f2a36  ___U_lA_____________________________
>      19  21fd30  __RU_lA_____________________________
>      20  2c35ba  __RU_l______________________________
>      21  20f181  __RU_l______________________________
>
>
>              flags  page-count  MB  symbolic-flags                        long-symbolic-flags
> 0x000000000000002c           2   0  __RU_l______________________________  referenced,uptodate,lru
> 0x0000000000000068          11   0  ___U_lA_____________________________  uptodate,lru,active
> 0x000000000000006c           4   0  __RU_lA_____________________________  referenced,uptodate,lru,active
> 0x000000000000086c           5   0  __RU_lA____M________________________  referenced,uptodate,lru,active,mmap
>              total          22   0
>
>
>
> ~/src/linux/tools/vm$ sudo ./page-types -f /
>              flags  page-count    MB  symbolic-flags                        long-symbolic-flags
> 0x0000000000000028       21761    85  ___U_l______________________________  uptodate,lru
> 0x000000000000002c      127279   497  __RU_l______________________________  referenced,uptodate,lru
> 0x0000000000000068       74160   289  ___U_lA_____________________________  uptodate,lru,active
> 0x000000000000006c       84469   329  __RU_lA_____________________________  referenced,uptodate,lru,active
> 0x000000000000007c           1     0  __RUDlA_____________________________  referenced,uptodate,dirty,lru,active
> 0x0000000000000228         370     1  ___U_l___I__________________________  uptodate,lru,reclaim
> 0x0000000000000828          49     0  ___U_l_____M________________________  uptodate,lru,mmap
> 0x000000000000082c         126     0  __RU_l_____M________________________  referenced,uptodate,lru,mmap
> 0x0000000000000868         137     0  ___U_lA____M________________________  uptodate,lru,active,mmap
> 0x000000000000086c       12890    50  __RU_lA____M________________________  referenced,uptodate,lru,active,mmap
>              total      321242  1254
>
> Signed-off-by: Konstantin Khlebnikov <koct9i@xxxxxxxxx>
> ---
>  tools/vm/page-types.c |  170 ++++++++++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 152 insertions(+), 18 deletions(-)
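
As a reading aid for the flags column in the quoted output, the hex
values map to the long names bit by bit. The sketch below is only an
illustration, not code taken from page-types.c: decode_kpf() and
pages_to_mb() are made-up names, the table lists only the bits that
occur in the output above, and the MB helper assumes 4 KiB pages
(consistent with "Size: 89065 (22 pages)").

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Subset of kpageflags bits: only those seen in the quoted output. */
static const struct {
    unsigned bit;
    const char *name;
} kpf_names[] = {
    {  2, "referenced" },
    {  3, "uptodate"   },
    {  4, "dirty"      },
    {  5, "lru"        },
    {  6, "active"     },
    {  9, "reclaim"    },
    { 11, "mmap"       },
};

/*
 * Hypothetical helper: write the comma-separated names of the set bits
 * into buf (always NUL-terminated when size > 0). Returns the number of
 * characters written, not counting the terminator.
 */
size_t decode_kpf(uint64_t flags, char *buf, size_t size)
{
    size_t used = 0, i;

    if (size == 0)
        return 0;
    buf[0] = '\0';

    for (i = 0; i < sizeof(kpf_names) / sizeof(kpf_names[0]); i++) {
        int n;

        if (!(flags & (UINT64_C(1) << kpf_names[i].bit)))
            continue;
        n = snprintf(buf + used, size - used, "%s%s",
                     used ? "," : "", kpf_names[i].name);
        if (n < 0 || (size_t)n >= size - used)
            break;                      /* output truncated, stop here */
        used += (size_t)n;
    }
    return used;
}

/* Page count to whole MiB, rounded down like the MB column appears to be. */
uint64_t pages_to_mb(uint64_t pages, uint64_t page_size)
{
    return (pages * page_size) >> 20;
}
```

For example, 0x86c has bits 2, 3, 5, 6 and 11 set, which gives
referenced,uptodate,lru,active,mmap, matching the last row of both
summary tables.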