Hugh Dickins wrote:
> On Thu, 12 Jul 2012, Jeff Liu wrote:
> > On 07/12/2012 07:01 AM, Dave Chinner wrote:
> > > On Wed, Jul 11, 2012 at 11:55:34AM -0700, Hugh Dickins wrote:
> > >>
> > >> But your vote would count for a lot more if you know of some app which
> > >> would really benefit from this functionality in tmpfs: I've heard
> > >> of none.
...
[Jeff mentioned "cp"]

grep is another tool that would benefit. I often put very large files
(often sparse, too) on tmpfs file systems and would like
"grep -r PAT /tmp" to work well in spite of those files.

Please consider restoring SEEK_HOLE/SEEK_DATA support for tmpfs.

The lack of cross-FS SEEK_HOLE/SEEK_DATA support is a bit of a thorn in
our sides. FIEMAP is not a viable option, and SEEK_HOLE support works
only if you happen to be using btrfs, xfs, ocfs2 or 3.5.0-rcN tmpfs.
That is not something we can rely on for a feature whose lack can
convert grep -r into a memory-hogging, apparently-hung job or an
OOM-killer target.

What would you like to happen when you run (deliberately or
inadvertently) grep on a large sparse file? I want it to search only
the non-HOLE sections of that file, especially when examining a hole
involves accumulating a "line" that may be so long that it exhausts
virtual memory.

We're not quite there, but for now we can at least avoid the VM-abusing
behavior with the --binary-files=without-match option, which says to
treat "binary" (sparse) files as if they contain no match. Sometimes.

With working SEEK_HOLE support, grep does the right thing here:

    (${AWK-awk} 'BEGIN{ for (i=0;i<1000;i++) printf "%080d\n", 0 }' < /dev/null
     echo x | dd bs=1024k seek=8000000 ) >8T-or-so
    $ env time --format=%e grep x 8T-or-so
    0.00

But without SEEK_HOLE support, and with a lot of memory, grep takes a
long time to allocate all of that space before it finally chokes or is
killed.
Here, it takes 46 seconds before running out of memory:

    $ env time grep --binary-file=without-match x 8T-or-so
    grep: memory exhausted
    3.15user 25.48system 0:46.46elapsed 61%CPU\
      (0avgtext+0avgdata 12583712maxresident)k
    0inputs+8outputs (0major+2733623minor)pagefaults 0swaps
    [Exit 2]

Until very recently, grep was trying to guess whether an input has a
hole using st_blocks and st_size, but with file systems now using
compression, that method is too subject to false positives.

Ideally we would use SEEK_HOLE/SEEK_DATA, but until that is useful on
more Linux file systems, I suspect we'll have to choose our method based
on the file system type (at the cost of a statvfs call for each st_dev),
possibly in combination with the Linux kernel version.

Here's some background/discussion on the topic, including the original
report about the st_blocks-based heuristic not working:

    http://thread.gmane.org/gmane.comp.gnu.grep.bugs/4604/focus=4610

In case you want to see the SEEK_HOLE-using code, grep's file_is_binary
function is here:

    http://git.savannah.gnu.org/cgit/grep.git/tree/src/main.c#n439
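
To illustrate what "search only the non-HOLE sections" involves, here is
a minimal sketch in Python. It is not grep's actual C code, and the
function name data_extents is hypothetical. It assumes a platform where
Python exposes os.SEEK_DATA and os.SEEK_HOLE (Linux, Python 3.3 or
later). On a file system without real hole support, the kernel's generic
lseek typically reports the whole file as one data extent, so the walk
still terminates but gains nothing.

```python
import errno
import os


def data_extents(path):
    """Yield (start, end) byte ranges that the file system reports as data.

    Hypothetical helper for illustration; holes are skipped by alternating
    SEEK_DATA and SEEK_HOLE instead of reading the bytes.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        offset = 0
        while offset < size:
            try:
                # Jump to the next byte that is not inside a hole.
                start = os.lseek(fd, offset, os.SEEK_DATA)
            except OSError as e:
                if e.errno == errno.ENXIO:
                    break  # nothing but a trailing hole remains
                raise
            # End of this data run; end-of-file counts as a hole.
            end = os.lseek(fd, start, os.SEEK_HOLE)
            if end <= start:
                break  # defensive: never loop without making progress
            yield start, end
            offset = end
    finally:
        os.close(fd)
```

A grep-like tool would then read and scan only the returned ranges, for
example sum(e - s for s, e in data_extents("8T-or-so")) gives the number
of bytes it would actually have to examine.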