On Thu, 22 Oct 2015, Ian Kent wrote:
> On Wed, 2015-10-21 at 12:56 -0700, Hugh Dickins wrote:
> > On Wed, 21 Oct 2015, Ian Kent wrote:
>
> Thanks for taking the time to reply Hugh.
>
> > > Hi all,
> > >
> > > I've been looking through some of the page reclaim code and at
> > > truncate_inode_pages().
> > >
> > > I'm not familiar with the code and I'm struggling to understand it.
> > >
> > > One thing that is puzzling me right now is, if a file has pages that
> > > have been modified and are swapped out when pagevec_lookup_entries() is
> > > called will they be found?
> >
> > truncate_inode_pages() is a library function which a filesystem calls
> > at some stage in its inode truncation processing, to take all the incore
> > pages out of pagecache (out of its radix_tree), and free them up
> > (usually: some might be otherwise pinned in memory at the time).
> >
> > A filesystem will have other work to do, very particular to that
> > filesystem, to free up the actual disk blocks: that's definitely
> > not part of truncate_inode_pages()'s job.
> >
> > It's also called when evicting an inode no longer needed in memory,
> > to free the associated pagecache, when not deleting the blocks on disk.
> >
> > I think I don't understand your "swapped out": modifications occur to
> > a page while it is in pagecache, and those modifications need to be
> > written back to disk before that page can be reclaimed for other use.
>
> Indeed, now I think about it, "swapped out" is a bad choice of words
> when talking about a paged IO system.
>
> What I'm trying to say is if pages allocated to a mapping are modified,
> then under memory pressure, are they ever reclaimed by writing them to
> swap storage or are they always reclaimed by writing them back to disk?
>
> Now I think about what you've said here and looking at the code I
> suspect the answer is they are always reclaimed by writing them to
> disk.

Yes.
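
As an illustration of the rule confirmed above (a modified pagecache page is
written back to its file before it can be reclaimed, and is read back from
the file if it is needed again), here is a toy userspace model in C. It is
only a sketch, not kernel code: every name in it (toy_disk, toy_page,
toy_mapping, toy_read_page, toy_write_byte, toy_reclaim_page) is hypothetical,
and locking, the radix tree and asynchronous writeback are all left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_PAGE_SIZE 8   /* tiny pages keep the example readable */
#define TOY_NR_PAGES  4   /* the toy file is four pages long */

/* Backing store: stands in for the file's blocks on disk. */
struct toy_disk {
	char blocks[TOY_NR_PAGES][TOY_PAGE_SIZE];
};

/* One pagecache slot, covering one page index of the file. */
struct toy_page {
	bool present;               /* page is currently cached */
	bool dirty;                 /* modified since last written back */
	char data[TOY_PAGE_SIZE];
};

/* Hypothetical stand-in for a mapping: cached pages plus a page count. */
struct toy_mapping {
	struct toy_disk *disk;
	struct toy_page pages[TOY_NR_PAGES];
	unsigned long nrpages;      /* number of pages currently cached */
};

/* Look the page up in the cache, reading it from disk if it is absent. */
static struct toy_page *toy_read_page(struct toy_mapping *m, unsigned long index)
{
	assert(index < TOY_NR_PAGES);
	struct toy_page *p = &m->pages[index];

	if (!p->present) {
		memcpy(p->data, m->disk->blocks[index], TOY_PAGE_SIZE);
		p->present = true;
		p->dirty = false;
		m->nrpages++;
	}
	return p;
}

/* Modify a page: only the cached copy changes, and it becomes dirty. */
static void toy_write_byte(struct toy_mapping *m, unsigned long index,
			   unsigned int off, char c)
{
	assert(off < TOY_PAGE_SIZE);
	struct toy_page *p = toy_read_page(m, index);

	p->data[off] = c;
	p->dirty = true;
}

/* Reclaim one page: write it back first if dirty, then drop it. */
static void toy_reclaim_page(struct toy_mapping *m, unsigned long index)
{
	assert(index < TOY_NR_PAGES);
	struct toy_page *p = &m->pages[index];

	if (!p->present)
		return;
	if (p->dirty) {
		/* Writeback goes to the file's own blocks, never to swap. */
		memcpy(m->disk->blocks[index], p->data, TOY_PAGE_SIZE);
		p->dirty = false;
	}
	p->present = false;
	m->nrpages--;
}
```

In this model toy_write_byte() changes only the cached copy; toy_reclaim_page()
copies a dirty page to the backing array before dropping it, and a later
toy_read_page() finds the modified data there. No swap area appears anywhere.
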

>
> >
> > >
> > > If not then how does truncate_inode_pages(_range)() handle waiting for
> > > these pages to be swapped back in to perform the writeback and
> > > truncation?
> >
> > Pages are never "swapped back in to perform the writeback":
> > if writeback is needed, it's done before the page can be freed from
> > pagecache; and if that data is needed again after the page was freed,
> > it's read back in from disk to fresh page.
>
> That makes sense, using swap would be unnecessary double handling.
>
> >
> > You may be worrying about what happens when a page is modified or
> > under writeback when it is truncated: I think that's something each
> > filesystem has to be careful of, and may deal with in different ways.
>
> I'm wondering how a mapping nrpages can be non-zero (read greater than
> one) after calling truncate_inode_pages().
>
> But I'm looking at a much older kernel so it's quite different to
> current upstream and this seemed like a question relevant to both
> kernels to get some idea of how page reclaim works.
>
> I guess what I'm really looking to work out is if it's possible, with
> the current upstream kernel, for a mapping to have nrpages greater than
> 1 after calling truncate_inode_pages() and hopefully get some
> explanation of why if that's not so.

I assume you're worrying about a truncate_inode_pages(mapping, 0).
If it's truncate_inode_pages(mapping, 1), or lstart anything greater
than 0, then it will leave behind the incompletely truncated pages at
the start: no mystery in that.

>
> It's certainly possible with the older kernel I'm looking at but I need
> some info. before I consider looking for possible changes to back port.

Probably what you're looking for is Jan Kara's v3.0 commit 08142579b6ca
"mm: fix assertion mapping->nrpages == 0 in end_writeback()".

>
> >
> > I'm not sure how much to read in to your use of the word "swap".
> > It's true that shmem/tmpfs uses swap (of the swapon/swapoff variety)
> > as backing for its pages when under pressure (and uses its own variant
> > shmem_undo_range() to manage that, instead of truncate_inode_pages()),
> > but most filesystems don't use "swap" at all.
> >
> > I just noticed your subject "paged I/O sub system": I hope you realize
> > that mm/page_io.c is solely concerned with swap (of the swapon/swapoff
> > variety), and has next to nothing to do with filesystems.  (Just as,
> > conversely, mm/swap.c has next to nothing to do with swap.)
>
> LOL, right, I'm looking at the page reclaim code which, so far, hasn't
> lead me to either of those source files.
>
> >
> > >
> > > Anyone, please?
> >
> > I hope something I've said there has helped, but warn you that
> > I'm a terrible person to engage in an extended conversation with!
> > Expect long silences, pray for someone else to jump in.
>
> As well as pointing out that swap storage shouldn't be used in this
> case you've reminded me of the difference between swapping and demand
> paging, so that's a good start.

So long as you leave it as a distant memory: you're right that
"swapping" used to mean copying out a whole process to disk and
reading in another, but Linux never implemented it that way: it's
always been paging out to and in from the swap medium, much like
demand paging from file.  (I say "never" and "always": I think
that's so, but I don't really know beyond v2.4.0.)

Hugh

>
> Perhaps folks at linux-mm will have more to say.
>
>
> > > Ian

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .