On Sun, Apr 12, 2015 at 11:48:23PM +0900, Minchan Kim wrote:
> Hello Hugh,
>
> On Sat, Apr 11, 2015 at 02:40:46PM -0700, Hugh Dickins wrote:
> > On Wed, 11 Mar 2015, Minchan Kim wrote:
> >
> > > Basically, MADV_FREE relies on the pte dirty to decide whether
> > > it allows VM to discard the page. However, if there is swap-in,
> > > pte pointed out the page has no pte_dirty. So, MADV_FREE checks
> > > PageDirty and PageSwapCache for those pages to not discard it
> > > because swapped-in page could live on swap cache or PageDirty
> > > when it is removed from swapcache.
> > >
> > > The problem in here is that anonymous pages can have PageDirty if
> > > it is removed from swapcache so that VM cannot parse those pages
> > > as freeable even if we did madvise_free. Look at below example.
> > >
> > > ptr = malloc();
> > > memset(ptr);
> > > ..
> > > heavy memory pressure -> swap-out all of pages
> > > ..
> > > out of memory pressure so there are lots of free pages
> > > ..
> > > var = *ptr; -> swap-in page/remove the page from swapcache. so pte_clean
> > >                but SetPageDirty
> > >
> > > madvise_free(ptr);
> > > ..
> > > ..
> > > heavy memory pressure -> VM cannot discard the page by PageDirty.
> > >
> > > PageDirty for anonymous page aims for avoiding duplicating
> > > swapping out. In other words, if a page have swapped-in but
> > > live swapcache(ie, !PageDirty), we could save swapout if the page
> > > is selected as victim by VM in future because swap device have
> > > kept previous swapped-out contents of the page.
> > >
> > > So, rather than relying on the PG_dirty for working madvise_free,
> > > pte_dirty is more straightforward. Inherently, swapped-out page was
> > > pte_dirty so this patch restores the dirtiness when swap-in fault
> > > happens so madvise_free doesn't rely on the PageDirty any more.
> > >
> > > Cc: Hugh Dickins <hughd@xxxxxxxxxx>
> > > Cc: Cyrill Gorcunov <gorcunov@xxxxxxxxx>
> > > Cc: Pavel Emelyanov <xemul@xxxxxxxxxxxxx>
> > > Reported-by: Yalin Wang <yalin.wang@xxxxxxxxxxxxxx>
> > > Signed-off-by: Minchan Kim <minchan@xxxxxxxxxx>
> >
> > Sorry, but NAK to this patch,
> > mm-make-every-pte-dirty-on-do_swap_page.patch in akpm's mm tree
> > (I hope it hasn't reached linux-next yet).
> >
> > You may well be right that pte_dirty<->PageDirty can be handled
> > differently, in a way more favourable to MADV_FREE. And this patch
> > may be a step in the right direction, but I've barely given it thought.
> >
> > As it stands, it segfaults more than any patch I've seen in years:
> > I just tried applying it to 4.0-rc7-mm1, and running kernel builds
> > in low memory with swap. Even if I leave KSM out, and memcg out, and
> > swapoff out, and THP out, and tmpfs out, it still SIGSEGVs very soon.
> >
> > I have a choice: spend a few hours tracking down the errors, and
> > post a fix patch on top of yours? But even then I'd want to spend
> > a lot longer thinking through every dirty/Dirty in the source before
> > I'd feel comfortable to give an ack.
> >
> > This is users' data, and we need to be very careful with it: errors
> > in MADV_FREE are one thing, for now that's easy to avoid; but in this
> > patch you're changing the rules for Anon PageDirty for everyone.
> >
> > I think for now I'll have to leave it to you to do much more source
> > diligence and testing, before coming back with a corrected patch for
> > us then to review, slowly and carefully.
>
> Sorry for my bad. I will keep your advice in mind.
> I will investigate the problem as soon as I get back to work
> after vacation.
>
> Thanks for the review.

When I look at the code, migration doesn't restore the dirty bit of the pte
in remove_migration_pte and relies on PG_dirty, which was set by
try_to_unmap_one. I think that was the reason you saw the segfault.
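
The suspected sequence can be sketched as a toy userspace model in C. This is
not kernel code: every toy_* name is hypothetical, the structs only stand in
for PG_dirty and the pte dirty bit, and the discard rule (a mapped page with a
clean pte counts as freeable) is an assumption about how a pte-only check
would behave.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model, NOT kernel code. All toy_* names are hypothetical. */
struct toy_page { bool pg_dirty; };            /* stands in for PG_dirty */
struct toy_pte  { bool present; bool dirty; }; /* stands in for the pte dirty bit */

/* Models the unmap step: pte dirtiness is transferred to the page flag. */
static void toy_unmap(struct toy_pte *pte, struct toy_page *page)
{
    if (pte->dirty)
        page->pg_dirty = true;
    pte->present = false;
    pte->dirty = false;
}

/*
 * Models the remap step after migration. With restore_dirty == false the
 * new pte comes back clean, which is the behaviour described above.
 */
static void toy_remap(struct toy_pte *pte, const struct toy_page *page,
                      bool restore_dirty)
{
    assert(!pte->present); /* only an unmapped pte can be remapped */
    pte->present = true;
    pte->dirty = restore_dirty && page->pg_dirty;
}

/* Assumed discard rule: a mapped page with a clean pte counts as freeable. */
static bool toy_can_discard(const struct toy_pte *pte)
{
    return pte->present && !pte->dirty;
}

/* Runs write -> unmap -> remap and reports whether the data could be dropped. */
static bool toy_discardable_after_migration(bool restore_dirty)
{
    struct toy_page page = { .pg_dirty = false };
    struct toy_pte pte = { .present = true, .dirty = true }; /* user wrote data */

    toy_unmap(&pte, &page);
    toy_remap(&pte, &page, restore_dirty);
    return toy_can_discard(&pte);
}
```

In this model, toy_discardable_after_migration(false) returns true: the page
still holds user data, yet a check that trusts only the pte would treat it as
freeable. With restore_dirty set to true it returns false.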
I will spend more time investigating other code paths that might also fail
to restore the dirty bit.

Thanks.

>
> --
> Kind regards,
> Minchan Kim

--
Kind regards,
Minchan Kim