On Thu, Feb 22, 2018 at 6:28 PM, Minchan Kim <minchan@xxxxxxxxxx> wrote: > Plese don't include things not related to this patch. Of course. The inclusion of the stray hunk was unintentional; I didn't mean to suggest that bundling unrelated changes was somehow a good thing. > Furthermore, please use plain-text mail client. > You mangled all of text. It makes communication hard in LKML. Sorry about that. I'll avoid using that email client in the future. >> > diff --git a/include/linux/mm.h b/include/linux/mm.h >> > index ad06d42adb1a..f8129afebbdd 100644 >> > --- a/include/linux/mm.h >> > +++ b/include/linux/mm.h >> > @@ -1507,14 +1507,28 @@ extern int mprotect_fixup(struct vm_area_struct >> *vma, >> > */ >> > int __get_user_pages_fast(unsigned long start, int nr_pages, int write, >> > struct page **pages); >> > + >> > +#ifdef SPLIT_RSS_COUNTING >> > +/* Flush all task-buffered MM counters to the mm */ >> > +void sync_mm_rss_all_users(struct mm_struct *mm); >> >> Really heavy functioin iterates all of processes and threads. >> >> >> Just all processes and the threads of each process attached to the mm. >> Maybe that's not much better. >> >> >> > +#endif >> > + >> > /* >> > * per-process(per-mm_struct) statistics. >> > */ >> > static inline unsigned long get_mm_counter(struct mm_struct *mm, int >> member) >> > { >> > - long val = atomic_long_read(&mm->rss_stat.count[member]); >> > + long val; >> > >> > #ifdef SPLIT_RSS_COUNTING >> > + if (atomic_xchg(&mm->rss_stat.dirty, 0)) >> > + sync_mm_rss_all_users(mm); >> >> So, if we dirty _a_ page, should we iterate all of processes and threads? >> Even, get_mm_counter would be used places without requiring accurate >> numbers. I think you can sync stats on place you really need to rather >> than adding this. >> >> I'd like to see all_threads_sync_mm_rss(mm_struct mm_struct *mm) which >> iterates >> just current's thread group(unless others are against) suggested by peterz. >> And then let's put it on places where you really need(e.g., >> fs/proc/task_mmu.c >> somewhere). >> >> >> I thought about doing it that way, but it seemed odd that reading stats >> from proc should have the side effect of updating counters that things like >> the OOM killer and page scanning might use for their decisions. > > I understand your concern but sync is not cheap if we should iterate all of > tasks. So each call site need to be reviewed which is more critical between > performance and accuracy. If we _really_ should be accurate in all of places, > then we should consider other way to avoid iterating of task_structs, IMO. > I guess it could make a long discussion thread about _it's really worth to do_. I'm thinking that *in general*, stale values can have unforeseen undesired effects, especially if the values become up-to-date when you try to view them, making it hard to debug the source of the problem --- so if it's at all possible to make everyone see up-to-date values, we should do that. If it's not possible, sure, we can tolerate stale values. Do you think my list-of-dirty-tasks proposal would be cheap enough? In that scheme, if you dirty one page, you look at only that one task. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>