On Thu, Apr 30, 2015 at 05:25:31PM +0900, Minchan Kim wrote: > On Wed, Apr 29, 2015 at 12:12:48PM +0300, Vladimir Davydov wrote: > > On Wed, Apr 29, 2015 at 01:35:36PM +0900, Minchan Kim wrote: > > > On Tue, Apr 28, 2015 at 03:24:42PM +0300, Vladimir Davydov wrote: > > > > +#ifdef CONFIG_IDLE_PAGE_TRACKING > > > > +static struct page *kpageidle_get_page(unsigned long pfn) > > > > +{ > > > > + struct page *page; > > > > + > > > > + if (!pfn_valid(pfn)) > > > > + return NULL; > > > > + page = pfn_to_page(pfn); > > > > + /* > > > > + * We are only interested in user memory pages, i.e. pages that are > > > > + * allocated and on an LRU list. > > > > + */ > > > > + if (!page || page_count(page) == 0 || !PageLRU(page)) > > > > + return NULL; > > > > + if (!get_page_unless_zero(page)) > > > > + return NULL; > > > > + if (unlikely(!PageLRU(page))) { > > > > > > What lock protect the check PageLRU? > > > If it is racing ClearPageLRU, what happens? > > > > If we hold a reference to a page and see that it's on an LRU list, it > > will surely remain a user memory page at least until we release the > > reference to it, so it must be safe to play with idle/young flags. If we > > The problem is that you pass the page in rmap reverse logic(ie, page_referenced) > once you judge it's LRU page so if it is false-positive, what happens? > A question is SetPageLRU, PageLRU, ClearPageLRU keeps memory ordering? > IOW, all of fields from struct page rmap can acccess should be set up completely > before LRU checking. Otherwise, something will be broken. So, basically you are concerned about the case when we encounter a freshly allocated page, which has PG_lru bit set and it's going to become anonymous, but it is still in the process of rmap initialization, i.e. its ->mapping or ->mapcount may still be uninitialized, right? AFAICS, page_referenced should handle such pages fine. Look, it only needs ->index, ->mapping, and ->mapcount. If ->mapping is unset, than it is NULL and rmap_walk_anon_lock -> page_lock_anon_vma_read will return NULL so that rmap_walk will be a no-op. If ->index is not initialized, than at worst we will go to anon_vma_interval_tree_foreach over a wrong interval, in which case we will see that the page is actually not mapped in page_referenced_one -> page_check_address and again do nothing. If ->mapcount is not initialized it is -1, and page_lock_anon_vma_read will return NULL, just as it does in case ->mapping = NULL. For file pages, we always take PG_locked before checking ->mapping, so it must be valid. Thanks, Vladimir -- To unsubscribe from this list: send the line "unsubscribe cgroups" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html