On Mon, Apr 24, 2017 at 10:30 AM, Kirill A. Shutemov <kirill@xxxxxxxxxxxxx> wrote:
> On Mon, Apr 24, 2017 at 10:23:59AM -0700, Dan Williams wrote:
>> On Sun, Apr 23, 2017 at 4:31 PM, Kirill A. Shutemov
>> <kirill@xxxxxxxxxxxxx> wrote:
>> > On Thu, Apr 20, 2017 at 02:46:51PM -0700, Dan Williams wrote:
>> >> On Sat, Mar 18, 2017 at 2:52 AM, tip-bot for Kirill A. Shutemov
>> >> <tipbot@xxxxxxxxx> wrote:
>> >> > Commit-ID:  2947ba054a4dabbd82848728d765346886050029
>> >> > Gitweb:     http://git.kernel.org/tip/2947ba054a4dabbd82848728d765346886050029
>> >> > Author:     Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
>> >> > AuthorDate: Fri, 17 Mar 2017 00:39:06 +0300
>> >> > Committer:  Ingo Molnar <mingo@xxxxxxxxxx>
>> >> > CommitDate: Sat, 18 Mar 2017 09:48:03 +0100
>> >> >
>> >> > x86/mm/gup: Switch GUP to the generic get_user_page_fast() implementation
>> >> >
>> >> > This patch provides all required callbacks required by the generic
>> >> > get_user_pages_fast() code and switches x86 over - and removes
>> >> > the platform specific implementation.
>> >> >
>> >> > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
>> >> > Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
>> >> > Cc: Aneesh Kumar K . V <aneesh.kumar@xxxxxxxxxxxxxxxxxx>
>> >> > Cc: Borislav Petkov <bp@xxxxxxxxx>
>> >> > Cc: Catalin Marinas <catalin.marinas@xxxxxxx>
>> >> > Cc: Dann Frazier <dann.frazier@xxxxxxxxxxxxx>
>> >> > Cc: Dave Hansen <dave.hansen@xxxxxxxxx>
>> >> > Cc: H. Peter Anvin <hpa@xxxxxxxxx>
>> >> > Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
>> >> > Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
>> >> > Cc: Rik van Riel <riel@xxxxxxxxxx>
>> >> > Cc: Steve Capper <steve.capper@xxxxxxxxxx>
>> >> > Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
>> >> > Cc: linux-arch@xxxxxxxxxxxxxxx
>> >> > Cc: linux-mm@xxxxxxxxx
>> >> > Link: http://lkml.kernel.org/r/20170316213906.89528-1-kirill.shutemov@xxxxxxxxxxxxxxx
>> >> > [ Minor readability edits. ]
>> >> > Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
>> >>
>> >> I'm still trying to spot the bug, but bisect points to this patch as
>> >> the point at which my unit tests start failing with the following
>> >> signature:
>> >>
>> >> [ 35.423841] WARNING: CPU: 8 PID: 245 at lib/percpu-refcount.c:155
>> >> percpu_ref_switch_to_atomic_rcu+0x1f5/0x200
>> >
>> > Okay, I've tracked it down. The issue is triggered by the replacement
>> > of get_page() with page_cache_get_speculative().
>> >
>> > page_cache_get_speculative() doesn't have get_zone_device_page(). :-|
>> >
>> > And I think it's your bug, Dan: it's wrong to have
>> > get_/put_zone_device_page() in get_/put_page(). It must be handled by
>> > page_ref_* machinery to catch all cases where we manipulate the page
>> > refcount.
>>
>> The page_ref conversion landed in 4.6 *after* the ZONE_DEVICE
>> implementation that landed in 4.5, so there was a missed conversion of
>> the zone-device reference counting to page_ref.
>
> Fair enough.
>
> But get_page_unless_zero() definitely predates ZONE_DEVICE. :)
>

It does, but that's deliberate. A ZONE_DEVICE page never has a zero
reference count: it's always owned by the device, never by the page
allocator. ZONE_DEVICE overrides the ->lru list_head to store private
device information, and we rely on the behavior that a non-zero
reference means the page is not added to any lru or page cache list.
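
To make the two points above concrete, here is a rough user-space
model. It is only a sketch, not kernel code: struct model_page, the
model_* helpers, and the plain atomic counter standing in for the
per-device percpu_ref are all hypothetical names invented for
illustration. It assumes a ZONE_DEVICE page starts with a refcount of
1 (held by the device), so the "unless zero" check always passes, and
it shows how a get that bypasses the get_page() hook, paired with a
put_page() that still runs the hook, leaves the device reference
unbalanced.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page; not the kernel layout. */
struct model_page {
    atomic_int refcount;   /* models the page reference count */
    bool is_zone_device;   /* models is_zone_device_page() */
    atomic_long *dev_ref;  /* models the per-device reference (a percpu_ref in the kernel) */
};

/* Models get_page_unless_zero(): take a reference only if one is already held. */
static bool model_get_page_unless_zero(struct model_page *page)
{
    int old = atomic_load(&page->refcount);

    while (old != 0) {
        /* On failure 'old' is refreshed with the current value and we retry. */
        if (atomic_compare_exchange_weak(&page->refcount, &old, old + 1))
            return true;
    }
    return false; /* refcount was zero: the page is not ours to take */
}

/* Models get_page() with the ZONE_DEVICE hook placed in get_page() itself. */
static void model_get_page(struct model_page *page)
{
    if (page->is_zone_device)
        atomic_fetch_add(page->dev_ref, 1);
    atomic_fetch_add(&page->refcount, 1);
}

/* Models put_page() with the matching ZONE_DEVICE hook. */
static void model_put_page(struct model_page *page)
{
    int old = atomic_fetch_sub(&page->refcount, 1);

    assert(old > 0); /* the page refcount itself must never underflow */
    if (page->is_zone_device)
        atomic_fetch_sub(page->dev_ref, 1);
}
```

Pairing model_get_page() with model_put_page() leaves the device
counter where it started. Pairing model_get_page_unless_zero() with
model_put_page() drops the device counter by one, which in this model
plays the role of the imbalance behind the percpu-refcount warning.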