On Thu, 2019-04-25 at 16:15 +0800, Ming Lei wrote:
> On Thu, Apr 25, 2019 at 4:13 PM Qian Cai <cai@xxxxxx> wrote:
> >
> > Memory offline [1] starts to fail on linux-next on ppc64le with
> > page_alloc.shuffle=1 where the "echo offline" command hangs with lots of
> > migrating failures below. It seems in migrate_page_move_mapping()
> >
> >         if (!mapping) {
> >                 /* Anonymous page without mapping */
> >                 if (page_count(page) != expected_count)
> >                         return -EAGAIN;
> >
> > It expected count=1 but actual count=2.
> >
> > There are two ways to make the problem go away. One is to remove this line in
> > __shuffle_free_memory(),
> >
> >         shuffle_zone(z);
> >
> > The other is reverting some bio commits. Bisecting so far indicates the culprit
> > is in one of those (the 3rd commit looks more suspicious than the others).
> >
> > block: only allow contiguous page structs in a bio_vec
> > block: don't allow multiple bio_iov_iter_get_pages calls per bio
> > block: change how we get page references in bio_iov_iter_get_pages
> >
> > [ 446.578064] migrating pfn 2003d5eaa failed ret:22
> > [ 446.578066] page:c00a00800f57aa80 count:2 mapcount:0 mapping:c000001db4c827e9 index:0x13c08a
> > [ 446.578220] anon
> > [ 446.578222] flags: 0x83fffc00008002e(referenced|uptodate|dirty|active|swapbacked)
> > [ 446.578347] raw: 083fffc00008002e c00a00800f57f808 c00a00800f579f88 c000001db4c827e9
> > [ 446.944807] raw: 000000000013c08a 0000000000000000 00000002ffffffff c00020141a738008
> > [ 446.944883] page dumped because: migration failure
> > [ 446.944948] page->mem_cgroup:c00020141a738008
> > [ 446.945024] page allocated via order 0, migratetype Movable, gfp_mask 0x100cca(GFP_HIGHUSER_MOVABLE)
> > [ 446.945148] prep_new_page+0x390/0x3a0
> > [ 446.945228] get_page_from_freelist+0xd9c/0x1bf0
> > [ 446.945292] __alloc_pages_nodemask+0x1cc/0x1780
> > [ 446.945335] alloc_pages_vma+0xc0/0x360
> > [ 446.945401] do_anonymous_page+0x244/0xb20
> > [ 446.945472] __handle_mm_fault+0xcf8/0xfb0
> > [ 446.945532] handle_mm_fault+0x1c0/0x2b0
> > [ 446.945615] __get_user_pages+0x3ec/0x690
> > [ 446.945652] get_user_pages_unlocked+0x104/0x2f0
> > [ 446.945693] get_user_pages_fast+0xb0/0x200
> > [ 446.945762] iov_iter_get_pages+0xf4/0x6a0
> > [ 446.945802] bio_iov_iter_get_pages+0xc0/0x450
> > [ 446.945876] blkdev_direct_IO+0x2e0/0x630
> > [ 446.945941] generic_file_read_iter+0xbc/0x230
> > [ 446.945990] blkdev_read_iter+0x50/0x80
> > [ 446.946031] aio_read+0x128/0x1d0
> > [ 446.946082] migrating pfn 2003d5fe0 failed ret:22
> > [ 446.946084] page:c00a00800f57f800 count:2 mapcount:0 mapping:c000001db4c827e9 index:0x13c19e
> > [ 446.946239] anon
> > [ 446.946241] flags: 0x83fffc00008002e(referenced|uptodate|dirty|active|swapbacked)
> > [ 446.946384] raw: 083fffc00008002e c000200deb3dfa28 c00a00800f57aa88 c000001db4c827e9
> > [ 446.946497] raw: 000000000013c19e 0000000000000000 00000002ffffffff c00020141a738008
> > [ 446.946605] page dumped because: migration failure
> > [ 446.946662] page->mem_cgroup:c00020141a738008
> > [ 446.946724] page allocated via order 0, migratetype Movable, gfp_mask 0x100cca(GFP_HIGHUSER_MOVABLE)
> > [ 446.946846] prep_new_page+0x390/0x3a0
> > [ 446.946899] get_page_from_freelist+0xd9c/0x1bf0
> > [ 446.946959] __alloc_pages_nodemask+0x1cc/0x1780
> > [ 446.947047] alloc_pages_vma+0xc0/0x360
> > [ 446.947101] do_anonymous_page+0x244/0xb20
> > [ 446.947143] __handle_mm_fault+0xcf8/0xfb0
> > [ 446.947200] handle_mm_fault+0x1c0/0x2b0
> > [ 446.947256] __get_user_pages+0x3ec/0x690
> > [ 446.947306] get_user_pages_unlocked+0x104/0x2f0
> > [ 446.947366] get_user_pages_fast+0xb0/0x200
> > [ 446.947458] iov_iter_get_pages+0xf4/0x6a0
> > [ 446.947515] bio_iov_iter_get_pages+0xc0/0x450
> > [ 446.947588] blkdev_direct_IO+0x2e0/0x630
> > [ 446.947636] generic_file_read_iter+0xbc/0x230
> > [ 446.947703] blkdev_read_iter+0x50/0x80
> > [ 446.947758] aio_read+0x128/0x1d0
> >
> > [1]
> > i=0
> > found=0
> > for mem in $(ls -d /sys/devices/system/memory/memory*); do
> >     i=$((i + 1))
> >     echo "iteration: $i"
> >     echo offline > $mem/state
> >     if [ $? -eq 0 ] && [ $found -eq 0 ]; then
> >         found=1
> >         continue
> >     fi
> >     echo online > $mem/state
> > done
>
> Please try the following patch:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git/commit/?h=for-5.2/block&id=0257c0ed5ea3de3e32cb322852c4c40bc09d1b97

It works great so far!