Re: [Bug 75101] New: [bisected] s2disk / hibernate blocks on "Saving 506031 image data pages () ..."

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



  Hello,

On Mon 05-05-14 11:35:41, Johannes Weiner wrote:
> On Tue, Apr 29, 2014 at 03:24:37PM -0700, Andrew Morton wrote:
> > (switched to email.  Please respond via emailed reply-to-all, not via the
> > bugzilla web interface).
> > 
> > On Tue, 29 Apr 2014 20:13:44 +0000 bugzilla-daemon@xxxxxxxxxxxxxxxxxxx wrote:
> > 
> > > https://bugzilla.kernel.org/show_bug.cgi?id=75101
> > > 
> > >             Bug ID: 75101
> > >            Summary: [bisected] s2disk / hibernate blocks on "Saving 506031
> > >                     image data pages () ..."
> > >            Product: Memory Management
> > >            Version: 2.5
> > >     Kernel Version: v3.14
> > >           Hardware: All
> > >                 OS: Linux
> > >               Tree: Mainline
> > >             Status: NEW
> > >           Severity: normal
> > >           Priority: P1
> > >          Component: Other
> > >           Assignee: akpm@xxxxxxxxxxxxxxxxxxxx
> > >           Reporter: oliverml1@xxxxxxxxxxx
> > >         Regression: No
> > > 
> > > Created attachment 134271
> > >   --> https://bugzilla.kernel.org/attachment.cgi?id=134271&action=edit
> > > Full console trace with various SysRq outputs
> > > 
> > > Since v3.14 under normal desktop usage my s2disk/hibernate often blocks on the
> > > saving of the image data ("Saving 506031 image data pages () ...").
> > 
> > A means to reproduce as well as a bisection result.  Nice!  Thanks.
> > 
> > Johannes, could you please take a look?
> > 
> > > With following test I can reproduce the problem reliably:
> > > ---
> > > 0) Boot
> > > 
> > > 1) Fill ram with 2GiB (+50% in my case)
> > > 
> > > mount -t tmpfs tmpfs /media/test/
> > > dd if=/dev/zero of=/media/test/test0.bin bs=1k count=$[1024*1024]
> > > dd if=/dev/zero of=/media/test/test1.bin bs=1k count=$[1024*1024]
> > > 
> > > 2) Do s2disk 
> > > 
> > > s2disk
> > > 
> > > ---
> > > s2disk: Unable to switch virtual terminals, using the current console.
> > > s2disk: Snapshotting system
> > > s2disk: System snapshot ready. Preparing to write
> > > s2disk: Image size: 2024124 kilobytes
> > > s2disk: Free swap: 3791208 kilobytes
> > > s2disk: Saving 506031 image data pages (press backspace to abort) ...   0%
> > > 
> > > #Problem>: ... there is stays and blocks. SysRq still responds, so that I could
> > > trigger various debug outputs.
> 
> According to your dmesg s2disk is stuck in balance_dirty_pages():
> 
> [  215.645240] s2disk          D ffff88011fd93100     0  3323   3261 0x00000000
> [  215.645240]  ffff8801196d4110 0000000000000082 0000000000013100 ffff8801196d4110
> [  215.645240]  ffff8800365cdfd8 ffff880119ed9190 00000000ffffc16c ffff8800365cdbe8
> [  215.645240]  0000000000000032 0000000000000032 ffff8801196d4110 0000000000000000
> [  215.645240] Call Trace:
> [  215.645240]  [<ffffffff8162fdce>] ? schedule_timeout+0xde/0xff
> [  215.645240]  [<ffffffff81041be1>] ? ftrace_raw_output_tick_stop+0x55/0x55
> [  215.645240]  [<ffffffff81630987>] ? io_schedule_timeout+0x5d/0x7e
> [  215.645240]  [<ffffffff810cb035>] ? balance_dirty_pages_ratelimited+0x588/0x747
> [  215.645240]  [<ffffffff812d0795>] ? radix_tree_tag_set+0x69/0xc4
> [  215.645240]  [<ffffffff810c244e>] ? generic_file_buffered_write+0x1a8/0x21c
> [  215.645240]  [<ffffffff810c351e>] ? __generic_file_aio_write+0x1c7/0x1fe
> [  215.645240]  [<ffffffff81134ab5>] ? blkdev_aio_write+0x44/0x79
> [  215.645240]  [<ffffffff8110c02a>] ? do_sync_write+0x56/0x76
> [  215.645240]  [<ffffffff8110c33c>] ? vfs_write+0xa1/0xfb
> [  215.645240]  [<ffffffff8110ca08>] ? SyS_write+0x41/0x74
> [  215.645240]  [<ffffffff81637622>] ? system_call_fastpath+0x16/0x1b
> 
> but I don't see a flusher thread anywhere.
> 
> What the bisected change does is allow the effective dirty threshold
> to drop fairly low, because anonymous pages are no longer considered
> dirtyable, and your usecase has particularly low free + cache pages:
> 
> [  196.375988] active_anon:328150 inactive_anon:118571 isolated_anon:0
> [  196.375988]  active_file:1658 inactive_file:1823 isolated_file:0
> [  196.375988]  unevictable:867 dirty:616 writeback:0 unstable:0
> [  196.375988]  free:32320 slab_reclaimable:5129 slab_unreclaimable:5080
> [  196.375988]  mapped:2684 shmem:424844 pagetables:1528 bounce:0
> [  196.375988]  free_cma:0
> 
> Ignoring free pages due to dirty_balance_reserve, inactive+active file
> yields 3481 dirtyable pages, which sets the global limits to 174 pages
> background and 348 pages foreground with the default configuration.
> It's low, but not 0.
  OK, so we are over the dirty_limit.

> So why is the dirtier throttled to starvation when the background flusher
> is not even running?  Shouldn't they be looking at the same numbers and
> behave inversely?
  These days there isn't a background flusher thread but a workqueue which
handles the flushing work. But still you should see that in a process list
like "flush-$dev". Can you check whether balance_dirty_pages() properly
calls bdi_start_background_writeback() and whether wb_do_writeback() gets
to run (there are tracepoints in there)?  Also can you have a look in
/sys/kernel/debug/bdi/<dev>/stats? What is the estimated bandwith?

								Honza
-- 
Jan Kara <jack@xxxxxxx>
SUSE Labs, CR

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]