On 07/02/2019 11:44, Marc Gonzalez wrote:

> + linux-mm
>
> Summarizing the issue for linux-mm readers:
>
> If I read data from a storage device larger than my system's RAM, the
> system freezes once dd has read more data than there is available RAM.
>
> # dd if=/dev/sde of=/dev/null bs=1M & while true; do echo m > /proc/sysrq-trigger; echo; echo; sleep 1; done
>
> https://pastebin.ubuntu.com/p/HXzdqDZH4W/
>
> A few seconds before the system hangs, Mem-Info shows:
>
> [   90.986784] Node 0 active_anon:7060kB inactive_anon:13644kB active_file:0kB inactive_file:3797500kB [...]
>
> => 3797500kB is basically all of RAM.
>
> I tried to locate where "inactive_file" was being incremented, and saw
> two call-stack signatures:
>
> [  255.606019] __mod_node_page_state | __pagevec_lru_add_fn | pagevec_lru_move_fn | __lru_cache_add | lru_cache_add | add_to_page_cache_lru | mpage_readpages | blkdev_readpages | read_pages | __do_page_cache_readahead | ondemand_readahead | page_cache_sync_readahead
>
> [  255.637238] __mod_node_page_state | __pagevec_lru_add_fn | pagevec_lru_move_fn | __lru_cache_add | lru_cache_add | lru_cache_add_active_or_unevictable | __handle_mm_fault | handle_mm_fault | do_page_fault | do_translation_fault | do_mem_abort | el1_da
>
> Are these expected?
>
> NB: the system does not hang if I pass 'iflag=direct' to dd.
>
> According to the RCU watchdog:
>
> [  108.466240] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
> [  108.466420] rcu:     1-...0: (130 ticks this GP) idle=79e/1/0x4000000000000000 softirq=2393/2523 fqs=2626
> [  108.471436] rcu:     (detected by 4, t=5252 jiffies, g=133, q=85)
> [  108.480605] Task dump for CPU 1:
> [  108.486483] kworker/1:1H    R  running task        0   680      2 0x0000002a
> [  108.489977] Workqueue: kblockd blk_mq_run_work_fn
> [  108.496908] Call trace:
> [  108.501513]  __switch_to+0x174/0x1e0
> [  108.503757]  blk_mq_run_work_fn+0x28/0x40
> [  108.507589]  process_one_work+0x208/0x480
> [  108.511486]  worker_thread+0x48/0x460
> [  108.515480]  kthread+0x124/0x130
> [  108.519123]  ret_from_fork+0x10/0x1c
>
> Can anyone shed some light on what's going on?

I saw a slightly different report from another test run:
https://pastebin.ubuntu.com/p/jCywbKgRCq/

[  340.689764] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[  340.689992] rcu:     1-...0: (8548 ticks this GP) idle=c6e/1/0x4000000000000000 softirq=82/82 fqs=6
[  340.694977] rcu:     (detected by 5, t=5430 jiffies, g=-719, q=16)
[  340.703803] Task dump for CPU 1:
[  340.709507] dd              R  running task        0   675    673 0x00000002
[  340.713018] Call trace:
[  340.720059]  __switch_to+0x174/0x1e0
[  340.722192]  0xffffffc0f6dc9600

[  352.689742] BUG: workqueue lockup - pool cpus=1 node=0 flags=0x0 nice=0 stuck for 33s!
[  352.689910] Showing busy workqueues and worker pools:
[  352.696743] workqueue mm_percpu_wq: flags=0x8
[  352.701753]   pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=1/256
[  352.706099]     pending: vmstat_update

[  384.693730] BUG: workqueue lockup - pool cpus=1 node=0 flags=0x0 nice=0 stuck for 65s!
[  384.693815] Showing busy workqueues and worker pools:
[  384.700577] workqueue events: flags=0x0
[  384.705699]   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256
[  384.709351]     pending: vmstat_shepherd
[  384.715587] workqueue mm_percpu_wq: flags=0x8
[  384.719495]   pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=1/256
[  384.723754]     pending: vmstat_update
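
A few notes that may help anyone trying to reproduce or narrow this down.

One way to capture stacks like the two signatures quoted above, without
patching the kernel, is a kprobe event with a stacktrace trigger. A minimal
sketch (run as root), assuming a kernel with CONFIG_KPROBE_EVENTS and tracefs
mounted at /sys/kernel/tracing; the probe name 'nodestat' is arbitrary:

  cd /sys/kernel/tracing
  # create a probe firing on every call to __mod_node_page_state
  echo 'p:nodestat __mod_node_page_state' > kprobe_events
  # dump a stack into the trace buffer each time the probe fires
  echo stacktrace > events/kprobes/nodestat/trigger
  echo 1 > events/kprobes/nodestat/enable
  cat trace_pipe

Beware that __mod_node_page_state fires constantly, so trace_pipe floods
quickly; in practice you would also fetch the 'item' argument in the probe
definition and filter on the running kernel's NR_INACTIVE_FILE value to
isolate the inactive_file updates.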
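
Rather than scraping sysrq-m output, the counters in question can also be
polled from /proc/vmstat (values are in pages, not kB). With a buffered read,
nr_inactive_file should climb steadily while nr_free_pages drops; repeating
the read with iflag=direct should leave both roughly flat, since O_DIRECT
bypasses the page cache. The count=2048 below is just an arbitrary 2 GiB
sample so each run terminates before the hang:

  grep -E '^(nr_free_pages|nr_inactive_file) ' /proc/vmstat
  dd if=/dev/sde of=/dev/null bs=1M count=2048               # buffered: cache grows
  grep -E '^(nr_free_pages|nr_inactive_file) ' /proc/vmstat
  dd if=/dev/sde of=/dev/null bs=1M count=2048 iflag=direct  # O_DIRECT: counters flat
  grep -E '^(nr_free_pages|nr_inactive_file) ' /proc/vmstat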
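
Finally, a diagnostic (not a fix) that might separate the mm side from the
block side: if the lockup really is driven by page-cache growth, periodically
dropping the cache during the buffered read (the pages are clean, read-only
data, so they can be discarded) should delay or avoid the hang:

  dd if=/dev/sde of=/dev/null bs=1M &
  # drop clean page-cache pages every 5 seconds while dd runs
  while true; do echo 1 > /proc/sys/vm/drop_caches; sleep 5; done

If the system still locks up with the cache being dropped, the page-cache
angle is probably a red herring.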