Hello.

I noticed that printing to consoles can stall forever, because console drivers can wait for memory allocation while memory allocation cannot make progress.

I was testing an almost-OOM situation and reproduced a case where allocation requests fall into the infinite too_many_isolated() trap in shrink_inactive_list(). Since nothing will be printed to consoles when we hit that situation, I tried to confirm that I had actually reproduced it using SysRq-t. However, SysRq-t printed nothing for some reason. Therefore, I used SysRq-c at uptime = 910. The complete log is at http://I-love.SAKURA.ne.jp/tmp/serial-20170706.txt.xz :

------------------------------------------------------------
[ 779.494100] Mem-Info:
[ 779.495493] active_anon:846948 inactive_anon:5416 isolated_anon:0
[ 779.495493]  active_file:33 inactive_file:13 isolated_file:47
[ 779.495493]  unevictable:0 dirty:11 writeback:0 unstable:0
[ 779.495493]  slab_reclaimable:0 slab_unreclaimable:11
[ 779.495493]  mapped:3580 shmem:5581 pagetables:7491 bounce:0
[ 779.495493]  free:21354 free_pcp:87 free_cma:0
[ 779.510664] Node 0 active_anon:3387792kB inactive_anon:21664kB active_file:132kB inactive_file:52kB unevictable:0kB isolated(anon):0kB isolated(file):188kB mapped:14320kB dirty:44kB writeback:0kB shmem:22324kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 3067904kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
[ 779.522778] Node 0 DMA free:14836kB min:284kB low:352kB high:420kB active_anon:1024kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15988kB managed:15904kB mlocked:0kB kernel_stack:0kB pagetables:8kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[ 779.533972] lowmem_reserve[]: 0 2700 3642 3642
[ 779.536321] Node 0 DMA32 free:53656kB min:49888kB low:62360kB high:74832kB active_anon:2710360kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:3129216kB managed:2765492kB mlocked:0kB kernel_stack:0kB pagetables:12kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[ 779.548127] lowmem_reserve[]: 0 0 942 942
[ 779.550326] Node 0 Normal free:16924kB min:17404kB low:21752kB high:26100kB active_anon:676260kB inactive_anon:21664kB active_file:340kB inactive_file:244kB unevictable:0kB writepending:44kB present:1048576kB managed:964816kB mlocked:0kB kernel_stack:13600kB pagetables:29944kB bounce:0kB free_pcp:348kB local_pcp:0kB free_cma:0kB
[ 779.563405] lowmem_reserve[]: 0 0 0 0
[ 779.565537] Node 0 DMA: 1*4kB (U) 0*8kB 1*16kB (U) 1*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 2*1024kB (UM) 0*2048kB 3*4096kB (M) = 14836kB
[ 779.571291] Node 0 DMA32: 6*4kB (UM) 8*8kB (UM) 8*16kB (UM) 20*32kB (U) 29*64kB (UM) 32*128kB (UM) 13*256kB (UM) 9*512kB (UM) 10*1024kB (UM) 0*2048kB 7*4096kB (UME) = 53656kB
[ 779.578672] Node 0 Normal: 734*4kB (UE) 478*8kB (UME) 266*16kB (UE) 163*32kB (UE) 7*64kB (U) 3*128kB (UM) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 17064kB
[ 779.585618] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[ 779.589721] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[ 779.593765] 5660 total pagecache pages
[ 779.596059] 0 pages in swap cache
[ 779.598144] Swap cache stats: add 0, delete 0, find 0/0
[ 779.600947] Free swap = 0kB
[ 779.602837] Total swap = 0kB
[ 779.604716] 1048445 pages RAM
[ 779.606691] 0 pages HighMem/MovableOnly
[ 779.608974] 111892 pages reserved
[ 779.611075] 0 pages cma reserved
[ 779.613118] 0 pages hwpoisoned
[ 779.615117] Out of memory: Kill process 3655 (c.out) score 999 or sacrifice child
[ 779.618790] Killed process 3655 (c.out) total-vm:4168kB, anon-rss:88kB, file-rss:0kB, shmem-rss:0kB
[ 779.625401] oom_reaper: reaped process 3655 (c.out), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
** 1838 printk messages dropped ** [ 912.886103] __do_fault+0x19/0xf0
[ 912.886108] __handle_mm_fault+0xb0b/0x1030
[ 912.886114] handle_mm_fault+0xf4/0x220
[ 912.886117] __do_page_fault+0x25b/0x4a0
[ 912.886120] do_page_fault+0x30/0x80
[ 912.886124] ? do_syscall_64+0xfd/0x140
[ 912.886127] page_fault+0x28/0x30
(...snipped...)
[ 912.892027] kworker/3:3 D12120 3217 2 0x00000080
[ 912.892041] Workqueue: events console_callback
[ 912.892042] Call Trace:
[ 912.892047] __schedule+0x23f/0x5d0
[ 912.892051] schedule+0x31/0x80
[ 912.892056] schedule_preempt_disabled+0x9/0x10
[ 912.892061] __mutex_lock.isra.2+0x2ac/0x4d0
[ 912.892068] __mutex_lock_slowpath+0xe/0x10
[ 912.892072] ? __mutex_lock_slowpath+0xe/0x10
[ 912.892077] mutex_lock+0x2a/0x30
[ 912.892105] vmw_fb_pan_display+0x35/0x90 [vmwgfx]
[ 912.892114] fb_pan_display+0xca/0x160
[ 912.892118] bit_update_start+0x1b/0x40
[ 912.892123] fbcon_switch+0x4a6/0x630
[ 912.892128] redraw_screen+0x15a/0x240
[ 912.892132] ? update_attr.isra.3+0x90/0x90
[ 912.892139] complete_change_console+0x3d/0xd0
[ 912.892143] change_console+0x57/0x90
[ 912.892147] console_callback+0x116/0x190
[ 912.892153] process_one_work+0x1f5/0x390
[ 912.892156] worker_thread+0x46/0x410
[ 912.892161] ? __schedule+0x247/0x5d0
[ 912.892165] kthread+0xff/0x140
[ 912.892170] ? process_one_work+0x390/0x390
[ 912.892174] ? kthread_create_on_node+0x60/0x60
[ 912.892178] ? do_syscall_64+0x13a/0x140
[ 912.892181] ret_from_fork+0x25/0x30
(...snipped...)
[ 912.934633] kworker/0:0 D12824 4263 2 0x00000080
[ 912.934643] Workqueue: events vmw_fb_dirty_flush [vmwgfx]
[ 912.934643] Call Trace:
[ 912.934645] __schedule+0x23f/0x5d0
[ 912.934646] schedule+0x31/0x80
[ 912.934647] schedule_timeout+0x189/0x290
[ 912.934649] ? del_timer_sync+0x40/0x40
[ 912.934650] io_schedule_timeout+0x19/0x40
[ 912.934651] ? io_schedule_timeout+0x19/0x40
[ 912.934653] congestion_wait+0x7d/0xd0
[ 912.934654] ? wait_woken+0x80/0x80
[ 912.934654] shrink_inactive_list+0x3e3/0x4d0
[ 912.934656] shrink_node_memcg+0x360/0x780
[ 912.934657] ? list_lru_count_one+0x65/0x70
[ 912.934658] shrink_node+0xdc/0x310
[ 912.934658] ? shrink_node+0xdc/0x310
[ 912.934659] do_try_to_free_pages+0xea/0x370
[ 912.934660] try_to_free_pages+0xc3/0x100
[ 912.934661] __alloc_pages_slowpath+0x441/0xd50
[ 912.934663] ? ___slab_alloc+0x1b6/0x590
[ 912.934664] __alloc_pages_nodemask+0x20c/0x250
[ 912.934665] alloc_pages_current+0x65/0xd0
[ 912.934666] new_slab+0x472/0x600
[ 912.934668] ___slab_alloc+0x41b/0x590
[ 912.934685] ? drm_modeset_lock_all+0x1b/0xa0 [drm]
[ 912.934691] ? drm_modeset_lock_all+0x1b/0xa0 [drm]
[ 912.934692] __slab_alloc+0x1b/0x30
[ 912.934693] ? __slab_alloc+0x1b/0x30
[ 912.934694] kmem_cache_alloc+0x16b/0x1c0
[ 912.934699] drm_modeset_lock_all+0x1b/0xa0 [drm]
[ 912.934702] vmw_framebuffer_dmabuf_dirty+0x47/0x1d0 [vmwgfx]
[ 912.934706] vmw_fb_dirty_flush+0x229/0x270 [vmwgfx]
[ 912.934708] process_one_work+0x1f5/0x390
[ 912.934709] worker_thread+0x46/0x410
[ 912.934710] ? __schedule+0x247/0x5d0
[ 912.934711] kthread+0xff/0x140
[ 912.934712] ? process_one_work+0x390/0x390
[ 912.934713] ? kthread_create_on_node+0x60/0x60
[ 912.934714] ret_from_fork+0x25/0x30
(...snipped...)
[ 912.934765] Showing busy workqueues and worker pools:
[ 912.934785] workqueue events: flags=0x0
[ 912.934798] pwq 6: cpus=3 node=0 flags=0x0 nice=0 active=2/256
[ 912.934801] in-flight: 3217:console_callback{79553} console_callback{79553}
[ 912.934806] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=2/256
[ 912.934807] in-flight: 4263:vmw_fb_dirty_flush [vmwgfx]{130989} vmw_fb_dirty_flush [vmwgfx]{130989}
[ 912.934849] workqueue events_freezable_power_: flags=0x84
[ 912.934865] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256
[ 912.934866] in-flight: 4264:disk_events_workfn{131975}
[ 912.934906] workqueue writeback: flags=0x4e
[ 912.934907] pwq 128: cpus=0-63 flags=0x4 nice=0 active=2/256
[ 912.934909] in-flight: 357:wb_workfn{130154} wb_workfn{130154}
[ 912.935347] pool 0: cpus=0 node=0 flags=0x0 nice=0 hung=0s workers=5 idle: 33 3214 237
[ 912.935351] pool 6: cpus=3 node=0 flags=0x0 nice=0 hung=0s workers=4 idle: 29 95 3215
[ 912.935364] pool 128: cpus=0-63 flags=0x4 nice=0 hung=130s workers=3 idle: 358 356
[ 925.093875] sysrq: SysRq : Trigger a crash
[ 925.093903] BUG: unable to handle kernel NULL pointer dereference at (null)
------------------------------------------------------------

Pressing SysRq-c caused all locks to be released (didn't it?), and the console resumed printing the pending output. The {value} after each work item is the delay in jiffies since that work was queued (as suggested at http://lkml.kernel.org/r/201706291957.JGH39511.tQMOFSLOFJVHOF@xxxxxxxxxxxxxxxxxxx ).

Since disk_events_workfn had been stalling for 131975 jiffies (CONFIG_HZ=1000) as of uptime = 912, disk_events_workfn started stalling at uptime = 781 (which was 2 seconds after the OOM killer/reaper reclaimed some memory for the last time). Since vmw_fb_dirty_flush had been stalling for 130989 jiffies, vmw_fb_dirty_flush started stalling at uptime = 782.

And drm_modeset_lock_all(), called from the vmw_fb_dirty_flush work, started a GFP_KERNEL memory allocation

----------
void drm_modeset_lock_all(struct drm_device *dev)
{
	struct drm_mode_config *config = &dev->mode_config;
	struct drm_modeset_acquire_ctx *ctx;
	int ret;

	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
	if (WARN_ON(!ctx))
		return;
(...snipped...)
}
----------

with the par->bo_mutex mutex held at vmw_fb_dirty_flush().

----------
static void vmw_fb_dirty_flush(struct work_struct *work)
{
	struct vmw_fb_par *par = container_of(work, struct vmw_fb_par,
					      local_work.work);
	struct vmw_private *vmw_priv = par->vmw_priv;
	struct fb_info *info = vmw_priv->fb_info;
	unsigned long irq_flags;
	s32 dst_x1, dst_x2, dst_y1, dst_y2, w, h;
	u32 cpp, max_x, max_y;
	struct drm_clip_rect clip;
	struct drm_framebuffer *cur_fb;
	u8 *src_ptr, *dst_ptr;

	if (vmw_priv->suspended)
		return;

	mutex_lock(&par->bo_mutex);
(...snipped...)
	WARN_ON_ONCE(par->set_fb->funcs->dirty(cur_fb, NULL, 0, 0,
					       &clip, 1));
(...snipped...)
	mutex_unlock(&par->bo_mutex);
}
----------
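That allocation entered direct reclaim and, as the kworker/0:0 trace above shows, is sleeping in congestion_wait() called from shrink_inactive_list(). For reference, a simplified sketch of the too_many_isolated() trap in mm/vmscan.c (details vary by kernel version; note that a kworker is a kernel thread and never has a fatal signal pending, so the only exit from this loop is never taken):

----------
static noinline_for_stack unsigned long
shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
		     struct scan_control *sc, enum lru_list lru)
{
(...snipped...)
	/* Sleep and retry indefinitely while too many pages are isolated. */
	while (unlikely(too_many_isolated(pgdat, file, sc))) {
		congestion_wait(BLK_RW_ASYNC, HZ/10);

		/* We are about to die and free our memory. Return now. */
		if (fatal_signal_pending(current))
			return SWAP_CLUSTER_MAX;
	}
(...snipped...)
}
----------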
Since console_callback had been stalling for 79553 jiffies, console_callback started stalling at uptime = 833. (Thus, maybe there were OOM killer activities a few more times, but if so, they are hidden in the "** 1838 printk messages dropped **" message.)

vmw_fb_pan_display(), called from fb_pan_display() from the console_callback work, was waiting for the par->bo_mutex mutex held at vmw_fb_dirty_flush(), which cannot make progress due to the infinite too_many_isolated() trap.

----------
static int vmw_fb_pan_display(struct fb_var_screeninfo *var,
			      struct fb_info *info)
{
	struct vmw_fb_par *par = info->par;

	if ((var->xoffset + var->xres) > var->xres_virtual ||
	    (var->yoffset + var->yres) > var->yres_virtual) {
		DRM_ERROR("Requested panning can not fit in framebuffer\n");
		return -EINVAL;
	}

	mutex_lock(&par->bo_mutex);
	par->fb_x = var->xoffset;
	par->fb_y = var->yoffset;
	if (par->set_fb)
		vmw_fb_dirty_mark(par, par->fb_x, par->fb_y,
				  par->set_fb->width, par->set_fb->height);
	mutex_unlock(&par->bo_mutex);

	return 0;
}
----------

As a result, the console was not able to print the SysRq-t output.

So, how should we avoid this problem? Of course, this problem would not have occurred in the first place if we did not fall into the infinite too_many_isolated() trap in shrink_inactive_list() (as proposed at http://lkml.kernel.org/r/20170307133057.26182-1-mhocko@xxxxxxxxxx ). But should fbcon, drm, tty and so on stop using __GFP_DIRECT_RECLAIM memory allocations, because consoles should be as responsive as printk()?
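For illustration only, an untested sketch of what dropping __GFP_DIRECT_RECLAIM would look like in drm_modeset_lock_all(). GFP_KERNEL minus __GFP_DIRECT_RECLAIM still allows kswapd to reclaim in the background, but the allocation can now fail quickly instead of looping in reclaim while holding locks the console path needs:

----------
void drm_modeset_lock_all(struct drm_device *dev)
{
	struct drm_mode_config *config = &dev->mode_config;
	struct drm_modeset_acquire_ctx *ctx;
	int ret;

	/* Hypothetical: do not enter direct reclaim from this path. */
	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL & ~__GFP_DIRECT_RECLAIM);
	if (WARN_ON(!ctx))
		return;	/* Failure becomes likely under memory pressure. */
(...snipped...)
}
----------

Whether every allocation on the console output path can afford to fail like this is exactly the question above.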