Laura Abbott wrote: > On 02/26/2018 06:28 AM, Michal Hocko wrote: > > On Fri 23-02-18 11:51:41, Laura Abbott wrote: > >> Hi, > >> > >> The Fedora arm-32 build VMs have a somewhat long standing problem > >> of hanging when running mkfs.ext4 with a bunch of processes stuck > >> in D state. This has been seen as far back as 4.13 but is still > >> present on 4.14: > >> > > [...] > >> This looks like everything is blocked on the writeback completing but > >> the writeback has been throttled. According to the infra team, this problem > >> is _not_ seen without LPAE (i.e. only 4G of RAM). I did see > >> https://patchwork.kernel.org/patch/10201593/ but that doesn't seem to > >> quite match since this seems to be completely stuck. Any suggestions to > >> narrow the problem down? > > > > How much dirtyable memory does the system have? We do allow only lowmem > > to be dirtyable by default on 32b highmem systems. Maybe you have the > > lowmem mostly consumed by the kernel memory. Have you tried to enable > > highmem_is_dirtyable? > > > > Setting highmem_is_dirtyable did fix the problem. The infrastructure > people seemed satisfied enough with this (and are happy to have the > machines back). That's good. > I'll see if they are willing to run a few more tests > to get some more state information. Well, I'm far from understanding what is happening in your case, but I'm interested in other threads which were trying to allocate memory. Therefore, I appreciate if they can take SysRq-m + SysRq-t than SysRq-w (as described at http://akari.osdn.jp/capturing-kernel-messages.html ). Code which assumes that kswapd can make progress can get stuck when kswapd is blocked somewhere. And wbt_wait() seems to change behavior based on current_is_kswapd(). If everyone is waiting for kswapd but kswapd cannot make progress, I worry that it leads to hangups like your case. Below is a totally different case which I got today, but an example of whether SysRq-m + SysRq-t can give us some clues. Running below program on CPU 0 (using "taskset -c 0") on 4.16-rc4 against XFS can trigger OOM lockups (hangup without being able to invoke the OOM killer). ---------- #include <stdio.h> #include <string.h> #include <stdlib.h> #include <unistd.h> #include <sys/types.h> #include <sys/stat.h> #include <fcntl.h> int main(int argc, char *argv[]) { static char buffer[4096] = { }; char *buf = NULL; unsigned long size; unsigned long i; for (i = 0; i < 1024; i++) { if (fork() == 0) { int fd; snprintf(buffer, sizeof(buffer), "/tmp/file.%u", getpid()); fd = open(buffer, O_WRONLY | O_CREAT | O_APPEND, 0600); memset(buffer, 0, sizeof(buffer)); sleep(1); while (write(fd, buffer, sizeof(buffer)) == sizeof(buffer)); _exit(0); } } for (size = 1048576; size < 512UL * (1 << 30); size <<= 1) { char *cp = realloc(buf, size); if (!cp) { size >>= 1; break; } buf = cp; } sleep(2); /* Will cause OOM due to overcommit */ for (i = 0; i < size; i += 4096) buf[i] = 0; return 0; } ---------- MM people love to ignore such kind of problem with "It is a DoS attack", but only one CPU out of 8 CPUs is occupied by this program, which means that other threads (including kernel threads doing memory reclaim activities) are free to use idle CPUs 1-7 as they need. Also, while CPU 0 was really busy processing hundreds of threads doing direct reclaim, idle CPUs 1-7 should be able to invoke the OOM killer shortly because there should be already little to reclaim. Also, writepending: did not decrease (and no disk I/O was observed) during the OOM lockup. Thus, I don't know whether this is just an overloaded. [ 660.035957] Node 0 Normal free:17056kB min:17320kB low:21648kB high:25976kB active_anon:570132kB inactive_anon:13452kB active_file:15136kB inactive_file:13296kB unevictable:0kB writepending:42320kB present:1048576kB managed:951188kB mlocked:0kB kernel_stack:22448kB pagetables:37304kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB [ 709.498421] Node 0 Normal free:16920kB min:17320kB low:21648kB high:25976kB active_anon:570132kB inactive_anon:13452kB active_file:19180kB inactive_file:17640kB unevictable:0kB writepending:42740kB present:1048576kB managed:951188kB mlocked:0kB kernel_stack:22400kB pagetables:37304kB bounce:0kB free_pcp:248kB local_pcp:0kB free_cma:0kB [ 751.290146] Node 0 Normal free:16920kB min:17320kB low:21648kB high:25976kB active_anon:570132kB inactive_anon:13452kB active_file:14556kB inactive_file:14452kB unevictable:0kB writepending:42740kB present:1048576kB managed:951188kB mlocked:0kB kernel_stack:22400kB pagetables:37304kB bounce:0kB free_pcp:248kB local_pcp:0kB free_cma:0kB [ 783.437211] Node 0 Normal free:16920kB min:17320kB low:21648kB high:25976kB active_anon:570132kB inactive_anon:13452kB active_file:14756kB inactive_file:13888kB unevictable:0kB writepending:42740kB present:1048576kB managed:951188kB mlocked:0kB kernel_stack:22304kB pagetables:37304kB bounce:0kB free_pcp:312kB local_pcp:32kB free_cma:0kB [ 1242.729271] Node 0 Normal free:16920kB min:17320kB low:21648kB high:25976kB active_anon:570132kB inactive_anon:13452kB active_file:14072kB inactive_file:14304kB unevictable:0kB writepending:42740kB present:1048576kB managed:951188kB mlocked:0kB kernel_stack:22128kB pagetables:37304kB bounce:0kB free_pcp:440kB local_pcp:48kB free_cma:0kB [ 1412.248884] Node 0 Normal free:16920kB min:17320kB low:21648kB high:25976kB active_anon:570132kB inactive_anon:13452kB active_file:14332kB inactive_file:14280kB unevictable:0kB writepending:42740kB present:1048576kB managed:951188kB mlocked:0kB kernel_stack:22128kB pagetables:37304kB bounce:0kB free_pcp:440kB local_pcp:48kB free_cma:0kB [ 1549.795514] Node 0 Normal free:16920kB min:17320kB low:21648kB high:25976kB active_anon:570132kB inactive_anon:13452kB active_file:14416kB inactive_file:14272kB unevictable:0kB writepending:42740kB present:1048576kB managed:951188kB mlocked:0kB kernel_stack:22128kB pagetables:37304kB bounce:0kB free_pcp:440kB local_pcp:48kB free_cma:0kB Complete log is http://I-love.SAKURA.ne.jp/tmp/serial-20180306.txt.xz . Config is http://I-love.SAKURA.ne.jp/tmp/config-4.16-rc4 . -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>