Re: OOM killer hung the whole system

On Mon, Jul 31, 2017 at 3:51 AM, Mulyadi Santosa <mulyadi.santosa@xxxxxxxxx> wrote:


On Tue, Jul 25, 2017 at 4:59 PM, Rock Lee <rockdotlee@xxxxxxxxx> wrote:
Hi, 

I was using a program to test the OOM killer, but the OOM killer hung my board. The test snippet is very simple: it just calls malloc() without free().

    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
            char *ptrtest = NULL;

            /* Allocate 4 KiB per iteration and never free it. */
            while (1) {
                    ptrtest = malloc(0x1000);
                    if (ptrtest == NULL)
                            printf("malloc failed.\n");
            }
            return 0;
    }

After I executed this program (a.out), my shell stopped responding. Only this message was shown:

[  180.138188] Out of memory: Kill process 2706 (qq) score 623 or sacrifice child
[  180.164230] Killed process 2706 (qq) total-vm:98156kB, anon-rss:96400kB, file-rss:4kB


I think the clue is above. Could you find out which process has pid 2706?

I bet that is the shell that runs as the parent of your a.out program, but feel free to check what it actually is.

Sorry, my bad, it was a spelling mistake. a.out is actually qq (pid 2706).
 

 
I thought it must be some problem with the OOM killer, and tried a lot of ways to get some information. I got this:

=======================================================
Process: a.out, cpu: 0 pid: 1463 start: 0xd5849b00
=====================================================
    Task name: a.out pid: 1463 cpu: 0
    state: 0x2 exit_state: 0x0 stack base: 0xd6190000
    Stack:
    [<c08394e4>] __schedule+0x2c8
    [<c018df88>] squashfs_cache_get+0x108
    [<c0191020>] squashfs_readpage_block+0x28
    [<c018f648>] squashfs_readpage+0x624
    [<c0090d4c>] __do_page_cache_readahead+0x228
    [<c00882f4>] filemap_fault+0x1d4
    [<c00a5f10>] __do_fault+0x34
    [<c00a8ab0>] do_read_fault+0x19c
    [<c00a9388>] handle_mm_fault+0x468
    [<c00164b0>] do_page_fault+0x11c
    [<c00084a4>] do_PrefetchAbort+0x34
    [<c0011c1c>] ret_from_exception+0x0

I am surprised, because a.out is in /tmp; there is no relation between a.out and squashfs (my rootfs). Could anyone give me a hint? Is this a problem with squashfs or with the OOM killer?

The stack trace above only shows what happened in kernel space, very likely when the process was about to be killed by the OOM killer. CMIIW.

Thus, the problem has little or nothing to do with the fact that it called squashfs-related functions.

There is a chance that squashfs functions will be called when the system is out of memory. It happens when a page of an mmapped library file is faulted in; filemap_fault() in the trace gave me the hint.
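
To see which files a process maps with execute permission (and so which files a code-page fault could read from), something like the following could be used. It is only a sketch: is_file_backed_exec() and print_exec_files() are hypothetical helper names, and it assumes the usual /proc/<pid>/maps line layout (address range, permissions, offset, device, inode, pathname).

```c
#include <stdio.h>

/*
 * Parse one line of /proc/<pid>/maps. Returns 1 and copies the pathname
 * into `path` if the mapping is executable and backed by a file, else 0.
 * Hypothetical helper, not a kernel or libc API.
 */
int is_file_backed_exec(const char *line, char *path, size_t pathsz)
{
        unsigned long start, end, offset, inode;
        char perms[5], dev[16], name[256];
        int n;

        name[0] = '\0';
        n = sscanf(line, "%lx-%lx %4s %lx %15s %lu %255[^\n]",
                   &start, &end, perms, &offset, dev, &inode, name);
        /* Anonymous mappings have no pathname; [stack], [heap] etc. do not start with '/'. */
        if (n < 7 || perms[2] != 'x' || name[0] != '/')
                return 0;
        snprintf(path, pathsz, "%s", name);
        return 1;
}

/* Print every file backing executable pages; returns the count, or -1 on error. */
int print_exec_files(const char *maps_path)
{
        char line[512], path[256];
        FILE *f = fopen(maps_path, "r");
        int count = 0;

        if (f == NULL)
                return -1;
        while (fgets(line, sizeof(line), f) != NULL) {
                if (is_file_backed_exec(line, path, sizeof(path))) {
                        printf("%s\n", path);
                        count++;
                }
        }
        fclose(f);
        return count;
}
```

Calling print_exec_files("/proc/self/maps") from the test program's main() should list the binary itself plus any shared libraries it is linked against. If those libraries live on the squashfs rootfs, that would be consistent with the squashfs_readpage() frames in the trace even though a.out itself is in /tmp.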
 



--
regards,

Mulyadi Santosa
Freelance Linux trainer and consultant

blog: the-hydra.blogspot.com
training: mulyaditraining.blogspot.com


_______________________________________________
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies




--
Cheers,
Rock