Mikhail Gavrilov wrote:
> On 12 March 2018 at 14:00, Kirill A. Shutemov <kirill@xxxxxxxxxxxxx> wrote:
> > On Sun, Mar 11, 2018 at 11:11:52PM +0500, Mikhail Gavrilov wrote:
> >> $ uname -a
> >> Linux localhost.localdomain 4.15.7-300.fc27.x86_64+debug #1 SMP Wed
> >> Feb 28 17:32:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
> >>
> >> How to reproduce:
> >> 1. Start a virtual machine.
> >> 2. Open https://oom.sy24.ru/ in Firefox, which helps trigger the OOM.
> >> Sorry, I can't attach the HTML page here because my message would be
> >> rejected for containing an HTML subpart.
> >>
> >> Actual result: the virtual machine hangs and can't even be forced off.
> >>
> >> Expected result: the virtual machine keeps working.
> >>
> >> [ 2335.903277] INFO: task CPU 0/KVM:7450 blocked for more than 120 seconds.
> >> [ 2335.903284]       Not tainted 4.15.7-300.fc27.x86_64+debug #1
> >> [ 2335.903287] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> >> disables this message.
> >> [ 2335.903291] CPU 0/KVM       D10648  7450      1 0x00000000
> >> [ 2335.903298] Call Trace:
> >> [ 2335.903308]  ? __schedule+0x2e9/0xbb0
> >> [ 2335.903318]  ? __lock_page+0xad/0x180
> >> [ 2335.903322]  schedule+0x2f/0x90
> >> [ 2335.903327]  io_schedule+0x12/0x40
> >> [ 2335.903331]  __lock_page+0xed/0x180
> >> [ 2335.903338]  ? page_cache_tree_insert+0x130/0x130
> >> [ 2335.903347]  deferred_split_scan+0x318/0x340
> >
> > I guess it's a bad idea to wait for the page to be unlocked in the reclaim path.
> > Could you check whether this makes a difference:
> >
> > diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> > index 87ab9b8f56b5..529cf36b7edb 100644
> > --- a/mm/huge_memory.c
> > +++ b/mm/huge_memory.c
> > @@ -2783,11 +2783,13 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
> >
> >  	list_for_each_safe(pos, next, &list) {
> >  		page = list_entry((void *)pos, struct page, mapping);
> > -		lock_page(page);
> > +		if (!trylock_page(page))
> > +			goto next;
> >  		/* split_huge_page() removes page from list on success */
> >  		if (!split_huge_page(page))
> >  			split++;
> >  		unlock_page(page);
> > +next:
> >  		put_page(page);
> >  	}
> >
>
> Kirill, thanks for paying attention to the problem.
> But your patch didn't help. The virtual machine still hung after the OOM.
> The new dmesg is attached.
>

Indeed, but the location of the hang seems to be different.
dmesg.txt was hanging at io_schedule() waiting for lock_page(), and
dmesg2.txt was hanging at down_write(&mm->mmap_sem)/down_read(&mm->mmap_sem).
But dmesg3.txt was not hanging at io_schedule() waiting for lock_page().

What activities are performed between lock_page() and unlock_page()?
Do those activities (directly or indirectly) depend on __GFP_DIRECT_RECLAIM
memory allocation requests (e.g. GFP_NOFS/GFP_NOIO)? If so, it would be
unsafe to call lock_page() unconditionally (i.e. without checking the GFP
context in which the shrinker function was called), wouldn't it?
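
For illustration only, here is an untested sketch (not a proposed patch) of
what checking the GFP context in the second loop of deferred_split_scan()
could look like. The may_block variable is invented for this example; the
idea is simply to gate the blocking lock_page() on the reclaim context
exposed through sc->gfp_mask, and to fall back to trylock_page() otherwise:

	/*
	 * Hypothetical sketch (untested): decide up front whether this
	 * reclaim context is allowed to sleep on a page lock.  If the
	 * allocation that triggered reclaim is GFP_NOFS/GFP_NOIO, the
	 * current lock holder might itself be waiting on such an
	 * allocation, so only try the lock opportunistically.
	 */
	bool may_block = (sc->gfp_mask & __GFP_FS) &&
			 (sc->gfp_mask & __GFP_IO);

	list_for_each_safe(pos, next, &list) {
		page = list_entry((void *)pos, struct page, mapping);
		if (may_block)
			lock_page(page);
		else if (!trylock_page(page))
			goto next;
		/* split_huge_page() removes page from list on success */
		if (!split_huge_page(page))
			split++;
		unlock_page(page);
next:
		put_page(page);
	}

Whether something like this would actually be enough depends on the answer
to the question above, i.e. on what split_huge_page() itself may block on
while the page is locked, and on how much splitting work the shrinker can
afford to skip under memory pressure.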