Re: 3.0rc2 oops in mem_cgroup_from_task

On Thu, 9 Jun 2011 18:30:49 -0700 (PDT)
Hugh Dickins <hughd@xxxxxxxxxx> wrote:

> On Fri, 10 Jun 2011, KAMEZAWA Hiroyuki wrote:
> > On Thu, 9 Jun 2011 16:42:09 -0700
> > Ying Han <yinghan@xxxxxxxxxx> wrote:
> > 
> > > ++cc Hugh who might have seen similar crashes on his machine.
> 
> Yes, I was testing my tmpfs changes, and saw it on i386 yesterday
> morning.  Same trace as Dave's (including khugepaged, which may or
> may not be relevant), aside from the i386/x86_64 differences.
> 
> BUG: unable to handle kernel paging request at 6b6b6b87
> 
> I needed to move forward with other work on that laptop, so just
> jotted down the details to come back to later.  It came after one
> hour of building swapping load in memcg, I've not tried again since.
> 
> > 
> > Thank you for forwarding. Hmm. It seems the panic happens at khugepaged's 
> > page collapse_huge_page().
> 
> Yes, the inlining in my kernel was different,
> so collapse_huge_page() showed up in my backtrace.
> 
> > 
> > ==
> >         count_vm_event(THP_COLLAPSE_ALLOC);
> >         if (unlikely(mem_cgroup_newpage_charge(new_page, mm, GFP_KERNEL))) {
> > ==
> > It passes target mm to memcg and memcg gets a cgroup by
> > ==
> >  mem = mem_cgroup_from_task(rcu_dereference(mm->owner));
> > ==
> > Panic here means....mm->owner's task_subsys_state contains bad pointer ?
> 
> 781cc621 <mem_cgroup_from_task>:
> 781cc621:	55                   	push   %ebp
> 781cc622:	31 c0                	xor    %eax,%eax
> 781cc624:	89 e5                	mov    %esp,%ebp
> 781cc626:	8b 55 08             	mov    0x8(%ebp),%edx
> 781cc629:	85 d2                	test   %edx,%edx
> 781cc62b:	74 09                	je     781cc636 <mem_cgroup_from_task+0x15>
> 781cc62d:	8b 82 fc 08 00 00    	mov    0x8fc(%edx),%eax
> 781cc633:	8b 40 1c             	mov    0x1c(%eax),%eax   <==========
> 781cc636:	c9                   	leave  
> 781cc637:	c3                   	ret    
> 

So the access to task->cgroups->subsys[?] is what faults at 6b6b6b87...
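
A minimal sketch of how the fault address above can be decoded, assuming
the kernel was built with slab poisoning, so that freed objects are filled
with POISON_FREE (0x6b, from include/linux/poison.h). The faulting
instruction in the quoted disassembly is "mov 0x1c(%eax),%eax", so the
pointer held in %eax is the fault address minus 0x1c. The helper names are
hypothetical and this is user-space C, not kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Byte written over freed slab objects (POISON_FREE in the kernel). */
#define POISON_FREE_BYTE 0x6bu

/*
 * Recover the base pointer of a faulting "mov disp(%reg),..." access:
 * the CPU faulted on base + disp, so base = fault_addr - disp.
 */
static inline uint32_t base_of_fault(uint32_t fault_addr, uint32_t disp)
{
	assert(fault_addr >= disp);
	return fault_addr - disp;
}

/* True if all four bytes of a 32-bit value are the free-poison byte. */
static inline bool is_free_poison32(uint32_t v)
{
	return v == 0x01010101u * POISON_FREE_BYTE;
}
```

With the values from this report, base_of_fault(0x6b6b6b87, 0x1c) gives
0x6b6b6b6b, which is_free_poison32() accepts. If slab poisoning was in fact
enabled, that would suggest the value loaded by "mov 0x8fc(%edx),%eax"
(task->cgroups, going by the analysis above) was read from memory that had
already been freed.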

That means task->cgroups or task->cgroups->subsys contains a bad pointer.
In the khugepaged case, khugepaged grabs an mm_struct and memcg then
accesses (mm->owner)->cgroups->subsys.

So, from memcg's point of view, we need to doubt whether mm->owner is still
valid for this kind of task.
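
To make the chain of dereferences concrete, here is a simplified user-space
model of the lookup discussed above: mm->owner, then ->cgroups, then one slot
of ->subsys[]. Everything prefixed model_ or MODEL_ is hypothetical; the
real kernel structures have many more fields, the slot index depends on the
configuration, and the real lookup runs under RCU.

```c
#include <stddef.h>

#define MODEL_SUBSYS_COUNT  4	/* hypothetical, not the kernel's value */
#define MODEL_MEM_SUBSYS_ID 0	/* hypothetical slot for the memory subsystem */

struct model_css { int id; };
struct model_css_set { struct model_css *subsys[MODEL_SUBSYS_COUNT]; };
struct model_task { struct model_css_set *cgroups; };
struct model_mm { struct model_task *owner; };

/*
 * Same shape as the quoted disassembly: a NULL check on the task,
 * then two dependent loads (task->cgroups, then ->subsys[id]).
 */
static inline struct model_css *model_css_from_task(const struct model_task *p)
{
	if (!p)
		return NULL;
	return p->cgroups->subsys[MODEL_MEM_SUBSYS_ID];
}

/* The caller only has an mm, so the task is found through mm->owner. */
static inline struct model_css *model_css_from_mm(const struct model_mm *mm)
{
	return model_css_from_task(mm->owner);
}
```

Note that the model, like the disassembly, only guards against a NULL task.
A stale mm->owner that still points at a freed task passes that check and is
dereferenced anyway, which is the case to doubt here.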

Thank you for the input.

-Kame






