Re: sandy bridge kswapd0 livelock with pagecache

Shaohua Li <shaohua.li@xxxxxxxxx> · Fri, 24 Jun 2011 14:33:04 +0800

On Tue, 2011-06-21 at 22:23 +0800, Pádraig Brady wrote:
> On 21/06/11 14:07, Mel Gorman wrote:
> > On Tue, Jun 21, 2011 at 12:59:00PM +0100, Pádraig Brady wrote:
> >> On 21/06/11 12:34, Mel Gorman wrote:
> >>> On Tue, Jun 21, 2011 at 11:47:35AM +0100, Pádraig Brady wrote:
> >>>> On 21/06/11 11:39, Mel Gorman wrote:
> >>>>> On Tue, Jun 21, 2011 at 10:53:02AM +0100, Pádraig Brady wrote:
> >>>>>> I tried the 2 patches here to no avail:
> >>>>>> http://marc.info/?l=linux-mm&m=130503811704830&w=2
> >>>>>>
> >>>>>> I originally logged this at:
> >>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=712019
> >>>>>>
> >>>>>> I can compile up and quickly test any suggestions.
> >>>>>>
> >>>>>
> >>>>> I recently looked through what kswapd does and there are a number
> >>>>> of problem areas. Unfortunately, I haven't gotten around to doing
> >>>>> anything about it yet or running the test cases to see if they are
> >>>>> really problems. In your case, the following is a strong possibility
> >>>>> though. This should be applied on top of the two patches merged from
> >>>>> that thread.
> >>>>>
> >>>>> This is not tested in any way, based on 3.0-rc3
> >>>>
> >>>> This does not fix the issue here.
> >>>>
> >>>
> >>> I made a silly mistake here.  When you mentioned two patches applied,
> >>> I assumed you meant two patches that were finally merged from that
> >>> discussion thread instead of looking at your linked mail. Now that I
> >>> have checked, I think you applied the SLUB patches while the patches
> >>> I was thinking of are:
> >>>
> >>> [afc7e326: mm: vmscan: correct use of pgdat_balanced in sleeping_prematurely]
> >>> [f06590bd: mm: vmscan: correctly check if reclaimer should schedule during shrink_slab]
> >>>
> >>> The first one in particular has been reported by another user to fix
> >>> hangs related to copying large files. I'm assuming you are testing
> >>> against the Fedora kernel. As these patches were merged for 3.0-rc1, can
> >>> you check if applying just these two patches to your kernel helps?
> >>
> >> These patches are already present in my 2.6.38.8-32.fc15.x86_64 kernel :(
> >>
> > 
> > Would it be possible to record a profile while it is livelocked to check
> > if it's stuck in this loop in shrink_slab()?
> 
> I did:
> 
> perf record -a -g sleep 10
> perf report --stdio > livelock.perf #attached
> perf annotate shrink_slab -k rpmbuild/BUILD/kernel-2.6.38.fc15/linux-2.6.38.x86_64/vmlinux > shrink_slab.annotate #attached
> 
> > 
> >                 while (total_scan >= SHRINK_BATCH) {
> >                         long this_scan = SHRINK_BATCH;
> >                         int shrink_ret;
> >                         int nr_before;
> > 
> >                         nr_before = do_shrinker_shrink(shrinker, shrink, 0);
> >                         shrink_ret = do_shrinker_shrink(shrinker, shrink,
> >                                                         this_scan);
> >                         if (shrink_ret == -1)
> >                                 break;
> >                         if (shrink_ret < nr_before)
> >                                 ret += nr_before - shrink_ret;
> >                         count_vm_events(SLABS_SCANNED, this_scan);
> >                         total_scan -= this_scan;
> > 
> >                         cond_resched();
> >                 }
> 
> shrink_slab() looks to be the culprit, but it seems
> to be the loop outside the above that is spinning.
> 
> > Also, can you post the output of sysrq+m at a few different times while
> > kswapd is spinning heavily? I want to see if all_unreclaimable has been
> > set on zones with a reasonable amount of memory. If they are, it's
> > possible for kswapd to be in a continual loop calling shrink_slab() and
> > skipping over normal page reclaim because all_unreclaimable is set
> > everywhere until a page is freed.
> 
> I did that 3 times. Attached.

From the perf log:
    62.70%          kswapd0  [i915]                              [k] i915_gem_object_bind_to_gtt
                    |
                    --- i915_gem_object_bind_to_gtt
                       |          
                       |--99.98%-- shrink_slab
                       |          kswapd

This may be a graphics driver bug. shrink_slab() is trying to free memory,
but IIRC i915_gem_object_bind_to_gtt() can itself allocate memory.
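
For illustration only, here is a minimal userspace sketch in C of the
batch loop quoted earlier in this thread. It is a toy model, not kernel
code: toy_shrink_slab(), toy_shrinker_ok() and toy_shrinker_stuck() are
hypothetical names, the callback signature is simplified, and the "stuck"
callback is an assumption about what a shrinker that cannot make progress
might look like, not a description of the i915 shrinker. It shows that
when the callback never lowers its object count, the loop consumes
total_scan and reports nothing freed, so a caller that retries until
something is freed would keep calling it.

```c
/* Userspace toy model of the shrink_slab() batch loop. NOT kernel code. */
#include <assert.h>

#define SHRINK_BATCH 128

/*
 * Hypothetical, simplified shrinker callback: nr_to_scan == 0 only
 * reports the object count; otherwise it scans and returns what is left.
 */
typedef long (*toy_shrinker_fn)(long *nr_objects, long nr_to_scan);

/* A well-behaved shrinker: frees up to nr_to_scan objects per call. */
static long toy_shrinker_ok(long *nr_objects, long nr_to_scan)
{
    if (nr_to_scan > 0) {
        long freed = nr_to_scan < *nr_objects ? nr_to_scan : *nr_objects;
        *nr_objects -= freed;
    }
    return *nr_objects;
}

/* Assumed failure mode: the shrinker does work but never frees anything. */
static long toy_shrinker_stuck(long *nr_objects, long nr_to_scan)
{
    (void)nr_to_scan;
    return *nr_objects;
}

/*
 * Same control flow as the quoted loop, minus cond_resched() and the
 * vm event accounting. Returns the number of objects freed.
 */
static long toy_shrink_slab(toy_shrinker_fn fn, long *nr_objects,
                            long total_scan)
{
    long ret = 0;

    assert(total_scan >= 0);
    while (total_scan >= SHRINK_BATCH) {
        long this_scan = SHRINK_BATCH;
        long nr_before = fn(nr_objects, 0);
        long shrink_ret = fn(nr_objects, this_scan);

        if (shrink_ret == -1)
            break;
        if (shrink_ret < nr_before)
            ret += nr_before - shrink_ret;
        total_scan -= this_scan;
    }
    return ret;
}
```

With 1000 objects and a total_scan of 512, the well-behaved callback
frees 512 objects in four batches, while the stuck one frees none and
leaves the count at 1000.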

Thanks,
Shaohua
