Re: zone state overhead

On Fri, Oct 08, 2010 at 11:29:53PM +0800, Mel Gorman wrote:
> On Tue, Sep 28, 2010 at 01:08:01PM +0800, Shaohua Li wrote:
> > On a 4-socket, 64-CPU system, zone_nr_free_pages() takes about 5% to 10%
> > of CPU time according to perf when memory pressure is high. The workload
> > does something like:
> > for i in `seq 1 $nr_cpu`
> > do
> >         create_sparse_file $SPARSE_FILE-$i $((10 * mem / nr_cpu))
> >         $USEMEM -f $SPARSE_FILE-$i -j 4096 --readonly $((10 * mem / nr_cpu)) &
> > done
> > This simply reads a sparse file for each CPU. Apparently
> > zone->percpu_drift_mark is too big, and I guess zone_page_state_snapshot()
> > causes a lot of cache-line bouncing on ->vm_stat_diff[]. The zoneinfo is
> > below for reference.
> 
> Would it be possible for you to post the oprofile report? I'm in the
> early stages of trying to reproduce this locally based on your test
> description. The first machine I tried showed that zone_nr_free_pages
> was consuming 0.26% of profile time, with the vast bulk occupied by
> do_mpage_readpage. See below:
> 
> 1599339  53.3463  vmlinux-2.6.36-rc7-pcpudrift do_mpage_readpage
> 131713    4.3933  vmlinux-2.6.36-rc7-pcpudrift __isolate_lru_page
> 103958    3.4675  vmlinux-2.6.36-rc7-pcpudrift free_pcppages_bulk
> 85024     2.8360  vmlinux-2.6.36-rc7-pcpudrift __rmqueue
> 78697     2.6250  vmlinux-2.6.36-rc7-pcpudrift native_flush_tlb_others
> 75678     2.5243  vmlinux-2.6.36-rc7-pcpudrift unlock_page
> 68741     2.2929  vmlinux-2.6.36-rc7-pcpudrift get_page_from_freelist
> 56043     1.8693  vmlinux-2.6.36-rc7-pcpudrift __alloc_pages_nodemask
> 55863     1.8633  vmlinux-2.6.36-rc7-pcpudrift ____pagevec_lru_add
> 46044     1.5358  vmlinux-2.6.36-rc7-pcpudrift radix_tree_delete
> 44543     1.4857  vmlinux-2.6.36-rc7-pcpudrift shrink_page_list
> 33636     1.1219  vmlinux-2.6.36-rc7-pcpudrift zone_watermark_ok
> .....
> 7855      0.2620  vmlinux-2.6.36-rc7-pcpudrift zone_nr_free_pages
> 
> The machine I am testing on is a non-NUMA, 4-core, single-socket box with
> totally different characteristics, but I want to be sure I'm going in more
> or less the right direction with the reproduction case before trying to
> find a larger machine.
Here it is. This is a 4-socket Nehalem machine.
           268160.00 57.2% _raw_spin_lock                      /lib/modules/2.6.36-rc5-shli+/build/vmlinux
            40302.00  8.6% zone_nr_free_pages                  /lib/modules/2.6.36-rc5-shli+/build/vmlinux
            36827.00  7.9% do_mpage_readpage                   /lib/modules/2.6.36-rc5-shli+/build/vmlinux
            28011.00  6.0% _raw_spin_lock_irq                  /lib/modules/2.6.36-rc5-shli+/build/vmlinux
            22973.00  4.9% flush_tlb_others_ipi                /lib/modules/2.6.36-rc5-shli+/build/vmlinux
            10713.00  2.3% smp_invalidate_interrupt            /lib/modules/2.6.36-rc5-shli+/build/vmlinux
             7342.00  1.6% find_next_bit                       /lib/modules/2.6.36-rc5-shli+/build/vmlinux
             4571.00  1.0% try_to_unmap_one                    /lib/modules/2.6.36-rc5-shli+/build/vmlinux
             4094.00  0.9% default_send_IPI_mask_sequence_phys /lib/modules/2.6.36-rc5-shli+/build/vmlinux
             3497.00  0.7% get_page_from_freelist              /lib/modules/2.6.36-rc5-shli+/build/vmlinux
             3032.00  0.6% _raw_spin_lock_irqsave              /lib/modules/2.6.36-rc5-shli+/build/vmlinux
             3029.00  0.6% shrink_page_list                    /lib/modules/2.6.36-rc5-shli+/build/vmlinux
             2318.00  0.5% __inc_zone_state                    /lib/modules/2.6.36-rc5-shli+/build/vmlinux
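
To illustrate the suspected cost discussed above, here is a minimal userspace
sketch, not the kernel code. It models a zone counter with per-CPU deltas: the
cheap read looks only at the global value, while the snapshot read walks every
CPU's delta and so touches one cache line per CPU. All names (model_zone,
model_pcp, MODEL_NR_CPUS, the model_* functions) are made up for illustration,
and the real zone_nr_free_pages() has further conditions that this model omits.

```c
#include <assert.h>

#define MODEL_NR_CPUS   64	/* hypothetical; matches the 64-CPU box */
#define MODEL_CACHELINE 64	/* assumed cache line size in bytes */

/* One per-CPU delta padded to its own cache line; stands in for ->vm_stat_diff[]. */
struct model_pcp {
	signed char diff;
	char pad[MODEL_CACHELINE - 1];
};

struct model_zone {
	long global_free;	/* stands in for the global NR_FREE_PAGES counter */
	long drift_mark;	/* stands in for zone->percpu_drift_mark */
	struct model_pcp pcp[MODEL_NR_CPUS];
};

/* A CPU updates only its own delta; the global counter is not touched here. */
static void model_mod_state(struct model_zone *z, int cpu, signed char delta)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	z->pcp[cpu].diff += delta;
}

/* Cheap read: global counter only, may be off by the unfolded deltas. */
static long model_page_state(const struct model_zone *z)
{
	return z->global_free;
}

/* Expensive read: sums every CPU's delta, one remote cache line each. */
static long model_page_state_snapshot(const struct model_zone *z)
{
	long x = z->global_free;

	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		x += z->pcp[cpu].diff;

	return x < 0 ? 0 : x;
}

/* Falls back to the expensive read whenever the cheap value is below the mark. */
static long model_nr_free_pages(const struct model_zone *z)
{
	long nr = model_page_state(z);

	if (nr < z->drift_mark)
		return model_page_state_snapshot(z);

	return nr;
}
```

In this model, a large drift_mark means nearly every call under memory pressure
takes the snapshot path, and each call reads deltas that other CPUs keep
writing, which would be consistent with the bouncing guessed at above.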
 
