Re: How to profile 160 ms spent in `add_highpages_with_active_regions()`?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



[Removed non-working Pavel Tatashin <pasha.tatashin@xxxxxxxxxx>]

Dear Linux folks,


On 08/17/18 10:12, Paul Menzel wrote:

> With the merge of branch 'x86-timers-for-linus'
> of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip early time
> stamps are now printed.
> 
>> Early TSC based time stamping to allow better boot time analysis.
>>     This comes with a general cleanup of the TSC calibration code which
>> grew warts and duct taping over the years and removes 250 lines of
>> code. Initiated and mostly implemented by Pavel with help from various
>> folks
> 
> Looking at those early time stamps, in this case on an ASRock E350M1,
> there is a 160 ms delay in the code below.
> 
> Before:
> 
> ```
> [    0.000000] Initializing CPU#0
> [    0.000000] Initializing HighMem for node 0 (000373fe:000c7d3c)
> [    0.000000] Initializing Movable for node 0 (00000000:00000000)
> [    0.000000] Memory: 3225668K/3273580K available (8898K kernel code, 747K rwdata, 2808K rodata, 768K init, 628K bss, 47912K reserved, 0K cma-reserved, 2368760K highmem)
> ```
> 
> After:
> 
> ```
> [    0.063473] Initializing CPU#0
> [    0.063484] Initializing HighMem for node 0 (00036ffe:000c7d3c)
> [    0.229442] Initializing Movable for node 0 (00000000:00000000)
> [    0.236020] Memory: 3225728K/3273580K available (8966K kernel code, 750K rwdata, 2828K rodata, 776K init, 640K bss, 47852K reserved, 0K cma-reserved, 2372856K highmem)
> ```
> 
> The code in question is from `arch/x86/mm/highmem_32.c`.
> 
>> void __init set_highmem_pages_init(void)
>> {
>>         struct zone *zone;
>>         int nid;
>>
>>         /*
>>          * Explicitly reset zone->managed_pages because set_highmem_pages_init()
>>          * is invoked before free_all_bootmem()
>>          */
>>         reset_all_zones_managed_pages();
>>         for_each_zone(zone) {
>>                 unsigned long zone_start_pfn, zone_end_pfn;
>>
>>                 if (!is_highmem(zone))
>>                         continue;
>>
>>                 zone_start_pfn = zone->zone_start_pfn;
>>                 zone_end_pfn = zone_start_pfn + zone->spanned_pages;
>>
>>                 nid = zone_to_nid(zone);
>>                 printk(KERN_INFO "Initializing %s for node %d (%08lx:%08lx)\n",
>>                                 zone->name, nid, zone_start_pfn, zone_end_pfn);
>>
>>                 add_highpages_with_active_regions(nid, zone_start_pfn,
>>                                  zone_end_pfn);
>>         }
>> }
> 
> And in there, it seems to be the function below.
> 
>> void __init add_highpages_with_active_regions(int nid,
>>                          unsigned long start_pfn, unsigned long end_pfn)
>> {
>>         phys_addr_t start, end;
>>         u64 i;
>>
>>         for_each_free_mem_range(i, nid, MEMBLOCK_NONE, &start, &end, NULL) {
>>                 unsigned long pfn = clamp_t(unsigned long, PFN_UP(start),
>>                                             start_pfn, end_pfn);
>>                 unsigned long e_pfn = clamp_t(unsigned long, PFN_DOWN(end),
>>                                               start_pfn, end_pfn);
>>                 for ( ; pfn < e_pfn; pfn++)
>>                         if (pfn_valid(pfn))
>>                                 free_highmem_page(pfn_to_page(pfn));
> 
> Assuming the time stamps are correct, how can a profile that delay
> without adding print statements all over the place? Using ftrace
> doesn’t seem to work for me, probably because it’s that early.

I added print statements, and got the result below.

```
[    0.057109] Initializing CPU#0
[    0.057119] Initializing HighMem for node 0 (00036ffe:000c7d3c)
[    0.057122] add_highpages_with_active_regions: for_each_free_mem_range: i = 0
[    0.057124] add_highpages_with_active_regions: after for loop
[    0.057126] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[    0.057127] add_highpages_with_active_regions: after for loop
[    0.057128] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[    0.057129] add_highpages_with_active_regions: after for loop
[    0.057131] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[    0.057132] add_highpages_with_active_regions: after for loop
[    0.057133] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[    0.057134] add_highpages_with_active_regions: after for loop
[    0.057136] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[    0.057137] add_highpages_with_active_regions: after for loop
[    0.057138] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[    0.057140] add_highpages_with_active_regions: after for loop
[    0.057141] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[    0.057142] add_highpages_with_active_regions: after for loop
[    0.057143] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[    0.057145] add_highpages_with_active_regions: after for loop
[    0.057146] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[    0.057147] add_highpages_with_active_regions: after for loop
[    0.057149] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[    0.057150] add_highpages_with_active_regions: after for loop
[    0.057151] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[    0.057152] add_highpages_with_active_regions: after for loop
[    0.057154] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[    0.057155] add_highpages_with_active_regions: after for loop
[    0.057156] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[    0.057157] add_highpages_with_active_regions: after for loop
[    0.057159] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[    0.057160] add_highpages_with_active_regions: after for loop
[    0.057161] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[    0.057163] add_highpages_with_active_regions: after for loop
[    0.057164] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[    0.057165] add_highpages_with_active_regions: after for loop
[    0.057167] add_highpages_with_active_regions: for_each_free_mem_range: i = 2
[    0.222580] add_highpages_with_active_regions: after for loop
[    0.222583] Initializing Movable for node 0 (00000000:00000000)
[    0.222585] add_highpages_with_active_regions: for_each_free_mem_range: i = 0
[    0.222587] add_highpages_with_active_regions: after for loop
[    0.222588] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[    0.222589] add_highpages_with_active_regions: after for loop
[    0.222591] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[    0.222592] add_highpages_with_active_regions: after for loop
[    0.222593] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[    0.222595] add_highpages_with_active_regions: after for loop
[    0.222596] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[    0.222597] add_highpages_with_active_regions: after for loop
[    0.222599] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[    0.222600] add_highpages_with_active_regions: after for loop
[    0.222601] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[    0.222602] add_highpages_with_active_regions: after for loop
[    0.222604] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[    0.222605] add_highpages_with_active_regions: after for loop
[    0.222606] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[    0.222607] add_highpages_with_active_regions: after for loop
[    0.222609] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[    0.222610] add_highpages_with_active_regions: after for loop
[    0.222611] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[    0.222613] add_highpages_with_active_regions: after for loop
[    0.222614] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[    0.222615] add_highpages_with_active_regions: after for loop
[    0.222616] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[    0.222618] add_highpages_with_active_regions: after for loop
[    0.222619] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[    0.222620] add_highpages_with_active_regions: after for loop
[    0.222622] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[    0.222623] add_highpages_with_active_regions: after for loop
[    0.222624] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[    0.222625] add_highpages_with_active_regions: after for loop
[    0.222627] add_highpages_with_active_regions: for_each_free_mem_range: i = 1
[    0.222628] add_highpages_with_active_regions: after for loop
[    0.222629] add_highpages_with_active_regions: for_each_free_mem_range: i = 2
[    0.222630] add_highpages_with_active_regions: after for loop
```

So, the problem is with i = 2.


Kind regards,

Paul

Attachment: smime.p7s
Description: S/MIME Cryptographic Signature


[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux