Re: Boot time: Optimize CPU bring up?

Dirk Behme <dirk.behme <at> de.bosch.com> writes:

> 
> Hi,
> 
> on a ARMv7 Freescale i.MX6 based system we are looking at optimizing the 
> kernel boot time. Booting a 3.5.7 kernel with SMP=y and the kernel 
> option 'nosmp' (the i.MX6 has single, dual and quad CPU versions) we get
> 
> [    0.255927] hw perfevents: enabled with ARMv7 Cortex-A9 PMU driver, 7 counters available
> [    0.256033] Setting up static identity map for 0x10426a28 - 0x10426a80
> [    0.260204] initcall spawn_ksoftirqd+0x0/0x58 returned 0 after 9765 usecs
> [    0.270363] initcall init_workqueues+0x0/0x39c returned 0 after 9765 usecs
> [    0.290265] initcall cpu_stop_init+0x0/0xd0 returned 0 after 19531 usecs
> [    0.310449] initcall rcu_spawn_kthreads+0x0/0xc0 returned 0 after 19531 usecs
> [    0.310699] Brought up 1 CPUs
> [    0.310712] SMP: Total of 1 processors activated (1581.05 BogoMIPS).
> 
> I.e. ~55ms just for bringing up the 1 CPU.
> 
> Looking into some details, e.g. cpu_stop_init(), the ~19531 usecs are 
> there because the system 'hangs' 2 jiffies (CONFIG_HZ=100) in 
> cpu_v7_do_idle().
> 
> For testing purposes switching to CONFIG_HZ=1000 reduces above 54ms to 
> just ~4ms. But we are unsure to switch the whole system to 
> CONFIG_HZ=1000 just to optimize this part of the boot process.
> 
> Does anybody know why all the above parts are idling for some jiffies? 
> Is there any other optimization than CONFIG_HZ=1000 possible?
> 
> In case there are any patches floating around or this was already 
> discussed, any link would be nice.
> 
> Many thanks and best regards
> 
> Dirk
> 

Hi Dirk,

I have done some analysis to find out where the kernel idles for some
jiffies during boot. My findings are compiled below, together with some
suggested solutions. Please take a look.

Starting from start_kernel()
============================
0) The scheduler init routine creates the idle thread.
1) rest_init() is called from start_kernel().
2) Two threads are created: kernel_init and kthreadd.
3) schedule() is called for the first time.
4) kthreadd is scheduled first by the CFS scheduler.
5) kthreadd checks whether there is any new thread to be created.
6) Since there is none, it sets its state to TASK_INTERRUPTIBLE and calls
schedule().
7) Now kernel_init is picked by the scheduler.
8) kernel_init calls do_pre_smp_initcalls().
9) When the spawn_ksoftirqd initcall is run by the kernel init thread, it
goes on to create a kthread (ksoftirqd).
10) For this it wakes up kthreadd.
11) A new kthread is created by kthreadd.
12) kthreadd checks whether any other kthread is waiting in the create list.
13) Since there is none, it calls schedule() with its state set to
TASK_INTERRUPTIBLE.
14) Now the scheduler picks the kernel init thread to be executed next.
15) The kernel init thread calls wait_for_completion(&create.done), which is
supposed to be completed by the newly created kthread.
16) schedule_timeout() runs and the scheduler next picks the newly created
kthread.
17) The newly created kthread sets its state to TASK_UNINTERRUPTIBLE and
signals completion of create.done.
18) The scheduler then schedules the kernel init thread, which goes on to
bind the newly created kthread to cpu0 using kthread_bind().
19) The kernel init thread checks whether the newly created kthread has been
dequeued.
20) If it has not been dequeued yet, it is still on the run queue.
21) If the kthread is still on the run queue (step 20), kernel init goes to
sleep.

For HZ=100 it sleeps for 10ms, while for HZ=1000 it sleeps for 1ms:

                if (unlikely(on_rq)) {
                        ktime_t to = ktime_set(0, NSEC_PER_SEC/HZ);
                        set_current_state(TASK_UNINTERRUPTIBLE);
                        schedule_hrtimeout(&to, HRTIMER_MODE_REL);
                        continue;
                }

22) Now the scheduler picks the newly created kthread from the run queue.
23) This kthread is dequeued by calling schedule() with its state set to
TASK_UNINTERRUPTIBLE.
24) At this point all the kthreads except idle are dequeued, i.e.

kthreadd --> dequeued
kernel init --> dequeued
kthread (ksoftirqd) --> dequeued

25) Nothing is left on the run queue, so the scheduler runs the idle thread.
26) The system now stays in the idle thread until 1 timer tick is received,
i.e. for 10ms (HZ=100).
27) When the timer tick is received, the system leaves the idle thread.
28) The scheduler wakes up the kernel init thread.
29) The kernel init thread again checks whether the kthread (ksoftirqd) has
been dequeued.
30) If it finds it dequeued, it finally binds it to cpu0.
31) kernel init continues execution, and the kthread (ksoftirqd) is woken up
and put on the run queue again for cpu0.
32) The kthread gets scheduled later and run_ksoftirqd() is then called.

So during boot 1 timer tick is always spent in the idle loop because of the
sleep in step 21, whenever a kthread needs to be bound to a cpu. The same
thing happens with the early initcalls init_workqueues and cpu_stop_init,
which increases the boot time.

Possible solutions to decrease the boot time:
==============================================

------------------------------------------------
1) Increase CONFIG_HZ from 100 (default) to 1000
------------------------------------------------

Increasing the timer tick rate to HZ=1000 makes the sleep in step 21 last
only 1ms.

Drawbacks: increasing the timer tick rate decreases throughput.

OR 

--------------------------------
2) Use CONFIG_PREEMPT_VOLUNTARY:
--------------------------------

Drawbacks: it reduces the maximum rescheduling latency at the cost of
slightly lower throughput.

OR

-------------------------------------------------------------------
3) Don't wait before binding a kthread to cpu0 in presmp initcalls: 
-------------------------------------------------------------------

Since we are running presmp initcalls (only cpu0 is up), why do we need to
wait before binding a kthread to cpu0?
Shouldn't we skip this wait as is done in the !SMP case (assuming that
presmp is equivalent to !SMP)?

#ifdef CONFIG_SMP
void scheduler_ipi(void);
extern unsigned long wait_task_inactive(struct task_struct *, long match_state);
#else
static inline void scheduler_ipi(void) { }
static inline unsigned long wait_task_inactive(struct task_struct *p,
                                               long match_state)
{
        return 1;
}
#endif

The above piece of code does not wait in the !SMP case. I believe the same
should be done for presmp initcalls, but I haven't found any such
implementation in the code.

Drawbacks: in my opinion there shouldn't be any. Comments needed.

OR

-----------------------------------------------------------
4) Don't sleep for 1 tick before binding a kthread to cpu0:
-----------------------------------------------------------

1 timer tick is always spent in the idle loop because of step 21 during boot
whenever a kthread needs to be bound to a cpu. Why do we need to put the
kernel init thread to sleep for 1 tick? Why don't we just call schedule() in
wait_task_inactive() and yield the cpu, so that the kthread can get scheduled
ASAP and be dequeued? The kernel init thread would then be scheduled again
ASAP and get out of wait_task_inactive(). (Doesn't that seem quicker than
sleeping for 1 tick?)

                if (unlikely(on_rq)) {
                        //ktime_t to = ktime_set(0, NSEC_PER_SEC/HZ);
                        //set_current_state(TASK_UNINTERRUPTIBLE);
                        //schedule_hrtimeout(&to, HRTIMER_MODE_REL);
                        schedule();
                        continue;
                }

With the above change, step 21 never sleeps.

or (if sleeping is necessary)

Modify the above code to sleep for a shorter time (e.g. 1ms even with
HZ=100). For this I guess we would need hrtimers to be working at that point
in boot. However, we found that hrtimers do not work at that time at any
resolution finer than 10ms (in case of HZ=100).

Drawbacks: not known yet. Comments needed.

It would be very nice if we could get some comments on the above analysis
from the community.

Thanks,
Abbas Raza

