Re: [BUG]: when printk outputs too much through serial, CPU bring-up fails

On Fri, Mar 15, 2013 at 09:26:26AM +0800, Shuge wrote:
> On 2013-03-14 22:05, Greg KH wrote:
> >On Thu, Mar 14, 2013 at 09:51:34PM +0800, Shuge wrote:
> >>Hi all,
> >>     When the kernel prints too many log messages with printk, a CPU fails to come online.
> >>The problem is this:
> >>For example, when cpu0 brings up cpu1:
> >>
> >>       a. cpu0 calls cpu_up():
> >>          cpu_up()
> >>          ->_cpu_up()
> >>             ->__cpu_notify(CPU_UP_PREPARE)
> >>             ->__cpu_up()
> >>                ->boot_secondary()
> >>#       ->wait_for_completion_timeout(&cpu_running, msecs_to_jiffies(1000))
> >>          -> if (!cpu_online(cpu)) {
> >>                   pr_crit("CPU%u: failed to come online\n", cpu);
> >>                   ret = -EIO;
> >>               }
> >>          ->cpu_notify(CPU_ONLINE)
> >>
> >>       b. cpu1 enters the kernel:
> >>       secondary_start_kernel()
> >>@   ->printk("CPU%u: Booted secondary processor\n", cpu)
> >>*   ->calibrate_delay()
> >>       ->set_cpu_online()
> >>       ->complete(cpu_running)
> >>         ->cpumask_set_cpu()
> >>
> >>    When cpu0 reaches mark #, it waits for cpu1 to complete
> >>cpu_running and set itself online.
> >>Normally cpu0 gets the completion in time. But if __log_buf is very large, or
> >>other threads keep writing to it without pause, and cpu1 reaches mark @ or *
> >>at that moment, cpu1 is busy flushing the buffer to the console. That takes
> >>more than 1 s, and cpu1 has not yet joined the scheduler, so cpu0's wait
> >>times out.
> >>    Reading printk.c, I found that can_use_console(), which is called by
> >>console_trylock_for_printk(), always returns true, because
> >>have_callable_console() always returns true if the console driver sets the
> >>CON_ANYTIME flag. I think a CPU should not output __log_buf while it is
> >>coming online, even if have_callable_console() is true.
> >>
> >>/*
> >>  * Can we actually use the console at this time on this cpu?
> >>  *
> >>  * Console drivers may assume that per-cpu resources have
> >>  * been allocated. So unless they're explicitly marked as
> >>  * being able to cope (CON_ANYTIME) don't call them until
> >>  * this CPU is officially up.
> >>  */
> >>static inline int can_use_console(unsigned int cpu)
> >>{
> >>     return cpu_online(cpu) || have_callable_console();
> >>}
> >>
> >>In can_use_console(), why is it || rather than &&?
> >>
> >>Kernel Version: 3.3.0
> >Why such an old and obsolete kernel version?  Please try this on 3.8,
> >lots of work has gone into the printk area that should have solved this
> >issue.
> >
> >greg k-h
> 
> I looked at printk.c in version 3.9; it still checks
> console_trylock_for_printk() to decide whether to call console_unlock(). In
> vprintk_emit(), cpu1 still has the opportunity to execute
> console_unlock() while it is coming online.
> Once a CPU that is coming online starts outputting the buffer, nothing can
> interrupt it until the buffer is empty. But we cannot guarantee that nobody
> keeps writing to __log_buf. This is dangerous!

Do you really hit this with a real system?  Is your cpu just really slow
in initializing?  What is the actual time it takes?

> I think the solution is to prevent a CPU from using the console while it is
> coming online.

Ok, what would be your proposed patch to solve this?

greg k-h
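
For reference, the || versus && question quoted above can be worked through as
a small truth table. This is a userspace sketch only, not kernel code and not a
proposed patch: the two boolean parameters stand in for cpu_online(cpu) and
have_callable_console(), and the names can_use_console_or, can_use_console_and
and model_self_check are made up for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Userspace model, not kernel code. "online" stands in for
 * cpu_online(cpu); "anytime" stands in for have_callable_console(),
 * i.e. some registered console has CON_ANYTIME set.
 */

/* The form quoted from printk.c: either condition is enough. */
static bool can_use_console_or(bool online, bool anytime)
{
	return online || anytime;
}

/* Hypothetical variant from the question: both conditions required. */
static bool can_use_console_and(bool online, bool anytime)
{
	return online && anytime;
}

/* Walk the four input combinations and record where the forms differ. */
static void model_self_check(void)
{
	/* CPU still coming online, CON_ANYTIME console present:
	 * "||" lets it print (the case reported in this thread). */
	assert(can_use_console_or(false, true));
	assert(!can_use_console_and(false, true));

	/* CPU fully online, no CON_ANYTIME console:
	 * "&&" would stop an online CPU from printing at all. */
	assert(can_use_console_or(true, false));
	assert(!can_use_console_and(true, false));

	/* The two forms agree on the remaining combinations. */
	assert(can_use_console_or(false, false) == can_use_console_and(false, false));
	assert(can_use_console_or(true, true) == can_use_console_and(true, true));
}
```

In this model, switching to && does keep a CPU that is still coming online away
from the console, but it also blocks a fully online CPU whenever no console
sets CON_ANYTIME. That matches the quoted comment, which describes CON_ANYTIME
as an extra permission for CPUs that are not yet up, not as a requirement.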