On 04/26/2012 10:03 AM, Arun KS wrote:
> Hi Srivatsa,
>
> On Wed, Apr 25, 2012 at 3:56 PM, Srivatsa S. Bhat
> <srivatsa.bhat@xxxxxxxxxxxxxxxxxx> wrote:
>> On 04/25/2012 03:36 AM, Philipp Ittershagen wrote:
>>
>>> Hi Devendra,
>>>
>>> On Tue, Apr 24, 2012 at 03:24:23PM +0530, devendra rawat wrote:
>>>> Hi,
>>>> A switch driver is causing soft lockup on Montavista Linux Kernel
>>>> 2.6.10 system.
>>>> While browsing through the code of the driver, I came across a snippet
>>>> where after disabling the interrupts
>>>> a call is made to interruptible_sleep_on_timeout().
>>>> The code snippet is like
>>>> cli();
>>>> init_waitqueue_head(&queue);
>>>> interruptible_sleep_on_timeout(&queue, USEC_TO_JIFFIES(usec));
>>>> thread_check_signals();
>>>> sti();
>>>> I need to know the side effect of this sort of code, can it be
>>>> responsible for the softlockup of the system? It's a PowerPC based
>>>> system.
>>>
>>> you cannot call sleep functions after disabling interrupts, because no
>>> interrupt will arrive for the scheduler to see the timeout and resume your
>>> task.
>>>
>>
>>
>> Yes, that's right. Also, in general, sleeping inside atomic sections (e.g.,
>> sections with interrupts disabled or preempt disabled) is wrong. There is a
>> config option in the kernel that you can use to enable
>> sleep-inside-atomic-section-checking (CONFIG_DEBUG_ATOMIC_SLEEP I believe),
>> which can help you pin-point such bugs easily.
>
> I tried an experiment to check this.
>
> /* disable interrupts and preemption */
> spin_lock_irqsave(&lock, flags);
> /* enable preemption, but interrupts still disabled */
> spin_unlock(&lock);
> /* Now schedule something else */
> schedule_timeout(10 * HZ);
>
> But this is not causing any harm. I'm able to call schedule with
> interrupts disabled and the system works fine afterwards.
>
> So when I looked inside the schedule() function, it checks only
> whether preemption is disabled or not.
> schedule calls BUG() only if
> preemption is disabled and not if interrupts are disabled.
>
> And AFAIK there is no function inside the kernel which tells you that
> interrupts are disabled.
>
> So the explanation of why the system works fine after calling schedule
> with interrupts disabled goes here:
>
> There is a raw_spin_lock_irq(&rq->lock) inside __schedule() which
> in turn calls local_irq_disable().
>
> local_irq_disable/enable() functions are not nested. We don't have
> reference counting.
> One call to local_irq_enable() is enough to undo multiple calls of
> local_irq_disable().
>
> So my inference is that calling schedule() with interrupts disabled
> will not cause any problem, because the schedule function enables them
> again before we are really scheduled out.
> But a call to schedule() with preemption disabled will end up in the
> famous "BUG: scheduling while atomic".
>

Indeed, you are right! And your experiment and analysis are perfect too!
Sorry for the confusion - I had used the term "atomic" quite loosely.
But your careful experiment of just re-enabling preemption, while still
keeping the interrupts disabled, was a very good one!

And to add to what you said above, __schedule() also does a
preempt_enable() to re-enable preemption (which it had disabled at the
beginning). But since preempt_disable() can nest, if we had called
__schedule() with preemption already disabled, then we end up in trouble -
and hence the BUG is fired in such cases.

Thanks for the clarification!

Regards,
Srivatsa S. Bhat

_______________________________________________
Kernelnewbies mailing list
Kernelnewbies@xxxxxxxxxxxxxxxxx
http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies
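
The difference discussed in this thread (interrupt disabling is a single
flag, while preemption disabling is a nesting counter) can be sketched with
a small user-space model. This is only an illustration under stated
assumptions, not kernel code: every model_* name below is hypothetical, and
model_schedule() is a heavily simplified stand-in for the sequence described
above (the scheduler's own preempt disable, the atomic check, the
raw_spin_lock_irq()/unlock pair on the runqueue lock, and the final preempt
enable).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy user-space model; all names are hypothetical, none are kernel APIs. */
static int  model_preempt_count = 0;    /* nests: a counter               */
static bool model_irqs_enabled  = true; /* does not nest: a single flag   */

static void model_reset(void)
{
    model_preempt_count = 0;
    model_irqs_enabled = true;
}

static void model_preempt_disable(void) { model_preempt_count++; }

static void model_preempt_enable(void)
{
    assert(model_preempt_count > 0);    /* unbalanced enable is a bug     */
    model_preempt_count--;
}

/* No reference counting: one enable undoes any number of disables. */
static void model_irq_disable(void) { model_irqs_enabled = false; }
static void model_irq_enable(void)  { model_irqs_enabled = true; }

/*
 * Returns true when the model would report "scheduling while atomic",
 * i.e. when the caller already held a preempt count on entry.
 */
static bool model_schedule(void)
{
    bool atomic_bug;

    model_preempt_disable();                    /* scheduler's own disable */
    atomic_bug = (model_preempt_count != 1);    /* caller's count is extra */

    model_irq_disable();    /* stands in for raw_spin_lock_irq(&rq->lock)  */
    /* ... the next task would be picked and switched to here ...          */
    model_irq_enable();     /* unlock re-enables, whatever the caller did  */

    model_preempt_enable(); /* undoes only the scheduler's own disable     */
    return atomic_bug;
}
```

In this model, calling model_irq_disable() and then model_schedule() returns
false and leaves the interrupt flag enabled, which matches the experiment
above. Calling model_preempt_disable() first makes model_schedule() return
true, and the caller's count is still 1 afterwards, because the scheduler's
enable only balances its own disable.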