Re: CPU softlockup due to smp_call_function()

Sasha Levin <levinsasha928@xxxxxxxxx> · Thu, 5 Apr 2012 14:32:27 +0200

On Thu, Apr 5, 2012 at 2:24 PM, Avi Kivity <avi@xxxxxxxxxx> wrote:
> On 04/04/2012 11:12 PM, Sasha Levin wrote:
>> Hi all,
>>
>> I've started seeing soft lockups resulting from smp_call_function()
>> calls. I've attached two different backtraces of this happening with
>> different code paths.
>>
>> This is running inside a KVM guest with the trinity fuzzer, using
>> today's linux-next kernel.
>>
>> [ 6540.134009] BUG: soft lockup - CPU#0 stuck for 22s! [kworker/u:1:38]
>> [ 6540.134048] irq event stamp: 286811770
>> [ 6540.134048] hardirqs last  enabled at (286811769): [<ffffffff82669e74>] restore_args+0x0/0x30
>> [ 6540.134048] hardirqs last disabled at (286811770): [<ffffffff8266b3ea>] apic_timer_interrupt+0x6a/0x80
>> [ 6540.134048] softirqs last  enabled at (286811768): [<ffffffff810b746e>] __do_softirq+0x16e/0x190
>> [ 6540.134048] softirqs last disabled at (286811749): [<ffffffff8266bdec>] call_softirq+0x1c/0x30
>> [ 6540.134048] CPU 0
>> [ 6540.134048] Pid: 38, comm: kworker/u:1 Tainted: G        W 3.4.0-rc1-next-20120404-sasha-dirty #72
>> [ 6540.134048] RIP: 0010:[<ffffffff8111f30e>]  [<ffffffff8111f30e>] smp_call_function_many+0x27e/0x2a0
>>
>
> This cpu is waiting for some other cpu to process a function (likely
> rps_trigger_softirq(), from the trace).  Can you get a backtrace on all
> cpus when this happens?
>
> It would be good to enhance smp_call_function_*() to do this
> automatically when it happens - it's spinning there anyway, so it might
> as well count the iterations and NMI the lagging cpu if it waits for too
> long.

What do you think about modifying the softlockup detector to NMI all
CPUs and dump their backtraces when it is about to panic on a detected
lockup?
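
To make the "count the iterations" idea quoted above concrete, here is a
rough user-space model in C. It is only a sketch under assumptions:
struct pending_call, wait_for_call(), record_lag() and the spin limits are
hypothetical names made up for illustration, not code from kernel/smp.c.
In the kernel, the hook would be the place to NMI the lagging CPU and dump
its stack; here it only records which CPU was slow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one outstanding cross-CPU call. */
struct pending_call {
	int target_cpu;      /* CPU that is expected to run the function */
	volatile bool done;  /* set by the target once it has run it */
};

/* Hypothetical hook: stands in for "NMI the lagging CPU, dump its stack". */
typedef void (*lag_hook_t)(int cpu, unsigned long spins);

/* Example hook that just remembers the last report. */
int last_lagging_cpu = -1;
unsigned long last_spins;

void record_lag(int cpu, unsigned long spins)
{
	last_lagging_cpu = cpu;
	last_spins = spins;
}

/*
 * Spin until call->done is set, counting iterations.
 * When the count reaches spin_budget, call on_lag once.
 * Give up after max_spins so that this model always terminates
 * (an assumption of the sketch; a real waiter would keep spinning).
 * Returns true if the call completed, false if we gave up.
 */
bool wait_for_call(struct pending_call *call, unsigned long spin_budget,
		   unsigned long max_spins, lag_hook_t on_lag)
{
	unsigned long spins = 0;
	bool reported = false;

	assert(call != NULL);

	while (!call->done) {
		spins++;
		if (!reported && spins >= spin_budget) {
			if (on_lag)
				on_lag(call->target_cpu, spins);
			reported = true;  /* report each stall only once */
		}
		if (spins >= max_spins)
			return false;
	}
	return true;
}
```

Calling wait_for_call(&call, 100, 1000, record_lag) on a call whose done
flag never gets set reports the target CPU once, after 100 spins, and
returns false after 1000. A call that is already done returns true
without invoking the hook.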