Hi Peter,
As mentioned in my last mail, we are still seeing the issue with the latest
approach. Below is the suspected race, as described earlier:
Controller Thread                       CPUHP Thread
takedown_cpu
kthread_park
kthread_parkme
Set KTHREAD_SHOULD_PARK
                                        smpboot_thread_fn
                                        set Task interruptible
wake_up_process
  if (!(p->state & state))
          goto out;
                                        kthread_parkme
                                        SET TASK_PARKED
                                        schedule
                                        raw_spin_lock(&rq->lock)
ttwu_remote
waiting for __task_rq_lock
                                        context_switch
                                        finish_lock_switch
                                        Case TASK_PARKED
                                        kthread_park_complete
SET Running
So it seems the issue is still there with the latest fix,
"kthread, sched/wait: Fix kthread_parkme() completion issue".
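
To make the ordering above easier to follow, here is a small user-space
sketch that replays the same interleaving step by step. It is only a model
under simplifying assumptions, not kernel code: struct task, the three-value
state enum and all helper names (waker_state_check, waker_set_running,
parker_set_parked, replay_race) are hypothetical, and locking is represented
only by the order of the calls.

```c
#include <stdbool.h>

/* Simplified model: only the three states that appear in the diagram. */
enum task_state { TASK_RUNNING, TASK_INTERRUPTIBLE, TASK_PARKED };

struct task {
    enum task_state state;
    bool should_park;   /* stands in for KTHREAD_SHOULD_PARK */
};

/* Controller, step 1: the early state check in the wakeup path.
 * Returns true when the waker decides to continue (no "goto out"). */
bool waker_state_check(const struct task *p)
{
    return p->state == TASK_INTERRUPTIBLE;
}

/* Controller, step 2: after waiting for the rq lock, mark the task runnable. */
void waker_set_running(struct task *p)
{
    p->state = TASK_RUNNING;
}

/* CPUHP thread: parkme path sets TASK_PARKED before calling schedule(). */
void parker_set_parked(struct task *p)
{
    if (p->should_park)
        p->state = TASK_PARKED;
}

/* Replays the diagram's order and returns the state the task ends up in. */
enum task_state replay_race(void)
{
    struct task t = { .state = TASK_RUNNING, .should_park = false };
    bool wake;

    t.should_park = true;           /* controller: Set KTHREAD_SHOULD_PARK */
    t.state = TASK_INTERRUPTIBLE;   /* cpuhp: set Task interruptible */
    wake = waker_state_check(&t);   /* controller: check passes */
    parker_set_parked(&t);          /* cpuhp: SET TASK_PARKED, schedule */
    if (wake)
        waker_set_running(&t);      /* controller: acts on the stale check */
    return t.state;
}
```

In this model replay_race() ends in TASK_RUNNING rather than TASK_PARKED,
because the waker acts on a state check made before the thread parked itself.
That is the outcome we suspect in the real code path.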
Regards
Gaurav
On 5/7/2018 4:53 PM, Kohli, Gaurav wrote:
Corrected the formatting; sorry for the spam.
Hi Peter,
We have tested with the new patch and are still seeing the same issue. These
dumps do not contain debug traces, but from code review it seems a race
still exists. Can you please check it once:
Controller Thread                       CPUHP Thread
takedown_cpu
kthread_park
kthread_parkme
Set KTHREAD_SHOULD_PARK
                                        smpboot_thread_fn
                                        set Task interruptible
wake_up_process
                                        kthread_parkme
                                        SET TASK_PARKED
                                        schedule
                                        raw_spin_lock(&rq->lock)
                                        context_switch
                                        finish_lock_switch
                                        Case TASK_PARKED
                                        kthread_park_complete
                                        SET TASK_INTERRUPTIBLE
We are also seeing the same warning when the controller unparks the cpuhp thread:
        if (!wait_task_inactive(p, state)) {
                WARN_ON(1);
                return;
        }
[ 325.065893] [<ffffff8920ed0200>] kthread_unpark+0x80/0xd8
[ 325.065902] [<ffffff8920eab754>] bringup_cpu+0xa0/0x12c
[ 325.065910] [<ffffff8920eaae90>] cpuhp_invoke_callback+0xb4/0x5c8
[ 325.065917] [<ffffff8920eabd98>] cpuhp_up_callbacks+0x3c/0x154
[ 325.065924] [<ffffff8920ead220>] _cpu_up+0x134/0x208
[ 325.065931] [<ffffff8920ead45c>] do_cpu_up+0x168/0x1a0
[ 325.065938] [<ffffff8920ead4b8>] cpu_up+0x24/0x30
[ 325.065948] [<ffffff89215b1408>] cpu_subsys_online+0x20/0x2c
[ 325.065956] [<ffffff89215aac64>] device_online+0x70/0xb4
[ 325.065962] [<ffffff89215aad78>] online_store+0xd0/0xdc
[ 325.065971] [<ffffff89215a7424>] dev_attr_store+0x40/0x54
[ 325.065982] [<ffffff89210d8a98>] sysfs_kf_write+0x5c/0x74
[ 325.065988] [<ffffff89210d7b9c>] kernfs_fop_write+0xcc/0x1ec
[ 325.065999] [<ffffff8921049288>] vfs_write+0xb4/0x1d0
[ 325.066006] [<ffffff892104a858>] SyS_write+0x60/0xc0
[ 325.066014] [<ffffff8920e83770>] el0_svc_naked+0x24/0x28
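
For reference, the condition behind this warning can be modelled in a few
lines. This is a simplified sketch, assuming the unpark path expects the
thread to still be in TASK_PARKED. model_wait_inactive and unpark_would_warn
are hypothetical names, and the real wait_task_inactive() does considerably
more than compare states.

```c
#include <stdbool.h>

/* Simplified model: only the states relevant to this report. */
enum task_state { TASK_RUNNING, TASK_INTERRUPTIBLE, TASK_PARKED };

/* Hypothetical stand-in for wait_task_inactive(p, match_state):
 * succeeds only if the task is still in the state the caller expects. */
bool model_wait_inactive(enum task_state cur, enum task_state match)
{
    return cur == match;
}

/* Returns true when the unpark path in this model would reach WARN_ON(1). */
bool unpark_would_warn(enum task_state cur)
{
    return !model_wait_inactive(cur, TASK_PARKED);
}
```

In this model, a thread left in TASK_INTERRUPTIBLE or TASK_RUNNING by the
race makes unpark_would_warn() return true, which matches the trace above.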
After this, the same crash occurred:
[ 325.521307] [<ffffff8920ed4aac>] smpboot_thread_fn+0x26c/0x2c8
[ 325.527295] [<ffffff8920ecfb24>] kthread+0xf4/0x108
I will add more ftrace debug output to check exactly what is going on.
Regards
Gaurav
--
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center,
Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.