Hi, On 4/11/19 5:10 AM, Michael Schmitz wrote: [...]
OK, I decided to bite the bullet and modify bus_error030() to allow falling through to do_page_fault if an invalid page read happens while page faults are disabled.
[...]
Resulting syslog:
[...] Summary: * Stack is always shown, but call trace following it is always empty. Is call trace explicitly disabled for m68k task list? * Following threads didn't fault: ----------------------------------------------------------------
[31197.540000] task PC stack pid father [31197.550000] init S 0 1 0 0x00000000 [31197.620000] kthreadd S 0 2 0 0x00000000
>> [31198.020000] ksoftirqd/0 R running task 0 7 2 >> [31198.080000] kdevtmpfs S 0 8 2 0x00000000 >> [31198.280000] oom_reaper S 0 12 2 0x00000000 >> [31198.760000] kswapd0 S 0 200 2 0x00000000 >> [31198.950000] jbd2/hda3-8 S 0 794 2 0x00000000 >> [31199.120000] portmap S 0 982 1 0x00000000 >> [31199.180000] syslogd S 0 1070 1 0x00000000 >> [31199.220000] klogd R running task 0 1076 1 >> [31199.280000] gpm S 0 1086 1 0x00000000 >> [31199.350000] inetd S 0 1091 1 0x00000000 >> [31199.410000] lpd S 0 1095 1 0x00000000 >> [31199.470000] sshd S 0 1101 1 0x00000000 >> [31199.520000] rpc.statd S 0 1106 1 0x00000000 >> [31199.560000] atd S 0 1111 1 0x00000000 >> [31199.610000] cron S 0 1114 1 0x00000000 >> [31199.670000] getty S 0 1120 1 0x00000000 >> [31199.720000] getty S 0 1121 1 0x00000000 >> [31199.790000] getty S 0 1122 1 0x00000000 >> [31199.850000] getty S 0 1123 1 0x00000000 >> [31199.920000] getty S 0 1124 1 0x00000000 >> [31199.980000] getty S 0 1125 1 0x00000000 >> [31200.160000] sshd S 0 1304 1101 0x00000000 >> [31200.220000] sshd S 0 1306 1304 0x00000000 >> [31200.270000] bash S 0 1307 1306 0x00000000 >> [31200.330000] bash R running task 0 1308 1307 ---------------------------------------------------------------- * Following threads did fault: ----------------------------------------------------------------
[31197.680000] kworker/0:0 I 0 3 2 0x00000000 [31197.750000] Workqueue: (null) (events) [31197.800000] kworker/0:0H I 0 4 2 0x00000000 [31197.930000] mm_percpu_wq I 0 6 2 0x00000000 [31198.160000] kworker/u2:1 I 0 9 2 0x00000000 [31198.230000] Workqueue: (null) (events_unbound) [31198.330000] kworker/0:1 I 0 13 2 0x00000000 [31198.450000] writeback I 0 95 2 0x00000000 [31198.510000] Workqueue: (null) (flush-3:0) [31198.570000] crypto I 0 97 2 0x00000000 [31198.660000] kblockd I 0 99 2 0x00000000 [31198.720000] Workqueue: (null) (kblockd) [31198.820000] kworker/0:1H I 0 761 2 0x00000000 [31198.890000] Workqueue: (null) (kblockd) [31199.010000] ext4-rsv-conver I 0 795 2 0x00000000 [31200.050000] kworker/u2:0 I 0 1272 2 0x00000000 [31200.390000] kworker/u2:2 I 0 1310 2 0x00000000
>> [31200.460000] Workqueue: (null) (events_unbound) ---------------------------------------------------------------- => *All* of them are kernel threads (kthreadd children) in 'I' state ('I' = interrupt context?) * *There are always two faults* Looking at the kthread_probe_data() code: ---------------------------------------------------------------- void *kthread_probe_data(struct task_struct *task) { struct kthread *kthread = to_kthread(task); void *data = NULL; probe_kernel_read(&data, &kthread->data, sizeof(data)); return data; } ---------------------------------------------------------------- And void print_worker_info() code: ---------------------------------------------------------------- void print_worker_info(const char *log_lvl, struct task_struct *task) { work_func_t *fn = NULL; char name[WQ_NAME_LEN] = { }; char desc[WORKER_DESC_LEN] = { }; struct pool_workqueue *pwq = NULL; struct workqueue_struct *wq = NULL; struct worker *worker; ... worker = kthread_probe_data(task); ... probe_kernel_read(&fn, &worker->current_func, sizeof(fn)); probe_kernel_read(&pwq, &worker->current_pwq, sizeof(pwq)); probe_kernel_read(&wq, &pwq->wq, sizeof(wq)); probe_kernel_read(name, wq->name, sizeof(name) - 1); probe_kernel_read(desc, worker->desc, sizeof(desc) - 1); ... if (fn || name[0] || desc[0]) { printk("%sWorkqueue: %s %pf", log_lvl, name, fn); if (strcmp(name, desc)) pr_cont(" (%s)", desc); pr_cont("\n"); } } ---------------------------------------------------------------- From the task output, we know that faulting items with workqueue have empty "name" & "desc", but non-NULL "current_func". Everything is initialized to NULLs, so if data fetch fails, values are NULLs. From backtraces, we know that at least one of the 2 faults is from probe_kernel_read(). If variable would be NULL: * task/kthread -> lots of faults * worker -> 3 faults, two in probe, and one in above function * current_func / fn -> no issues * current_pwq / pwq -> 2 faults, one from probe * wq -> 1 fault in above function * name/desc -> can't be NULL => I think the problem is that 'I' kthreads have NULL "current_pwq". Ones with workqueues just have "current_func" set, others don't. Why that would affect / fault only on 030? Attached patch fixes the Oops for me. - Eero
diff --git a/kernel/workqueue.c b/kernel/workqueue.c index ddee541ea97a..ec4127c0f3da 100644 --- a/kernel/workqueue.c +++ b/kernel/workqueue.c @@ -4582,8 +4582,11 @@ void print_worker_info(const char *log_lvl, struct task_struct *task) */ probe_kernel_read(&fn, &worker->current_func, sizeof(fn)); probe_kernel_read(&pwq, &worker->current_pwq, sizeof(pwq)); - probe_kernel_read(&wq, &pwq->wq, sizeof(wq)); - probe_kernel_read(name, wq->name, sizeof(name) - 1); + /* current_pwq is NULL for 030 'I' tasks, and this would fault 2x */ + if (pwq) { + probe_kernel_read(&wq, &pwq->wq, sizeof(wq)); + probe_kernel_read(name, wq->name, sizeof(name) - 1); + } probe_kernel_read(desc, worker->desc, sizeof(desc) - 1); if (fn || name[0] || desc[0]) {