Re: Oopses and invalid addresses under Hatari

Michael Schmitz <schmitzmic@xxxxxxxxx> · Sat, 13 Apr 2019 11:15:34 +1200

Hi Eero,

On 13.04.2019 at 09:43, Eero Tamminen wrote:
Hi,

On 4/12/19 9:52 AM, Michael Schmitz wrote:
On 12.04.2019 at 11:03, Eero Tamminen wrote:
[...]
* Stack is always shown, but call trace following it is always empty.
  Is call trace explicitly disabled for m68k task list?

No, must be a 030 thing. The output on 060 does show a call trace (at
least for normal processes).

Ok, that's one more bug.

I'm not convinced the call trace shown on 060 makes much sense at all:

[5071106.760000] systemd-udevd   S    0   175      1 0x00000000
[5071106.770000] Stack from 0749dfcc:
                         0000000a ef915664 00000009 ffffffff ffffffff c02a0d00
                         000000fb 000000fb 00000000 0018c00f ca980080
[5071106.780000] Call Trace: [<0018c00f>] falcon_decode_var+0x46b/0x912

That's on Amiga. Should not run any code from atafb.c at all.

=> *All* of them are kernel threads (kthreadd children) in 'I' state
   ('I' = interrupt context?)

Unlikely - may be interruptible sleep.

Looking at sched_show_task() -> task_state_to_char() -> sched.h, "I"
means TASK_IDLE i.e. those kernel threads are both non-interruptible
(same as "D"), and with no load.
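
A rough user-space sketch of that mapping, to show why such a thread is
printed as "I" rather than "D". The constants and the "RSDTtXZPI" table
are written from memory of include/linux/sched.h of roughly this kernel
generation and should be treated as assumptions; fls_model() and
state_to_char_model() are made-up names, not kernel functions.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values, recalled from include/linux/sched.h; verify before use.
 * The kernel spells the stopped/traced bits with a leading "__". */
#define TASK_RUNNING         0x0000
#define TASK_INTERRUPTIBLE   0x0001
#define TASK_UNINTERRUPTIBLE 0x0002
#define TASK_STOPPED         0x0004
#define TASK_TRACED          0x0008
#define EXIT_DEAD            0x0010
#define EXIT_ZOMBIE          0x0020
#define TASK_PARKED          0x0040
#define TASK_NOLOAD          0x0400
#define TASK_IDLE            (TASK_UNINTERRUPTIBLE | TASK_NOLOAD)
#define TASK_REPORT          0x007f
#define TASK_REPORT_IDLE     (TASK_REPORT + 1)

/* Position of the highest set bit, counting from 1; 0 if no bit is set. */
static int fls_model(unsigned int x)
{
    int n = 0;

    while (x) {
        n++;
        x >>= 1;
    }
    return n;
}

/* Hypothetical helper: map a task state word to its one-letter code. */
static char state_to_char_model(unsigned int state, unsigned int exit_state)
{
    static const char chars[] = "RSDTtXZPI";
    unsigned int s = (state | exit_state) & TASK_REPORT;
    size_t idx;

    /* Without this special case TASK_IDLE would be masked down to "D". */
    if (state == TASK_IDLE)
        s = TASK_REPORT_IDLE;

    idx = (size_t)fls_model(s);
    assert(idx < sizeof(chars) - 1);
    return chars[idx];
}
```

So "I" is simply "D" plus the no-load bit, reported under its own letter.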

OK, so the issue is that idle kernel threads have no wq associated with 
them, and print_worker_info() depends on a wq being present. But 
__probe_kernel_read() is meant to handle this gracefully.

The real question is - why are these fields NULL in the first place?
And are they NULL only on 030?

I'm very interested in this too.

I suspect the m68k stack frame format is munged up where it gets used in 
core kernel code. The only reason I can imagine for that would be 
different assumptions about alignment.

Attached patch fixes the Oops for me.

I guess __probe_kernel_read() was meant to make checking for NULL
pointers obsolete in these functions (where fields may well be NULL
depending on context). I don't think your patch would be accepted,
when a fix in the 030 fault handler does the job just as well.
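
To make the two options concrete, here is a small user-space model, not
kernel code. All names in it (wq_model, worker_model, safe_read_model(),
fault_count) are hypothetical; safe_read_model() only imitates a
fault-tolerant read by rejecting NULL and counting that as one simulated
trip through the fault handler. Real invalid-but-non-NULL pointers are
not modelled.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define NAME_LEN 24

/* Hypothetical stand-ins, not the kernel's structures. */
struct wq_model { char name[NAME_LEN]; };
struct worker_model { struct wq_model *wq; };

/* Number of simulated trips through the fault handler. */
static int fault_count;

/* Imitates a fault-tolerant read: a NULL source "faults" and fails. */
static int safe_read_model(void *dst, const void *src, size_t size)
{
    assert(dst != NULL);
    if (src == NULL) {
        fault_count++;
        return -EFAULT;
    }
    memcpy(dst, src, size);
    return 0;
}

/* Option 1: always attempt the read and let the "fault" path cope. */
static void name_via_fault(const struct worker_model *w, char out[NAME_LEN])
{
    const char *src = w->wq ? w->wq->name : NULL;

    memset(out, 0, NAME_LEN);
    (void)safe_read_model(out, src, NAME_LEN - 1);
}

/* Option 2: the extra "if"; skip the read when there is no wq. */
static void name_with_check(const struct worker_model *w, char out[NAME_LEN])
{
    memset(out, 0, NAME_LEN);
    if (w->wq == NULL)
        return;
    (void)safe_read_model(out, w->wq->name, NAME_LEN - 1);
}
```

Both variants leave the name empty for a worker without a wq; they differ
only in whether the fault path is exercised to get there.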

*If* those fields are NULL also on other arches, going through fault
handler for nearly half of tasks is pretty suboptimal.  I.e. that one

Probably won't be anywhere near half the tasks on a more recent system 
running loads of processes. But your point is taken. I thought that 
making the page fault handler stay silent on kernel memory accesses 
whose semantics we're not quite clear about was a pretty crude hack, 
but there may be a good reason for it.

extra "if" can also be considered as an optimization for the common
case.

If that's the only field that may become NULL, yes. I just don't know 
what the impact would be on other archs. Not that this is a very heavily 
used code path ...

The task list is a debugging feature, and having it cause page faults
won't help with debugging.

That sort of patch is best discussed on the main kernel list. But we 
really need to find out whether the sysrq trace output even remotely 
matches what you can glean from procfs entries or peeking at kernel 
memory directly.

Cheers,

	Michael

    - Eero