Re: Oopses and invalid addresses under Hatari

Hi Eero,

On 13.04.2019 at 09:43, Eero Tamminen wrote:
Hi,

On 4/12/19 9:52 AM, Michael Schmitz wrote:
On 12.04.2019 at 11:03, Eero Tamminen wrote:
[...]
* Stack is always shown, but call trace following it is always empty.
  Is call trace explicitly disabled for m68k task list?

No, must be a 030 thing. The output on 060 does show a call trace (at
least for normal processes).

Ok, that's one more bug.

I'm not convinced the call trace shown on 060 makes much sense at all:

[5071106.760000] systemd-udevd   S    0   175      1 0x00000000
[5071106.770000] Stack from 0749dfcc:
0000000a ef915664 00000009 ffffffff ffffffff c02a0d00
                         000000fb 000000fb 00000000 0018c00f ca980080
[5071106.780000] Call Trace: [<0018c00f>] falcon_decode_var+0x46b/0x912

That's on Amiga. Should not run any code from atafb.c at all.


=> *All* of them are kernel threads (kthreadd children) in 'I' state
   ('I' = interrupt context?)

Unlikely - may be interruptible sleep.

Looking at sched_show_task() -> task_state_to_char() -> sched.h, "I"
means TASK_IDLE i.e. those kernel threads are both non-interruptible
(same as "D"), and with no load.
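For reference, the mapping described above can be modelled in userspace roughly as follows. This is only a sketch: the state bit values and the "RSDTtXZPI" string are what I recall from include/linux/sched.h around v5.0 and should be checked against the tree in use; fls_model() and state_to_char() are hypothetical names, not kernel functions.

```c
/* Task state bits, assumed from include/linux/sched.h (circa v5.0). */
#define TASK_RUNNING          0x0000
#define TASK_INTERRUPTIBLE    0x0001
#define TASK_UNINTERRUPTIBLE  0x0002
#define TASK_NOLOAD           0x0400

/* TASK_IDLE: uninterruptible sleep that does not count towards load. */
#define TASK_IDLE         (TASK_UNINTERRUPTIBLE | TASK_NOLOAD)

/* Assumed mask of the reportable state bits (0x01 .. 0x40). */
#define TASK_REPORT       0x7f
#define TASK_REPORT_IDLE  (TASK_REPORT + 1)

/* Hypothetical stand-in for fls(): 1-based index of the highest set bit,
 * 0 if no bit is set. */
int fls_model(unsigned int x)
{
	int n = 0;

	while (x) {
		n++;
		x >>= 1;
	}
	return n;
}

/* Hypothetical model of the state -> letter lookup used by the task list. */
char state_to_char(unsigned int state, unsigned int exit_state)
{
	static const char state_char[] = "RSDTtXZPI";
	unsigned int report = (state | exit_state) & TASK_REPORT;

	/* Idle kernel threads get their own reporting slot. */
	if (state == TASK_IDLE)
		report = TASK_REPORT_IDLE;

	return state_char[fls_model(report)];
}
```

With these assumed values, a task in plain TASK_UNINTERRUPTIBLE maps to "D", while the same bit combined with TASK_NOLOAD maps to "I".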

OK, so the issue is that idle kernel threads have no wq associated with them, and print_worker_info() depends on a wq being present. But __probe_kernel_read() is meant to handle this gracefully.

The real question is - why are these fields NULL in the first place?
And are they NULL only on 030?

I'm very interested in this too.

I suspect the m68k stack frame format is munged up where it gets used in core kernel code. The only reason I can imagine for that would be different assumptions about alignment.

Attached patch fixes the Oops for me.

I guess __probe_kernel_read() was meant to make checking for NULL
pointers obsolete in these functions (where fields may well be NULL
depending on context). I don't think your patch would be accepted,
when a fix in the 030 fault handler does the job just as well.

*If* those fields are NULL also on other arches, going through fault
handler for nearly half of tasks is pretty suboptimal.  I.e. that one

Probably won't be anywhere near half the tasks on a more recent system running loads of processes. But your point is taken. I thought hacking the page fault handler to stay silent when accessing kernel memory where we're not quite clear about the semantics was a pretty crude hack, but there may be a good reason for this.

extra "if" can also be considered as an optimization for the common
case.

If that's the only field that may become NULL, yes. I just don't know what the impact would be on other archs. Not that this is a very heavily used code path ...
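To make the trade-off concrete, here is a userspace sketch of the two approaches. It is not the attached patch (which is not reproduced in this mail). The *_model types, probe_read_model() and fault_count are hypothetical stand-ins, the field names are assumed from the workqueue code, and I am assuming the NULL field is the worker's current_pwq. A read from NULL is counted as one trip through the fault handler.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the workqueue structures. */
struct wq_model     { char name[24]; };
struct pwq_model    { struct wq_model *wq; };
struct worker_model { struct pwq_model *current_pwq; };

/* Counts how often the modelled fault path is taken. */
int fault_count;

/* Stand-in for a fault-tolerant read: a NULL source counts as one fault
 * and yields zeroed data plus -EFAULT instead of crashing. */
int probe_read_model(void *dst, const void *src, size_t size)
{
	if (src == NULL) {
		fault_count++;
		memset(dst, 0, size);
		return -EFAULT;
	}
	memcpy(dst, src, size);
	return 0;
}

/* Variant A: rely on the fault-tolerant read alone. */
void name_probe_only(const struct worker_model *w, char *name, size_t len)
{
	struct pwq_model *pwq = NULL;
	struct wq_model *wq = NULL;

	memset(name, 0, len);
	probe_read_model(&pwq, &w->current_pwq, sizeof(pwq));
	probe_read_model(&wq, pwq ? &pwq->wq : NULL, sizeof(wq));
	probe_read_model(name, wq ? wq->name : NULL, len - 1);
}

/* Variant B: the extra "if", skipping the dependent reads for idle workers. */
void name_with_check(const struct worker_model *w, char *name, size_t len)
{
	struct pwq_model *pwq = NULL;
	struct wq_model *wq = NULL;

	memset(name, 0, len);
	probe_read_model(&pwq, &w->current_pwq, sizeof(pwq));
	if (pwq == NULL)
		return;	/* idle worker: nothing to report, no fault taken */
	probe_read_model(&wq, &pwq->wq, sizeof(wq));
	if (wq == NULL)
		return;
	probe_read_model(name, wq->name, len - 1);
}
```

In this model an idle worker costs two faulting reads with variant A and none with variant B, while a worker with a wq attached behaves the same in both. Whether the real fields behave like this on other archs is exactly the open question.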

Task list is a debugging feature and it causing page faults won't help
with debugging.

That sort of patch is best discussed on the main kernel list. But we really need to find out whether the sysrq trace output even remotely matches what you can glean from procfs entries or peeking at kernel memory directly.

Cheers,

	Michael



    - Eero


