Hi Geert,
On 13.03.2024 at 13:16, Michael Schmitz wrote:
Hi Geert,
On 12/03/24 20:59, Geert Uytterhoeven wrote:
On Tue, Mar 12, 2024 at 1:51 AM Michael Schmitz <schmitzmic@xxxxxxxxx> wrote:
On 11/03/24 21:35, Finn Thain wrote:
I think spin_lock() reduces to preempt_disable() on UP.
In include/linux/spinlock_api_up.h it says,
/*
* In the UP-nondebug case there's no real locking going on, so the
* only thing we have to do is to keep the preempt counts and irq
* flags straight, to suppress compiler warnings of unused lock
* variables, and to add the proper checker annotations:
*/
That's only true in the debug case - there, preempt_disable() is used
inside the spin loop. But m68k is one of the last CONFIG_PREEMPT_NONE
archs AFAIR, and preempt_disable() reduces to barrier() on those.
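To make the macro chain under discussion concrete, here is a minimal userspace sketch. It is not the kernel's actual header code: every sketch_* name is made up for illustration, and the SKETCH_PREEMPT_COUNT toggle merely stands in for CONFIG_PREEMPT_COUNT. It assumes GCC or Clang for the inline-asm compiler barrier.

```c
#include <assert.h>

/* Hypothetical toggle standing in for CONFIG_PREEMPT_COUNT (assumption). */
#ifndef SKETCH_PREEMPT_COUNT
#define SKETCH_PREEMPT_COUNT 0
#endif

/* Stand-in for the per-task preempt counter. */
static int sketch_preempt_count;

/* Compiler-only barrier: emits no instructions, just stops reordering. */
#define sketch_barrier() __asm__ __volatile__("" ::: "memory")

#if SKETCH_PREEMPT_COUNT
/* Preemptible build: disabling preemption bumps a counter. */
#define sketch_preempt_disable() \
	do { sketch_preempt_count++; sketch_barrier(); } while (0)
#define sketch_preempt_enable() \
	do { sketch_barrier(); sketch_preempt_count--; } while (0)
#else
/* Non-preemptible build: nothing left but the compiler barrier. */
#define sketch_preempt_disable() sketch_barrier()
#define sketch_preempt_enable()  sketch_barrier()
#endif

typedef struct { int unused; } sketch_spinlock_t;

/* UP, non-debug: no spinning; the lock variable is only referenced
 * to keep the compiler from warning about it. */
#define sketch_spin_lock(lock) \
	do { sketch_preempt_disable(); (void)(lock); } while (0)
#define sketch_spin_unlock(lock) \
	do { sketch_preempt_enable(); (void)(lock); } while (0)

/* Returns the counter value observed while the "lock" is held. */
int sketch_count_while_locked(void)
{
	sketch_spinlock_t lock = { 0 };
	int seen;

	sketch_spin_lock(&lock);
	seen = sketch_preempt_count;
	sketch_spin_unlock(&lock);
	return seen;
}
```

Built as is, sketch_count_while_locked() observes a count of 0, i.e. the lock leaves no runtime trace at all. Built with -DSKETCH_PREEMPT_COUNT=1 it observes 1, which is the situation once ARCH_NO_PREEMPT is dropped and a preemptible kernel is configured.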
M68k does have experimental preempt support. I have been running
that for the last 5 months. Works fine most of the time, except for
the one BUG[1] that happens every 10 boots or so.
[1]
https://lore.kernel.org/all/CAMuHMdUQ72KOPw5vxNfhjoTR-SsaELeKneBmyQPYEWa_o5OZZA@xxxxxxxxxxxxxx
NB: the Kconfig hunk no longer applies, it now should look like this:
diff --git a/arch/m68k/Kconfig b/arch/m68k/Kconfig
index 4b3e93cac723..0c4b5180df1d 100644
--- a/arch/m68k/Kconfig
+++ b/arch/m68k/Kconfig
@@ -10,7 +10,6 @@ config M68K
 	select ARCH_HAS_SYNC_DMA_FOR_DEVICE if M68K_NONCOHERENT_DMA
 	select ARCH_HAVE_NMI_SAFE_CMPXCHG if RMW_INSNS
 	select ARCH_MIGHT_HAVE_PC_PARPORT if ISA
-	select ARCH_NO_PREEMPT if !COLDFIRE
 	select ARCH_USE_MEMTEST if MMU_MOTOROLA
 	select ARCH_WANT_IPC_PARSE_VERSION
 	select BINFMT_FLAT_ARGVP_ENVP_ON_STACK
(HAS_DMA -> M68K_NONCOHERENT_DMA)
Runs fine on ARAnyM with a bit of VM stress testing using
PREEMPT_VOLUNTARY. With full preempt, after a bit of looping
(init=/sbin/reboot), it fails spectacularly, unable to find any root
filesystem to mount.
Trying to boot a full system, I see the 'table already freed' panic on
the first go.
Running a stress-ng stack-fill stressor triggered the panic, too. I'll
see how reliable that is.
That's actually pretty reliable - running stress-ng --stack 2
--stack-fill two or three times gets me a panic (on a fast Intel system).
Running full preemption on the PowerBook instance, the older one of my
disk images boots OK (sysvinit based), the newer one (systemd IIRC)
cheerfully throws endless Oops of this kind:
Unable to handle kernel NULL pointer dereference at virtual address b90995fb
Oops: 00000000
Modules linked in:
PC: [<000de1d6>] del_page_from_free_list+0x16/0x46
SR: 2700 SP: 3e9c0737 a2: 00f34610
d0: 00000000 d1: 0054491c d2: 0000000b d3: 00000000
d4: 0000000f d5: 00000000 a0: 0054491c a1: 00495dd0
Process vol_id (pid: 1198, task=54e40596)
Frame format=7 eff addr=00000104 ssw=0405 faddr=00000104
wb 1 stat/addr/data: 0000 00000000 00000000
wb 2 stat/addr/data: 0000 00000000 00000000
wb 3 stat/addr/data: 0085 00000104 00495dd0
push data: 00000000 00000000 00000000 00000000
Stack from 0482fbc4:
0076343c 000df4ce 0054491c 00495d78 00000000 00000000 00000801 00000000
00000000 00002000 0076343c 00495d78 000df8de 0482e000 0482fd06 0000001e
0054491c 00544920 00000000 00495f84 00002700 00000058 00000058 00495d78
000dfb70 00495d78 00000000 00000000 00000801 0076343c 00763464 00152c40
00000000 00112cc0 00000000 00000000 000b493c 0482fd06 000df974 01023764
0032d0e8 00dd7400 00f1fe58 00000049 64560000 00010000 0000f000 00000a00
Call Trace: [<000df4ce>] __rmqueue_pcplist+0x198/0x316
[<00002000>] _start+0x0/0x8
[<000df8de>] zone_watermark_fast.isra.112+0x0/0x96
[<00002700>] resume_userspace+0xa/0x16
[<000dfb70>] get_page_from_freelist+0x1fc/0x632
[<00152c40>] ext4_ext_convert_to_initialized+0x50c/0x534
[<00112cc0>] simple_transaction_get+0x52/0x98
[<000b493c>] filemap_add_folio+0x0/0x9e
[<000df974>] get_page_from_freelist+0x0/0x632
[<0032d0e8>] xa_load+0x0/0x76
[<00010000>] TWOMAIN+0x7a/0x80
[<0000f000>] NEVEN+0xd8/0xec
[<000df7d2>] prepare_alloc_pages.isra.110+0x70/0x84
[<000e0f4e>] __alloc_pages+0xa8/0x6b2
[<00152c40>] ext4_ext_convert_to_initialized+0x50c/0x534
[<00112cc0>] simple_transaction_get+0x52/0x98
[<000b493c>] filemap_add_folio+0x0/0x9e
[<0032d0e8>] xa_load+0x0/0x76
[<00004ec2>] buserr_c+0x10c/0x3e6
[<00001000>] kernel_pg_dir+0x0/0x1000
[<00010000>] TWOMAIN+0x7a/0x80
[<0032cf2a>] xas_load+0x1e/0x60
[<0032cf0c>] xas_load+0x0/0x60
[<0032d150>] xa_load+0x68/0x76
[<000e1a46>] __folio_alloc+0x1c/0x2a
[<00152cc0>] ext4_ext_index_trans_blocks+0x0/0x3a
[<000bace0>] page_cache_ra_unbounded+0xb0/0x140
[<00112cc0>] simple_transaction_get+0x52/0x98
[<00010fff>] X_OPERR+0x3/0x40
[<000b4c24>] filemap_get_pages+0x24a/0x4e0
[<00008000>] nvram_proc_read+0x4c/0x31a
[<001000f0>] fiemap_prep+0x6c/0xc2
[<000f0000>] vfs_writev+0x2c/0x16e
[<001c0000>] bio_may_exceed_limits.isra.35+0x8/0x54
[<000b4f66>] filemap_read+0xac/0x30c
[<00010000>] TWOMAIN+0x7a/0x80
[<00010000>] TWOMAIN+0x7a/0x80
[<00010000>] TWOMAIN+0x7a/0x80
[<00001000>] kernel_pg_dir+0x0/0x1000
[<00008000>] nvram_proc_read+0x4c/0x31a
[<00008000>] nvram_proc_read+0x4c/0x31a
[<00001000>] kernel_pg_dir+0x0/0x1000
[<00007000>] atari_tt_hwclk+0x16c/0x2fc
[<00003000>] arch_ptrace+0xa6/0x24a
[<00001000>] kernel_pg_dir+0x0/0x1000
[<001b5716>] blkdev_read_iter+0x118/0x138
[<00010000>] TWOMAIN+0x7a/0x80
[<00008000>] nvram_proc_read+0x4c/0x31a
[<00010000>] TWOMAIN+0x7a/0x80
[<000f0a68>] vfs_read+0xcc/0x128
[<00008000>] nvram_proc_read+0x4c/0x31a
[<00010000>] TWOMAIN+0x7a/0x80
[<000f0dfc>] ksys_read+0x4a/0x86
[<00010000>] TWOMAIN+0x7a/0x80
[<00010000>] TWOMAIN+0x7a/0x80
[<0000fff6>] TWOMAIN+0x70/0x80
[<000f0e4a>] sys_read+0x12/0x18
[<00010000>] TWOMAIN+0x7a/0x80
[<0000269a>] syscall+0x8/0xc
[<00010000>] TWOMAIN+0x7a/0x80
[<00010000>] TWOMAIN+0x7a/0x80
[<0000c00c>] ATANTBL+0x114/0x800
Code: 202f 0010 2668 0004 2268 0008 2749 0004 <228b> 217c 0000 0100 0004 217c 0000 0122 0008 0068 0080 001a 42a8 0014 2200 e589
Disabling lock debugging due to kernel taint
note: vol_id[1198] exited with irqs disabled
note: vol_id[1198] exited with preempt_count 3
(killed ARAnyM after 26 of those).
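One possible reading of the Oops above, offered as an assumption rather than a diagnosis: the Code: line contains the immediates 0x100 and 0x122, which look like LIST_POISON1 and LIST_POISON2 from include/linux/poison.h, and faddr=00000104 is 0x100 plus the 4-byte offset of a prev pointer on a 32-bit machine. That would fit unlinking a list entry whose next pointer had already been poisoned by an earlier delete. The sketch below only does that arithmetic. All sketch_* names are hypothetical, and pointers are modelled as plain integers so that nothing is dereferenced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values seen as immediates in the Code: line; assumed to be the
 * list poison constants (no POISON_POINTER_DELTA on 32-bit). */
#define SKETCH_LIST_POISON1 0x00000100u
#define SKETCH_LIST_POISON2 0x00000122u

/* 32-bit list node with its pointers modelled as integers. */
struct sketch_list_head {
	uint32_t next;
	uint32_t prev;
};

/* Mimic what a list delete leaves behind in the removed entry. */
void sketch_poison(struct sketch_list_head *entry)
{
	entry->next = SKETCH_LIST_POISON1;
	entry->prev = SKETCH_LIST_POISON2;
}

/* Address that the unlink store "next->prev = prev" would target. */
uint32_t sketch_unlink_store_addr(const struct sketch_list_head *entry)
{
	return entry->next +
	       (uint32_t)offsetof(struct sketch_list_head, prev);
}
```

For an entry that has been through sketch_poison(), sketch_unlink_store_addr() yields 0x104, the same value as faddr in the trace. Note that the data word in writeback 3 is 00495dd0 rather than 0x122, so if this reading is right, only the next pointer was poisoned when the fault hit.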
Booted OK on the next boot, produced a 'table already free' panic after
three or four runs of the stack-fill stress test.
Cheers,
Michael
Cheers,
Michael
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds