Preemption (was: Re: spinlock recursion when running q800 emulation in qemu)

Hi Geert,

On 13.03.2024 at 13:16, Michael Schmitz wrote:
Hi Geert,

On 12/03/24 20:59, Geert Uytterhoeven wrote:
On Tue, Mar 12, 2024 at 1:51 AM Michael Schmitz <schmitzmic@xxxxxxxxx> wrote:
On 11/03/24 21:35, Finn Thain wrote:
I think spin_lock() reduces to preempt_disable() on UP.
In include/linux/spinlock_api_up.h it says,

/*
   * In the UP-nondebug case there's no real locking going on, so the
   * only thing we have to do is to keep the preempt counts and irq
   * flags straight, to suppress compiler warnings of unused lock
   * variables, and to add the proper checker annotations:
   */
That's only true in the debug case - there, preempt_disable() is used
inside the spin loop. But m68k is one of the last CONFIG_PREEMPT_NONE
archs AFAIR, and preempt_disable() reduces to barrier() on those.
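
To make the distinction above concrete, here is a minimal userspace sketch, not kernel code. It assumes simplified stand-ins for the kernel macros: every name ending in _model, and nested_lock_peak(), are hypothetical and exist only for this illustration. The "counted" flag models a kernel built with a preempt counter; with it clear, preempt_disable() is modelled as a bare compiler barrier, as described for the non-preemptible configuration. barrier() uses GCC/Clang inline asm syntax.

```c
#include <assert.h>

/* Compiler-only barrier (GCC/Clang syntax), like the kernel's barrier(). */
#define barrier() __asm__ __volatile__("" ::: "memory")

/* Hypothetical stand-in for the per-task preempt count. */
static int preempt_count_model;

/* counted != 0 models a build that maintains a preempt count;
 * counted == 0 models the case where only a barrier remains. */
static void preempt_disable_model(int counted)
{
    if (counted)
        preempt_count_model++;
    barrier();
}

static void preempt_enable_model(int counted)
{
    barrier();
    if (counted)
        preempt_count_model--;
}

/* On UP without lock debugging the lock variable itself is never
 * touched; it is only referenced to avoid unused-variable warnings. */
typedef struct { int unused; } up_spinlock_model_t;

static void up_spin_lock_model(up_spinlock_model_t *lock, int counted)
{
    preempt_disable_model(counted);
    (void)lock;
}

static void up_spin_unlock_model(up_spinlock_model_t *lock, int counted)
{
    preempt_enable_model(counted);
    (void)lock;
}

/* Take two nested locks and report the highest count observed. */
int nested_lock_peak(int counted)
{
    up_spinlock_model_t a = {0}, b = {0};
    int peak;

    preempt_count_model = 0;
    up_spin_lock_model(&a, counted);
    up_spin_lock_model(&b, counted);
    peak = preempt_count_model;
    up_spin_unlock_model(&b, counted);
    up_spin_unlock_model(&a, counted);

    /* Balanced lock/unlock pairs must leave the count at zero. */
    assert(preempt_count_model == 0);
    return peak;
}
```

In this model, nested_lock_peak(1) returns 2 and nested_lock_peak(0) returns 0: without a preempt count, taking a UP spinlock leaves no trace beyond the compiler barrier.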
M68k does have experimental preempt support. I have been running
that for the last 5 months. Works fine most of the time, except for
the one BUG[1] that happens every 10 boots or so.

[1]
https://lore.kernel.org/all/CAMuHMdUQ72KOPw5vxNfhjoTR-SsaELeKneBmyQPYEWa_o5OZZA@xxxxxxxxxxxxxx


NB: the Kconfig hunk no longer applies, it now should look like this:

diff --git a/arch/m68k/Kconfig b/arch/m68k/Kconfig
index 4b3e93cac723..0c4b5180df1d 100644
--- a/arch/m68k/Kconfig
+++ b/arch/m68k/Kconfig
@@ -10,7 +10,6 @@ config M68K
        select ARCH_HAS_SYNC_DMA_FOR_DEVICE if M68K_NONCOHERENT_DMA
        select ARCH_HAVE_NMI_SAFE_CMPXCHG if RMW_INSNS
        select ARCH_MIGHT_HAVE_PC_PARPORT if ISA
-       select ARCH_NO_PREEMPT if !COLDFIRE
        select ARCH_USE_MEMTEST if MMU_MOTOROLA
        select ARCH_WANT_IPC_PARSE_VERSION
        select BINFMT_FLAT_ARGVP_ENVP_ON_STACK

(HAS_DMA -> M68K_NONCOHERENT_DMA)

Runs fine on ARAnyM with a bit of VM stress testing using
PREEMPT_VOLUNTARY. After a bit of looping (init=/sbin/reboot), fails
spectacularly with not finding any root filesystem to mount when using
full preempt.

Trying to boot a full system, I see the 'table already freed' panic on
the first go.

Running a stress-ng stack-fill stressor triggered the panic, too. I'll
see how reliable that is.

That's actually pretty reliable - running stress-ng --stack 2 --stack-fill two or three times gets me a panic (on a fast Intel system).

Running full preemption on the PowerBook instance, the older one of my disk images (sysvinit based) boots OK; the newer one (systemd IIRC) cheerfully throws an endless series of Oopses of this kind:

Unable to handle kernel NULL pointer dereference at virtual address b90995fb
Oops: 00000000
Modules linked in:
PC: [<000de1d6>] del_page_from_free_list+0x16/0x46
SR: 2700  SP: 3e9c0737  a2: 00f34610
d0: 00000000    d1: 0054491c    d2: 0000000b    d3: 00000000
d4: 0000000f    d5: 00000000    a0: 0054491c    a1: 00495dd0
Process vol_id (pid: 1198, task=54e40596)
Frame format=7 eff addr=00000104 ssw=0405 faddr=00000104
wb 1 stat/addr/data: 0000 00000000 00000000
wb 2 stat/addr/data: 0000 00000000 00000000
wb 3 stat/addr/data: 0085 00000104 00495dd0
push data: 00000000 00000000 00000000 00000000
Stack from 0482fbc4:
        0076343c 000df4ce 0054491c 00495d78 00000000 00000000 00000801 00000000
        00000000 00002000 0076343c 00495d78 000df8de 0482e000 0482fd06 0000001e
        0054491c 00544920 00000000 00495f84 00002700 00000058 00000058 00495d78
        000dfb70 00495d78 00000000 00000000 00000801 0076343c 00763464 00152c40
        00000000 00112cc0 00000000 00000000 000b493c 0482fd06 000df974 01023764
        0032d0e8 00dd7400 00f1fe58 00000049 64560000 00010000 0000f000 00000a00
Call Trace: [<000df4ce>] __rmqueue_pcplist+0x198/0x316
 [<00002000>] _start+0x0/0x8
 [<000df8de>] zone_watermark_fast.isra.112+0x0/0x96
 [<00002700>] resume_userspace+0xa/0x16
 [<000dfb70>] get_page_from_freelist+0x1fc/0x632
 [<00152c40>] ext4_ext_convert_to_initialized+0x50c/0x534
 [<00112cc0>] simple_transaction_get+0x52/0x98
 [<000b493c>] filemap_add_folio+0x0/0x9e
 [<000df974>] get_page_from_freelist+0x0/0x632
 [<0032d0e8>] xa_load+0x0/0x76
 [<00010000>] TWOMAIN+0x7a/0x80
 [<0000f000>] NEVEN+0xd8/0xec
 [<000df7d2>] prepare_alloc_pages.isra.110+0x70/0x84
 [<000e0f4e>] __alloc_pages+0xa8/0x6b2
 [<00152c40>] ext4_ext_convert_to_initialized+0x50c/0x534
 [<00112cc0>] simple_transaction_get+0x52/0x98
 [<000b493c>] filemap_add_folio+0x0/0x9e
 [<0032d0e8>] xa_load+0x0/0x76
 [<00004ec2>] buserr_c+0x10c/0x3e6
 [<00001000>] kernel_pg_dir+0x0/0x1000
 [<00010000>] TWOMAIN+0x7a/0x80
 [<0032cf2a>] xas_load+0x1e/0x60
 [<0032cf0c>] xas_load+0x0/0x60
 [<0032d150>] xa_load+0x68/0x76
 [<000e1a46>] __folio_alloc+0x1c/0x2a
 [<00152cc0>] ext4_ext_index_trans_blocks+0x0/0x3a
 [<000bace0>] page_cache_ra_unbounded+0xb0/0x140
 [<00112cc0>] simple_transaction_get+0x52/0x98
 [<00010fff>] X_OPERR+0x3/0x40
 [<000b4c24>] filemap_get_pages+0x24a/0x4e0
 [<00008000>] nvram_proc_read+0x4c/0x31a
 [<001000f0>] fiemap_prep+0x6c/0xc2
 [<000f0000>] vfs_writev+0x2c/0x16e
 [<001c0000>] bio_may_exceed_limits.isra.35+0x8/0x54
 [<000b4f66>] filemap_read+0xac/0x30c
 [<00010000>] TWOMAIN+0x7a/0x80
 [<00010000>] TWOMAIN+0x7a/0x80
 [<00010000>] TWOMAIN+0x7a/0x80
 [<00001000>] kernel_pg_dir+0x0/0x1000
 [<00008000>] nvram_proc_read+0x4c/0x31a
 [<00008000>] nvram_proc_read+0x4c/0x31a
 [<00001000>] kernel_pg_dir+0x0/0x1000
 [<00007000>] atari_tt_hwclk+0x16c/0x2fc
 [<00003000>] arch_ptrace+0xa6/0x24a
 [<00001000>] kernel_pg_dir+0x0/0x1000
 [<001b5716>] blkdev_read_iter+0x118/0x138
 [<00010000>] TWOMAIN+0x7a/0x80
 [<00008000>] nvram_proc_read+0x4c/0x31a
 [<00010000>] TWOMAIN+0x7a/0x80
 [<000f0a68>] vfs_read+0xcc/0x128
 [<00008000>] nvram_proc_read+0x4c/0x31a
 [<00010000>] TWOMAIN+0x7a/0x80
 [<000f0dfc>] ksys_read+0x4a/0x86
 [<00010000>] TWOMAIN+0x7a/0x80
 [<00010000>] TWOMAIN+0x7a/0x80
 [<0000fff6>] TWOMAIN+0x70/0x80
 [<000f0e4a>] sys_read+0x12/0x18
 [<00010000>] TWOMAIN+0x7a/0x80
 [<0000269a>] syscall+0x8/0xc
 [<00010000>] TWOMAIN+0x7a/0x80
 [<00010000>] TWOMAIN+0x7a/0x80
 [<0000c00c>] ATANTBL+0x114/0x800

Code: 202f 0010 2668 0004 2268 0008 2749 0004 <228b> 217c 0000 0100 0004 217c 0000 0122 0008 0068 0080 001a 42a8 0014 2200 e589
Disabling lock debugging due to kernel taint
note: vol_id[1198] exited with irqs disabled
note: vol_id[1198] exited with preempt_count 3

(killed ARAnyM after 26 of those).

Booted OK on the next boot, produced a 'table already free' panic after three or four runs of the stack-fill stress test.

Cheers,

	Michael



Cheers,

    Michael



Gr{oetje,eeting}s,

                         Geert


--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 --
geert@xxxxxxxxxxxxxx

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                 -- Linus Torvalds



