On 11/25/24 2:23 PM, Guenter Roeck wrote:
On 11/25/24 10:12, Sebastian Andrzej Siewior wrote:
On 2024-11-25 09:59:09 [-0800], Guenter Roeck wrote:
On 11/25/24 09:43, Sebastian Andrzej Siewior wrote:
On 2024-11-25 09:01:33 [-0800], Guenter Roeck wrote:
Unfortunately it doesn't make a difference.
Stunning. It looks like the exact same error message.
I think it uses
#define spin_lock_irqsave(lock, flags)				\
do {								\
	raw_spin_lock_irqsave(spinlock_check(lock), flags);	\
} while (0)
from include/linux/spinlock.h, meaning your patch doesn't really
make a difference.
The difference comes from DEFINE_SPINLOCK vs DEFINE_RAW_SPINLOCK. There
is the .lock_type init which goes from LD_WAIT_CONFIG to LD_WAIT_SPIN.
And this is all that matters.
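Roughly, with CONFIG_DEBUG_LOCK_ALLOC enabled, the two initializers differ
only in the lockdep wait-type annotation. Paraphrasing
include/linux/spinlock_types_raw.h (simplified sketch, not the verbatim
header):

#define RAW_SPIN_DEP_MAP_INIT(lockname)			\
	.dep_map = {					\
		.name = #lockname,			\
		.wait_type_inner = LD_WAIT_SPIN,	\
	}

#define SPIN_DEP_MAP_INIT(lockname)			\
	.dep_map = {					\
		.name = #lockname,			\
		.wait_type_inner = LD_WAIT_CONFIG,	\
	}

LD_WAIT_CONFIG marks locks that become sleeping locks with PREEMPT_RT
(spinlock_t), LD_WAIT_SPIN marks locks that always spin (raw_spinlock_t),
and that annotation is what the invalid-wait-context check compares against
the surrounding context.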
Ah, now I get it. Thanks for the explanation. And it turns out my log
was wrong.
I must have taken it from the old image. Sorry for that.
That specific backtrace isn't seen anymore. But there is another one.
[ 1.779653] =============================
[ 1.779860] [ BUG: Invalid wait context ]
[ 1.780139] 6.12.0+ #1 Not tainted
[ 1.780394] -----------------------------
[ 1.780600] swapper/0/1 is trying to lock:
[ 1.780824] 0000000001b68888 (cpu_map_lock){....}-{3:3}, at: map_to_cpu+0x10/0x80
[ 1.781393] other info that might help us debug this:
[ 1.781624] context-{5:5}
[ 1.781838] 3 locks held by swapper/0/1:
[ 1.782055] #0: fffff800042b90f8 (&dev->mutex){....}-{4:4}, at: __driver_attach+0x80/0x160
[ 1.782345] #1: fffff800040f2c18 (&desc->request_mutex){+.+.}-{4:4}, at: __setup_irq+0xa0/0x6e0
[ 1.782632] #2: fffff800040f2ab0 (&irq_desc_lock_class){....}-{2:2}, at: __setup_irq+0xc8/0x6e0
[ 1.782912] stack backtrace:
[ 1.783172] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.12.0+ #1
[ 1.783498] Call Trace:
[ 1.783734] [<00000000004e31d0>] __lock_acquire+0xa50/0x3160
[ 1.783971] [<00000000004e63e8>] lock_acquire+0xe8/0x340
[ 1.784191] [<00000000010f0dbc>] _raw_spin_lock_irqsave+0x3c/0x80
[ 1.784417] [<000000000043ed90>] map_to_cpu+0x10/0x80
[ 1.784633] [<000000000042b2b8>] sun4u_irq_enable+0x18/0x80
[ 1.784854] [<00000000004fb6b4>] irq_enable+0x34/0xc0
[ 1.785069] [<00000000004fb7b8>] __irq_startup+0x78/0xe0
[ 1.785287] [<00000000004fb8f0>] irq_startup+0xd0/0x1a0
[ 1.785503] [<00000000004f85b4>] __setup_irq+0x5f4/0x6e0
[ 1.785726] [<00000000004f8754>] request_threaded_irq+0xb4/0x1a0
[ 1.785950] [<0000000000439930>] power_probe+0x70/0xe0
[ 1.786165] [<0000000000c13a68>] platform_probe+0x28/0x80
[ 1.786382] [<0000000000c11178>] really_probe+0xb8/0x340
[ 1.786599] [<0000000000c115a4>] driver_probe_device+0x24/0xe0
[ 1.786820] [<0000000000c117cc>] __driver_attach+0x8c/0x160
[ 1.787039] [<0000000000c0ef74>] bus_for_each_dev+0x54/0xc0
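The conversion itself is the usual spinlock_t -> raw_spinlock_t pattern; a
minimal sketch, assuming cpu_map_lock is the static DEFINE_SPINLOCK() next
to map_to_cpu() (arch/sparc/kernel/cpumap.c, if I read the backtrace right),
with the actual lookup under the lock replaced by a placeholder:

static DEFINE_RAW_SPINLOCK(cpu_map_lock);	/* was: DEFINE_SPINLOCK(cpu_map_lock) */

int map_to_cpu(unsigned int index)
{
	unsigned long flags;
	int cpu;

	raw_spin_lock_irqsave(&cpu_map_lock, flags);	/* was: spin_lock_irqsave() */
	/* placeholder for the real distribution-map lookup done under the lock */
	cpu = index % num_online_cpus();
	raw_spin_unlock_irqrestore(&cpu_map_lock, flags);

	return cpu;
}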
After replacing cpu_map_lock with a raw spinlock, I get:
[ 2.015140] =============================
[ 2.015247] [ BUG: Invalid wait context ]
[ 2.015419] 6.12.0+ #1 Not tainted
[ 2.015564] -----------------------------
[ 2.015668] swapper/0/1 is trying to lock:
[ 2.015791] fffff80004870610 (&mm->context.lock){....}-{3:3}, at: __schedule+0x410/0x5b0
[ 2.016306] other info that might help us debug this:
[ 2.016451] context-{5:5}
[ 2.016539] 3 locks held by swapper/0/1:
[ 2.016652] #0: 0000000001d11f38 (key_types_sem){++++}-{4:4}, at: __key_create_or_update+0x5c/0x4c0
[ 2.016934] #1: 0000000001d1b850 (asymmetric_key_parsers_sem){++++}-{4:4}, at: asymmetric_key_preparse+0x18/0xa0
[ 2.017197] #2: fffff8001f811a98 (&rq->__lock){-.-.}-{2:2}, at: __schedule+0xdc/0x5b0
[ 2.017412] stack backtrace:
[ 2.017551] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.12.0+ #1
[ 2.017800] Call Trace:
[ 2.017910] [<00000000004e31d0>] __lock_acquire+0xa50/0x3160
[ 2.018062] [<00000000004e63e8>] lock_acquire+0xe8/0x340
[ 2.018192] [<00000000010f0dbc>] _raw_spin_lock_irqsave+0x3c/0x80
[ 2.018341] [<00000000010e5050>] __schedule+0x410/0x5b0
[ 2.018469] [<00000000010e5ae4>] schedule+0x44/0x1c0
[ 2.018591] [<00000000010f0684>] schedule_timeout+0xa4/0x100
[ 2.018730] [<00000000010e668c>] __wait_for_common+0xac/0x1a0
[ 2.018869] [<00000000010e6878>] wait_for_completion_state+0x18/0x40
[ 2.019022] [<000000000048ad18>] call_usermodehelper_exec+0x138/0x1c0
[ 2.019177] [<000000000052eb40>] __request_module+0x160/0x2e0
[ 2.019316] [<00000000009ba6dc>] crypto_alg_mod_lookup+0x17c/0x280
[ 2.019466] [<00000000009ba990>] crypto_alloc_tfm_node+0x30/0x100
[ 2.019614] [<00000000009dcc5c>] public_key_verify_signature+0xbc/0x260
[ 2.019772] [<00000000009ded8c>] x509_check_for_self_signed+0xac/0x280
[ 2.019928] [<00000000009dddec>] x509_cert_parse+0x14c/0x220
[ 2.020065] [<00000000009dea08>] x509_key_preparse+0x8/0x1e0
The problem here is
typedef struct {
	spinlock_t		lock;		<--
	unsigned long		sparc64_ctx_val;
	unsigned long		hugetlb_pte_count;
	unsigned long		thp_pte_count;
	struct tsb_config	tsb_block[MM_NUM_TSBS];
	struct hv_tsb_descr	tsb_descr[MM_NUM_TSBS];
	void			*vdso;
	bool			adi;
	tag_storage_desc_t	*tag_store;
	spinlock_t		tag_lock;
} mm_context_t;
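The conversion sketch here is the same pattern, just applied to the struct
member; the users are spread over arch/sparc/mm/ and arch/sparc/kernel/, so
this only shows the shape of the change (init_new_context() as the init
site is my guess):

typedef struct {
	raw_spinlock_t lock;			/* was: spinlock_t lock */
	/* ... remaining members unchanged ... */
} mm_context_t;

	/* at the init site, e.g. init_new_context(): */
	raw_spin_lock_init(&mm->context.lock);	/* was: spin_lock_init() */

	/* and at each user: */
	raw_spin_lock_irqsave(&mm->context.lock, flags);
	/* ... */
	raw_spin_unlock_irqrestore(&mm->context.lock, flags);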
Replacing that with a raw spinlock just triggers the next one.
[ 2.035384] =============================
[ 2.035490] [ BUG: Invalid wait context ]
[ 2.035660] 6.12.0+ #3 Not tainted
[ 2.035802] -----------------------------
[ 2.035906] kworker/u4:3/48 is trying to lock:
[ 2.036036] 0000000001b6a790 (ctx_alloc_lock){....}-{3:3}, at: get_new_mmu_context+0x14/0x280
[ 2.036558] other info that might help us debug this:
[ 2.036697] context-{5:5}
[ 2.036784] 4 locks held by kworker/u4:3/48:
[ 2.036906] #0: fffff80004838a70 (&sig->cred_guard_mutex){+.+.}-{4:4}, at: bprm_execve+0xc/0x8e0
[ 2.037169] #1: fffff80004838b08 (&sig->exec_update_lock){+.+.}-{4:4}, at: begin_new_exec+0x344/0xbe0
[ 2.037411] #2: fffff800047fc940 (&p->alloc_lock){+.+.}-{3:3}, at: begin_new_exec+0x3a0/0xbe0
[ 2.037639] #3: fffff80004848610 (&mm->context.lock){....}-{2:2}, at: begin_new_exec+0x41c/0xbe0
Fixing that finally gives me a clean run. Nevertheless, that makes me
wonder:
Should I just disable CONFIG_PROVE_RAW_LOCK_NESTING for sparc runtime tests?
If no one is ever going to enable PREEMPT_RT on SPARC, I suppose you could
disable CONFIG_PROVE_RAW_LOCK_NESTING to avoid the trouble.
Cheers,
Longman