Re: hung tasks on shutdown in linux-next-202409{20,23,24,25}

On Wednesday, 25.09.2024 at 14:09 +0200, Greg Kroah-Hartman wrote:
>
>
> Thanks for the report, I _just_ reverted all of these in my branch due
> to another report just like this.  I'll be glad to take them back after
> -rc1 if these issues can be worked out.
>
> So the next linux-next release should be good, OR if you could pull my
> driver-core.git driver-core-next branch to verify the revert worked for
> you, that would be great.
>
> thanks,
>
> greg k-h

The situation is a little more complicated: your branch (driver-core-next) works
fine on its own (I just retested 10 reboot cycles with driver-core-next, commit
4f2c346e6216 as HEAD). The problems only occur when your branch is merged into
linux-next. I suspected that the bug is locking-related and recompiled
next-20240925 with CONFIG_LOCKDEP=y.

These are the lock debugging options I used:

#
# Lock Debugging (spinlocks, mutexes, etc...)
#
CONFIG_LOCK_DEBUGGING_SUPPORT=y
CONFIG_PROVE_LOCKING=y
# CONFIG_PROVE_RAW_LOCK_NESTING is not set
# CONFIG_LOCK_STAT is not set
CONFIG_DEBUG_RT_MUTEXES=y
CONFIG_DEBUG_SPINLOCK=y
CONFIG_DEBUG_MUTEXES=y
CONFIG_DEBUG_WW_MUTEX_SLOWPATH=y
CONFIG_DEBUG_RWSEMS=y
CONFIG_DEBUG_LOCK_ALLOC=y
CONFIG_LOCKDEP=y
CONFIG_LOCKDEP_BITS=15
CONFIG_LOCKDEP_CHAINS_BITS=16
CONFIG_LOCKDEP_STACK_TRACE_BITS=19
CONFIG_LOCKDEP_STACK_TRACE_HASH_BITS=14
CONFIG_LOCKDEP_CIRCULAR_QUEUE_BITS=12
# CONFIG_DEBUG_LOCKDEP is not set
# CONFIG_DEBUG_ATOMIC_SLEEP is not set
# CONFIG_DEBUG_LOCKING_API_SELFTESTS is not set
# CONFIG_LOCK_TORTURE_TEST is not set
# CONFIG_WW_MUTEX_SELFTEST is not set
# CONFIG_SCF_TORTURE_TEST is not set
# CONFIG_CSD_LOCK_WAIT_DEBUG is not set
# end of Lock Debugging (spinlocks, mutexes, etc...)
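
For anyone who wants to confirm that a test kernel was really built with the
lock debugging options listed above, the check can be sketched roughly as
follows. This is only an illustration: the helper names (parse_kconfig,
missing_options) and the choice of which four options count as required are
assumptions, not something taken from this thread.

```python
import re

# Options taken from the list above. Treating exactly these four as
# "required" is an assumption made for this sketch.
REQUIRED = (
    "CONFIG_PROVE_LOCKING",
    "CONFIG_LOCKDEP",
    "CONFIG_DEBUG_LOCK_ALLOC",
    "CONFIG_DEBUG_MUTEXES",
)


def parse_kconfig(text):
    """Map each CONFIG_* symbol found in .config text to its value.

    Lines of the form '# CONFIG_FOO is not set' are recorded as 'n'.
    """
    options = {}
    for line in text.splitlines():
        line = line.strip()
        match = re.match(r"^(CONFIG_\w+)=(.*)$", line)
        if match:
            options[match.group(1)] = match.group(2)
            continue
        match = re.match(r"^# (CONFIG_\w+) is not set$", line)
        if match:
            options[match.group(1)] = "n"
    return options


def missing_options(text, required=REQUIRED):
    """Return the required options that are not set to 'y' in the text."""
    options = parse_kconfig(text)
    return [name for name in required if options.get(name) != "y"]
```

Calling missing_options() on the contents of the build's .config should give
an empty list when all four options are enabled.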

With these .config options the bug becomes harder to trigger, but after 11
reboots I finally got a screen flooded with messages of the following type:

2 locks held by kworker/u64:251/3047
#0: ffff9fdf80d39548 ((wq_completion)async){+.+.}-{0:0}, at process_one_work+0x4a4/0x580
#1: ffffb54b11307e58 ((work_completion)(&entry->work)){+.+.}-{0:0}, at process_one_work+0x1c7/0x580


Bert Karwatzki





