On 2018/06/19 20:10, Tetsuo Handa wrote:
> On 2018/06/16 4:40, Tetsuo Handa wrote:
>> Hmm, there might be other locations calling percpu_rwsem_release() ?
>
> There are other locations calling percpu_rwsem_release(), but quite few.
>
> include/linux/fs.h:1494:#define __sb_writers_release(sb, lev)	\
> include/linux/fs.h-1495-	percpu_rwsem_release(&(sb)->s_writers.rw_sem[(lev)-1], 1, _THIS_IP_)
>
> fs/btrfs/transaction.c:1821:	__sb_writers_release(fs_info->sb, SB_FREEZE_FS);
> fs/aio.c:1566:	__sb_writers_release(file_inode(file)->i_sb, SB_FREEZE_WRITE);
> fs/xfs/xfs_aops.c:211:	__sb_writers_release(ioend->io_inode->i_sb, SB_FREEZE_FS);
>
>
>
> I'd like to check what atomic_long_read(&sem->rw_sem.count) says
> when hung task is reported.
>

syzbot reproduced this problem with the patch applied.

  percpu_rw_semaphore(00000000082ac9da)
    ->rw_sem.count=0xfffffffe00000001
    ->rss.gp_state=2
    ->rss.gp_count=1
    ->rss.cb_state=0
    ->rss.gp_type=1
    ->readers_block=1
    ->read_count=0
    ->list_empty(rw_sem.wait_list)=0
    ->writer.task= (null)

The output says that percpu_down_read() was blocked because somebody has
called percpu_down_write().

  DEFINE_STATIC_PERCPU_RWSEM(sem);

  percpu_down_write(&sem);
  percpu_down_read(&sem);
  percpu_up_read(&sem);
  percpu_up_write(&sem);

The next step is to find who is calling percpu_down_write(). How do we want
to do this? We don't want to annoy normal linux-next.git testers. Below one?

---
 include/linux/percpu-rwsem.h | 4 ++++
 lib/Kconfig.debug            | 6 ++++++
 2 files changed, 10 insertions(+)

diff --git a/include/linux/percpu-rwsem.h b/include/linux/percpu-rwsem.h
index 79b99d6..26e87c3 100644
--- a/include/linux/percpu-rwsem.h
+++ b/include/linux/percpu-rwsem.h
@@ -130,7 +130,9 @@ extern int __percpu_init_rwsem(struct percpu_rw_semaphore *,
 static inline void percpu_rwsem_release(struct percpu_rw_semaphore *sem,
 					bool read, unsigned long ip)
 {
+#ifndef CONFIG_DEBUG_AID_FOR_SYZBOT
 	lock_release(&sem->rw_sem.dep_map, 1, ip);
+#endif
 #ifdef CONFIG_RWSEM_SPIN_ON_OWNER
 	if (!read)
 		sem->rw_sem.owner = RWSEM_OWNER_UNKNOWN;
@@ -140,7 +142,9 @@ static inline void percpu_rwsem_release(struct percpu_rw_semaphore *sem,
 static inline void percpu_rwsem_acquire(struct percpu_rw_semaphore *sem,
 					bool read, unsigned long ip)
 {
+#ifndef CONFIG_DEBUG_AID_FOR_SYZBOT
 	lock_acquire(&sem->rw_sem.dep_map, 0, 1, read, 1, NULL, ip);
+#endif
 #ifdef CONFIG_RWSEM_SPIN_ON_OWNER
 	if (!read)
 		sem->rw_sem.owner = current;
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index c731ff9..f0d02e8 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1181,6 +1181,12 @@ config DEBUG_LOCK_ALLOC
 	 spin_lock_init()/mutex_init()/etc., or whether there is any lock
 	 held during task exit.
 
+config DEBUG_AID_FOR_SYZBOT
+	bool "Additional debug options for syzbot"
+	default n
+	help
+	  This option is intended for testing by syzbot.
+
 config LOCKDEP
 	bool
 	depends on DEBUG_KERNEL && LOCK_DEBUGGING_SUPPORT
--

Hmm, given that neither xfs nor btrfs is used, is it aio code?
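
For reference, one way to read the ->rw_sem.count value reported above is to
split it into its "active" and "waiting" parts. The following is only a
userspace sketch, not kernel code. It assumes the 64-bit count layout that
include/linux/rwsem.h used around this kernel version (RWSEM_ACTIVE_MASK =
0xffffffff, RWSEM_WAITING_BIAS = -0x100000000), and the helper names
rwsem_active() and rwsem_waiting_units() are made up for illustration.

```c
#include <stdint.h>

/* Assumed 64-bit rwsem count layout (v4.17-era include/linux/rwsem.h). */
#define RWSEM_ACTIVE_MASK	INT64_C(0xffffffff)
#define RWSEM_ACTIVE_BIAS	INT64_C(1)
#define RWSEM_WAITING_BIAS	(-RWSEM_ACTIVE_MASK - 1)

/* Hypothetical helper: number of active lockers (low 32 bits of count). */
static int64_t rwsem_active(uint64_t raw)
{
	return (int64_t)raw & RWSEM_ACTIVE_MASK;
}

/*
 * Hypothetical helper: how many RWSEM_WAITING_BIAS units are folded into
 * count. Under the assumed layout, a writer contributes one unit (its
 * write bias is WAITING_BIAS + ACTIVE_BIAS) and a non-empty wait list
 * contributes one more.
 */
static int64_t rwsem_waiting_units(uint64_t raw)
{
	int64_t count = (int64_t)raw;

	return (count - rwsem_active(raw)) / RWSEM_WAITING_BIAS;
}
```

With count=0xfffffffe00000001 this gives one active locker and two
waiting-bias units. If the layout assumption holds, that would mean one
writer holding rw_sem with waiters queued behind it, which matches
list_empty(rw_sem.wait_list)=0 in the output.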