Greetings,

We are hitting the following hung-task panic [1] with raid5 in kernel
4.14.99. It happens every couple of days. The raid5 in question contains
three devices and was created with the command:

mdadm --create /dev/md5 --force --raid-devices=3 --size=1522566M --chunk=64 --level=raid5 --bitmap=internal --name=5 --uuid=47952090192D4408BDABC9628E16FD06 --run --auto=md --metadata=1.2 --homehost=zadara_vc --verbose --verbose /dev/dm-13 /dev/dm-14 /dev/dm-15

The array is not undergoing any kind of rebuild or reshape.

A similar issue with kernel 4.14.37 was reported at
https://bugzilla.kernel.org/show_bug.cgi?id=199539.

We recently moved to kernel 4.14 (a long-term kernel) from kernel 3.18;
we never saw this issue with kernel 3.18.

Looking at the code, raid5_make_request appears to be stuck waiting for a
free stripe via raid5_make_request => raid5_get_active_stripe =>
wait_event_lock_irq(). Looking with gdb:

(gdb) l *raid5_make_request+0x1b7
0xa697 is in raid5_make_request (./include/linux/compiler.h:183).
178     })
179
180     static __always_inline
181     void __read_once_size(const volatile void *p, void *res, int size)
182     {
183             __READ_ONCE_SIZE;
184     }
185
186     #ifdef CONFIG_KASAN
187     /*

The READ_ONCE call appears to come from list_empty(), which is evaluated
as part of the wait_event_lock_irq() condition [2].

How can this be debugged further?

Thanks,
Alex.

[1]
[155653.946408] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[155653.947333] kworker/u4:94   D    0  6178      2 0x80000000
[155653.949159] Call Trace:
[155653.949576]  ? __schedule+0x290/0x8a0
[155653.950052]  ? blk_flush_plug_list+0xc1/0x250
[155653.950688]  schedule+0x2f/0x90
[155653.951173]  raid5_make_request+0x1b7/0xb10 [raid456]
[155653.951765]  ? wait_woken+0x80/0x80
[155653.952216]  ? wait_woken+0x80/0x80
[155653.952673]  md_handle_request+0x131/0x1a0 [md_mod]
[155653.953310]  md_make_request+0x65/0x170 [md_mod]
[155653.953963]  generic_make_request+0x123/0x320
[155653.954473]  ? submit_bio+0x6c/0x140
[155653.954981]  submit_bio+0x6c/0x140

[2]
	if (!sh) {
		set_bit(R5_INACTIVE_BLOCKED, &conf->cache_state);
		r5l_wake_reclaim(conf->log, 0);
		wait_event_lock_irq(
			conf->wait_for_stripe,
			!list_empty(conf->inactive_list + hash) &&
			(atomic_read(&conf->active_stripes)
			 < (conf->max_nr_stripes * 3 / 4) ||
			 !test_bit(R5_INACTIVE_BLOCKED, &conf->cache_state)),
			*(conf->hash_locks + hash));
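
For reference, the READ_ONCE in the gdb listing is consistent with the
list_empty() call in this condition; in 4.14, list_empty() is defined in
include/linux/list.h as

	static inline int list_empty(const struct list_head *head)
	{
		return READ_ONCE(head->next) == head;
	}

and since everything is inlined, gdb resolves the address inside
raid5_make_request to the READ_ONCE machinery in compiler.h.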
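
In case the hang mode is clearer this way: wait_event_lock_irq() follows
the classic condition-variable loop. Below is a minimal userspace
analogue (a sketch with made-up names such as get_stripe/put_stripe, not
kernel code). The waiter re-checks its condition under the lock after
every wakeup, so a task blocked there forever means either the condition
never becomes true (no stripe is ever returned to the inactive list) or
a wakeup is being lost.

	/* Build: gcc -pthread sketch.c -o sketch */
	#include <pthread.h>
	#include <stdio.h>

	static pthread_mutex_t hash_lock = PTHREAD_MUTEX_INITIALIZER;
	static pthread_cond_t wait_for_stripe = PTHREAD_COND_INITIALIZER;
	static int inactive_stripes;	/* analogue of the inactive list */

	/* analogue of raid5_get_active_stripe() blocking on the wait queue */
	static void *get_stripe(void *arg)
	{
		(void)arg;
		pthread_mutex_lock(&hash_lock);
		while (inactive_stripes == 0)	/* condition re-checked per wakeup */
			pthread_cond_wait(&wait_for_stripe, &hash_lock);
		inactive_stripes--;
		pthread_mutex_unlock(&hash_lock);
		printf("got a stripe\n");
		return NULL;
	}

	/* analogue of releasing a stripe back and waking the waiters */
	static void put_stripe(void)
	{
		pthread_mutex_lock(&hash_lock);
		inactive_stripes++;
		pthread_cond_broadcast(&wait_for_stripe);
		pthread_mutex_unlock(&hash_lock);
	}

	int main(void)
	{
		pthread_t t;

		pthread_create(&t, NULL, get_stripe, NULL);
		put_stripe();	/* comment this out -> the "hung task" above */
		pthread_join(t, NULL);
		return 0;
	}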