Re: BUG: Hung task timeouts in for-4.10/dio

Jens,

On 11/9/16 05:21, Jens Axboe wrote:
> On 11/08/2016 12:55 PM, Logan Gunthorpe wrote:
>> Hey,
>>
>> I've attached the output of dmesg from a working boot and the output of
>> mount.
>>
>> Pretty much all the file systems are ext4. We have some experimental
>> nvme devices in this system which I did try removing to eliminate that
>> possibility.
>>
>> Let me know if you need anything else.
> 
> You're using dm, that might be related. Mike, have you tried booting
> for-4.10/block and checking if dm works fine?

Using yesterday's tree, I experienced similar problems with
for-4.10/block without using dm (using ext4 on top of SSDs): random
tasks hung, starting from boot, with the machine eventually completely
freezing.

I did not dig into the problem much. I just looked at the task stack
traces (echo t > /proc/sysrq-trigger) and noticed that the hung tasks are
waiting for requests. For example:

[   55.356418] plymouthd       D ffffffff81671758     0   353      1 0x00000000
[   55.356419]  ffff8807fbf1ec00 0000000000000000 ffff8807fba6d500 ffff8807fba3b600
[   55.356420]  ffff88081fb97900 ffff8807f04079a8 ffffffff81671758 000000000000158f
[   55.356421]  0000000000000000 ffff8807f3373800 ffff8807fba3b600 ffff88081fb97900
[   55.356421] Call Trace:
[   55.356421]  [<ffffffff81671758>] ? __schedule+0x178/0x650
[   55.356422]  [<ffffffff81671c70>] schedule+0x40/0x90
[   55.356423]  [<ffffffff816749d1>] schedule_timeout+0x2b1/0x3e0
[   55.356424]  [<ffffffff8115419d>] ? mempool_alloc_slab+0x1d/0x30
[   55.356425]  [<ffffffff810e0971>] ? ktime_get+0x41/0xb0
[   55.356426]  [<ffffffff81671574>] io_schedule_timeout+0xa4/0x110
[   55.356427]  [<ffffffff8130ee2b>] get_request+0x3fb/0x7d0
[   55.356428]  [<ffffffff8120fd83>] ? __find_get_block+0xf3/0x180
[   55.356429]  [<ffffffff810be260>] ? wait_woken+0x90/0x90
[   55.356431]  [<ffffffff813117cb>] blk_queue_bio+0xfb/0x3c0
[   55.356432]  [<ffffffff8130fb90>] generic_make_request+0xd0/0x180
[   55.356433]  [<ffffffff8130fcac>] submit_bio+0x6c/0x130
[   55.356436]  [<ffffffff81270f08>] ext4_io_submit+0x38/0x50
[   55.356437]  [<ffffffff8126c241>] ext4_writepages+0x561/0xdb0
[   55.356439]  [<ffffffff811601e1>] do_writepages+0x21/0x30
[   55.356440]  [<ffffffff811520aa>] __filemap_fdatawrite_range+0xaa/0xf0
[   55.356440]  [<ffffffff811524df>] ? __generic_file_write_iter+0x14f/0x1d0
[   55.356441]  [<ffffffff8115213c>] filemap_flush+0x1c/0x20
[   55.356442]  [<ffffffff812698bc>] ext4_alloc_da_blocks+0x2c/0x80
[   55.356443]  [<ffffffff81262268>] ext4_release_file+0x78/0xc0
[   55.356446]  [<ffffffff811db2a9>] __fput+0xb9/0x200
[   55.356447]  [<ffffffff811db42e>] ____fput+0xe/0x10
[   55.356449]  [<ffffffff81097bf5>] task_work_run+0x85/0xb0
[   55.356450]  [<ffffffff810016a7>] exit_to_usermode_loop+0x97/0xa0
[   55.356451]  [<ffffffff810019e3>] syscall_return_slowpath+0x53/0x60
[   55.356452]  [<ffffffff8167605f>] entry_SYSCALL_64_fastpath+0x92/0x94
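
To check every task in such a dump rather than reading the traces by eye,
the following is a minimal sketch that scans saved sysrq-t output for tasks
in D state with get_request as a reliable frame (one not marked "?"). The
function name blocked_in and both regular expressions are illustrative
assumptions, not part of any kernel tool. They are modelled only on the
trace format shown above and assume a task name without spaces.

```python
import re

# Task header as printed by sysrq-t in the trace above, e.g.
# "[   55.356418] plymouthd       D ffffffff81671758     0   353      1 0x00000000"
# Assumption: the task name contains no whitespace.
HEADER = re.compile(
    r"^\[\s*\d+\.\d+\]\s+(\S+)\s+([A-Z])\s+[0-9a-f]{16}\s+\d+\s+(\d+)\s"
)

# Call-trace frame, e.g.
# "[   55.356427]  [<ffffffff8130ee2b>] get_request+0x3fb/0x7d0"
# Group 1 captures the "? " marker of an unreliable frame, if present.
FRAME = re.compile(r"\[<[0-9a-f]+>\]\s+(\?\s+)?([\w.]+)\+0x")


def blocked_in(log_text, symbol="get_request"):
    """Return (name, pid) of each D-state task with `symbol` in its trace."""
    hits = []
    current = None  # (name, pid) of the D-state task being scanned, if any
    for line in log_text.splitlines():
        header = HEADER.match(line)
        if header:
            name, state, pid = header.groups()
            # Only track tasks in uninterruptible sleep.
            current = (name, int(pid)) if state == "D" else None
            continue
        frame = FRAME.search(line)
        if frame is None or current is None:
            continue
        # Skip frames marked "?" and frames for other symbols.
        if frame.group(1) is None and frame.group(2) == symbol:
            if current not in hits:
                hits.append(current)
    return hits
```

Calling blocked_in() on the text of a saved dump (for example the output
of dmesg written to a file) returns the list of matching tasks. Passing a
different symbol checks for another wait point.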

I needed the ZBC code so I detached the head back to 5f2808f and
everything then worked fine. I will try to bisect.

Best regards.

-- 
Damien Le Moal, Ph.D.
Sr. Manager, System Software Research Group,
Western Digital Corporation
Damien.LeMoal@xxxxxxx
(+81) 0466-98-3593 (ext. 513593)
1 kirihara-cho, Fujisawa,
Kanagawa, 252-0888 Japan
www.wdc.com, www.hgst.com