MMC: bonnie++ runs with errors after switching mmc to blk-mq

Hi Ulf, Adrian,

Returning to our conversation about MMC blk-mq issues
https://www.mail-archive.com/linux-snps-arc@xxxxxxxxxxxxxxxxxxx/msg03330.html
I played a bit with IO schedulers, bonnie++ and MMC on a newer kernel (v4.19).


I can still reproduce hung task errors when I run the bonnie++ benchmark on the arc/hsdk board
(ARC HS38 CPU, Synopsys DW MMC controller).

The error message looks like this:
----------------------------->8----------------------------
# mount /dev/mmcblk0p4 /mnt/mmcblk0p4
EXT4-fs (mmcblk0p4): mounted filesystem with ordered data mode. Opts: (null)
# bonnie++ -u root -r 256 -s 512 -x 1  -d /mnt/mmcblk0p4
Using uid:0, gid:0.
Writing with putc()...done
Writing intelligently...done
Rewriting...INFO: task bonnie++:187 blocked for more than 10 seconds.
      Not tainted 4.19.0 #2
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
bonnie++        D    0   187    130 0x00000000

Stack Trace:
  __switch_to+0x0/0xac
  __schedule+0x1b4/0x738
  io_schedule+0x5c/0xc0
  bit_wait_io+0xc/0x54
  out_of_line_wait_on_bit+0x76/0xbc
  do_get_write_access+0x1a4/0x46c
  jbd2_journal_get_write_access+0x32/0x74
  __ext4_journal_get_write_access+0x40/0x88
  ext4_mark_inode_dirty+0x90/0x18c
  ext4_dirty_inode+0x32/0x5c
  __mark_inode_dirty+0x2a/0x1f0
  generic_update_time+0xa6/0xd0
  touch_atime+0x164/0x250
  generic_file_read_iter+0x826/0xbcc
  sys_read+0x26a/0x2c4
  EV_Trap+0x110/0x114
----------------------------->8----------------------------




Moreover, I'm able to reproduce it on a CubieBoard2, which has completely different HW
(ARM Cortex-A7 CPU, 'allwinner,sun7i-a20-mmc' MMC controller).

The error message looks like this:
----------------------------->8----------------------------
bonnie++ -u root -r 256 -s 512 -x 1 -d /mnt/mmcblk0p3
Using uid:0, gid:0.
Writing with putc()...done
[ 6402.494075] INFO: task jbd2/mmcblk0p3-:106 blocked for more than 10 seconds.
[ 6402.501047]       Not tainted 4.19.1 #1
[ 6402.504918] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 6402.512741] jbd2/mmcblk0p3- D    0   106      2 0x00000000
[ 6402.518307] [<c06841a4>] (__schedule) from [<c06845f4>] (schedule+0x40/0xa0)
[ 6402.525381] [<c06845f4>] (schedule) from [<c01444c4>] (io_schedule+0x14/0x34)
[ 6402.532519] [<c01444c4>] (io_schedule) from [<c01ba25c>] (wait_on_page_bit+0x124/0x15c)
[ 6402.540548] [<c01ba25c>] (wait_on_page_bit) from [<c01ba364>] (__filemap_fdatawait_range+0xd0/0x118)
[ 6402.549697] [<c01ba364>] (__filemap_fdatawait_range) from [<c01ba3f8>] (filemap_fdatawait_keep_errors+0x24/0x50)
[ 6402.559885] [<c01ba3f8>] (filemap_fdatawait_keep_errors) from [<c02cc3d0>] (jbd2_journal_commit_transaction+0x9b8/0x167c)
[ 6402.570857] [<c02cc3d0>] (jbd2_journal_commit_transaction) from [<c02d0478>] (kjournald2+0xe4/0x2c8)
[ 6402.580011] [<c02d0478>] (kjournald2) from [<c013ab04>] (kthread+0x148/0x150)
[ 6402.587166] [<c013ab04>] (kthread) from [<c01010e8>] (ret_from_fork+0x14/0x2c)
[ 6402.594396] Exception stack(0xeea9bfb0 to 0xeea9bff8)
[ 6402.599452] bfa0:                                     00000000 00000000 00000000 00000000
[ 6402.607640] bfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 6402.615825] bfe0: 00000000 00000000 00000000 00000000 00000013 00000000
----------------------------->8----------------------------


NOTE: this does not happen every time, but probably once in 20-30 bonnie++ runs.
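
Since the hang shows up only once in a few dozen runs, a repeat loop helps to catch it. The following is a minimal sketch, not the exact procedure used for this report: the helper names has_hung_task and repeat_until_hung are hypothetical, and it assumes that 'dmesg' is readable by the current user and that the kernel log holds no older hung task reports.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of the original report.

# Succeeds if the log file given as $1 contains a hung task report
# of the form printed by the kernel's hung task detector.
has_hung_task() {
    grep -q 'blocked for more than [0-9][0-9]* seconds' "$1"
}

# Usage: repeat_until_hung MAX_RUNS LOG_FILE COMMAND [ARGS...]
# Runs COMMAND up to MAX_RUNS times. After each run the kernel log is
# saved to LOG_FILE and checked; the loop stops at the first run after
# which a hung task report is present.
# Assumption: the kernel log has no older hung task reports in it.
repeat_until_hung() {
    max_runs=$1
    log_file=$2
    shift 2
    run=1
    while [ "$run" -le "$max_runs" ]; do
        "$@" >/dev/null 2>&1
        dmesg > "$log_file" 2>/dev/null
        if has_hung_task "$log_file"; then
            echo "hung task reported after run $run"
            return 0
        fi
        run=$((run + 1))
    done
    echo "no hung task in $max_runs runs"
    return 1
}
```

With the bonnie++ command line from the logs above it could be called as: repeat_until_hung 30 /tmp/kmsg.log bonnie++ -u root -r 256 -s 512 -x 1 -d /mnt/mmcblk0p3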

I tried both the 'mq-deadline' and 'bfq' IO schedulers, and the issue reproduces with both of them.
I also tried different SD cards (just in case), and the issue reproduces with SD cards
from different vendors.
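
For reference, the IO scheduler of a blk-mq device can be switched at runtime through sysfs. A minimal sketch follows; the helper names active_scheduler and set_scheduler are hypothetical, and the device name 'mmcblk0' is taken from the logs above. It relies on the sysfs 'scheduler' file listing the available schedulers with the active one in square brackets.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of the original report.

# Prints the active scheduler name from a sysfs-style scheduler file ($1),
# in which the active entry is enclosed in square brackets,
# e.g. "mq-deadline kyber [bfq] none".
active_scheduler() {
    sed -n 's/.*\[\(.*\)\].*/\1/p' "$1"
}

# Usage: set_scheduler DEVICE SCHEDULER   (e.g. set_scheduler mmcblk0 bfq)
# Selects the scheduler for the block device (needs root) and prints
# the scheduler that is active afterwards.
set_scheduler() {
    sched_file="/sys/block/$1/queue/scheduler"
    echo "$2" > "$sched_file" && active_scheduler "$sched_file"
}
```

Printing the active scheduler after the write makes it easy to confirm that the requested scheduler was actually selected before starting a benchmark run.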

In all cases the 'DETECT_HUNG_TASK' config option was enabled on all boards (it is disabled by default):
----------------------------->8----------------------------
DETECT_HUNG_TASK=y
DEFAULT_HUNG_TASK_TIMEOUT=10
----------------------------->8----------------------------
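
Whether hung task detection is active on a running system can be checked via the sysctl file named in the messages above. A minimal sketch, assuming a hypothetical helper hung_task_status that is not part of this report:

```shell
#!/bin/sh
# Hypothetical helper for illustration; not part of the original report.

# Usage: hung_task_status [FILE]
# Reads the hung task timeout from FILE (default:
# /proc/sys/kernel/hung_task_timeout_secs) and reports whether detection
# is usable. A missing file suggests the kernel was built without
# DETECT_HUNG_TASK; a value of 0 disables the check.
hung_task_status() {
    f=${1:-/proc/sys/kernel/hung_task_timeout_secs}
    if [ ! -r "$f" ]; then
        echo "hung task detection not available"
        return 1
    fi
    secs=$(cat "$f")
    if [ "$secs" -eq 0 ]; then
        echo "hung task detection disabled (timeout 0)"
        return 1
    fi
    echo "hung task timeout: ${secs}s"
}
```

On the boards configured as above, the default path should report a timeout of 10 seconds.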


I don't think that this issue is HW or platform related,
given that it reproduces on completely different HW stacks.
We probably don't see many reports about it because hung task
detection is a debug feature which is disabled in most cases.


Just as a reminder: git bisect shows that the issue appears after
commit 81196976ed94 (mmc: block: Add blk-mq support).


@Ulf, could I ask you to take a look at this?


Thanks.
-- 
 Eugeniy Paltsev



