Re: kpartx mounted vmdk images and bad write performance

James Robertson <j@xxxxxxxxxxxxxxxx> · Tue, 11 Dec 2012 00:23:01 +1100



On 11 December 2012 00:09, James Robertson <j@xxxxxxxxxxxxxxxx> wrote:
> It would appear related to USB, although tests to local storage were
> not that great. As a test I ran dd to create a file on the mounted
> partiton on a USB2 attached disk as follows:
>
>> kpartx -av /mnt/backup/FITZVICVDP01/FITZVICVDP01_2-flat.vmdk
> add map loop0p1 (254:4): 0 536860169 linear /dev/loop0 1
>> mount /dev/mapper/loop0p1 /mnt/loop0p1
>> dd if=/dev/zero of=/mnt/loop0p1/zerofile.tst bs=1k count=1700000
>
> In another terminal I monitored it's progress, it got to 763M and
> simply stopped increasing.  dd was still running in the other
> terminal.  You can see the date is 10 minutes since the last mod time.
>
>> date
> Monday 10 December  23:28:09 EST 2012
>> ls -lh /mnt/loop0p1/zerofile.tst
> -rw-r--r-- 1 root root 763M Dec 10 23:18 /mnt/loop0p1/zerofile.tst
>
> In dmesg these started being logged every 2 minutes.
>
> [ 1680.648037] INFO: task kworker/3:0:3139 blocked for more than 120 seconds.
> [ 1680.648086] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 1680.648140] kworker/3:0     D ffff88012fd93780     0  3139      2 0x00000000
> [ 1680.648144]  ffff88012706e300 0000000000000046 0000000700000000
> ffff88012af5cf60
> [ 1680.648149]  0000000000013780 ffff8801275effd8 ffff8801275effd8
> ffff88012706e300
> [ 1680.648153]  ffff880129f5e000 00000001810709c1 0000000000000202
> ffff880129d62e80
> [ 1680.648157] Call Trace:
> [ 1680.648180]  [<ffffffffa0319a43>] ? xlog_wait+0x51/0x67 [xfs]
> [ 1680.648186]  [<ffffffff8103f411>] ? try_to_wake_up+0x197/0x197
> [ 1680.648200]  [<ffffffffa031c1e0>] ? _xfs_log_force_lsn+0x1cd/0x205 [xfs]
> [ 1680.648215]  [<ffffffffa03194b6>] ? xfs_trans_commit+0x10a/0x205 [xfs]
> [ 1680.648227]  [<ffffffffa02e6736>] ? xfs_reclaim_inode+0x22d/0x22d [xfs]
> [ 1680.648239]  [<ffffffffa02e6770>] ? xfs_sync_worker+0x3a/0x6a [xfs]
> [ 1680.648243]  [<ffffffff8105b04d>] ? process_one_work+0x163/0x284
> [ 1680.648246]  [<ffffffff8105c022>] ? worker_thread+0xc2/0x145
> [ 1680.648249]  [<ffffffff8105bf60>] ? manage_workers.isra.25+0x15b/0x15b
> [ 1680.648253]  [<ffffffff8105f155>] ? kthread+0x76/0x7e
> [ 1680.648258]  [<ffffffff81353c34>] ? kernel_thread_helper+0x4/0x10
> [ 1680.648261]  [<ffffffff8105f0df>] ? kthread_worker_fn+0x139/0x139
> [ 1680.648264]  [<ffffffff81353c30>] ? gs_change+0x13/0x13
>
> ---
>
> More info was logged later on before dd finally completed.
>
> [ 2400.648039] INFO: task kworker/3:0:3139 blocked for more than 120 seconds.
> [ 2400.648063] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 2400.648100] kworker/3:0     D ffff88012fd93780     0  3139      2 0x00000000
> [ 2400.648105]  ffff88012706e300 0000000000000046 0000000700000000
> ffff88012af5cf60
> [ 2400.648111]  0000000000013780 ffff8801275effd8 ffff8801275effd8
> ffff88012706e300
> [ 2400.648115]  ffff880129f5e000 00000001810709c1 0000000000000202
> ffff880129d62e80
> [ 2400.648120] Call Trace:
> [ 2400.648145]  [<ffffffffa0319a43>] ? xlog_wait+0x51/0x67 [xfs]
> [ 2400.648151]  [<ffffffff8103f411>] ? try_to_wake_up+0x197/0x197
> [ 2400.648168]  [<ffffffffa031c1e0>] ? _xfs_log_force_lsn+0x1cd/0x205 [xfs]
> [ 2400.648185]  [<ffffffffa03194b6>] ? xfs_trans_commit+0x10a/0x205 [xfs]
> [ 2400.648199]  [<ffffffffa02e6736>] ? xfs_reclaim_inode+0x22d/0x22d [xfs]
> [ 2400.648213]  [<ffffffffa02e6770>] ? xfs_sync_worker+0x3a/0x6a [xfs]
> [ 2400.648218]  [<ffffffff8105b04d>] ? process_one_work+0x163/0x284
> [ 2400.648221]  [<ffffffff8105c022>] ? worker_thread+0xc2/0x145
> [ 2400.648225]  [<ffffffff8105bf60>] ? manage_workers.isra.25+0x15b/0x15b
> [ 2400.648229]  [<ffffffff8105f155>] ? kthread+0x76/0x7e
> [ 2400.648234]  [<ffffffff81353c34>] ? kernel_thread_helper+0x4/0x10
> [ 2400.648238]  [<ffffffff8105f0df>] ? kthread_worker_fn+0x139/0x139
> [ 2400.648241]  [<ffffffff81353c30>] ? gs_change+0x13/0x13
> [ 2400.648250] INFO: task dd:3376 blocked for more than 120 seconds.
> [ 2400.648282] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 2400.648334] dd              D ffff88012fd93780     0  3376   2715 0x00000000
> [ 2400.648337]  ffff88012755aa70 0000000000000082 ffff880100000000
> ffff88012af5cf60
> [ 2400.648342]  0000000000013780 ffff8801290dffd8 ffff8801290dffd8
> ffff88012755aa70
> [ 2400.648346]  ffffffff8101361a 0000000181065d11 ffff8801294d5c90
> ffff88012fd93fd0
> [ 2400.648349] Call Trace:
> [ 2400.648353]  [<ffffffff8101361a>] ? read_tsc+0x5/0x14
> [ 2400.648357]  [<ffffffff810b41eb>] ? lock_page+0x20/0x20
> [ 2400.648362]  [<ffffffff8134ba54>] ? io_schedule+0x59/0x71
> [ 2400.648365]  [<ffffffff810b41f1>] ? sleep_on_page+0x6/0xa
> [ 2400.648368]  [<ffffffff8134be97>] ? __wait_on_bit+0x3e/0x71
> [ 2400.648371]  [<ffffffff810b432d>] ? wait_on_page_bit+0x6e/0x73
> [ 2400.648374]  [<ffffffff810363d8>] ? should_resched+0x5/0x23
> [ 2400.648377]  [<ffffffff8105f7d1>] ? autoremove_wake_function+0x2a/0x2a
> [ 2400.648381]  [<ffffffff810b4ba6>] ? grab_cache_page_write_begin+0x9c/0xae
> [ 2400.648385]  [<ffffffff8111dc1f>] ? block_write_end+0x57/0x62
> [ 2400.648388]  [<ffffffff8111e22c>] ? block_write_begin+0x22/0x72
> [ 2400.648397]  [<ffffffffa02d79d9>] ? xfs_get_blocks_direct+0xb/0xb [xfs]
> [ 2400.648406]  [<ffffffffa02d8398>] ? xfs_vm_write_begin+0x30/0x51 [xfs]
> [ 2400.648410]  [<ffffffff810b467b>] ? generic_file_buffered_write+0x10f/0x259
> [ 2400.648421]  [<ffffffffa02dcb67>] ?
> xfs_file_buffered_aio_write+0x10b/0x15c [xfs]
> [ 2400.648431]  [<ffffffffa02dcd28>] ? xfs_file_aio_write+0x170/0x20c [xfs]
> [ 2400.648435]  [<ffffffff810f9450>] ? do_sync_write+0xb4/0xec
> [ 2400.648438]  [<ffffffff810363d8>] ? should_resched+0x5/0x23
> [ 2400.648442]  [<ffffffff8134b7ff>] ? _cond_resched+0x7/0x1c
> [ 2400.648446]  [<ffffffff81163719>] ? security_file_permission+0x16/0x2d
> [ 2400.648449]  [<ffffffff810f9b41>] ? vfs_write+0xa2/0xe9
> [ 2400.648452]  [<ffffffff810f9d1e>] ? sys_write+0x45/0x6b
> [ 2400.648456]  [<ffffffff81351ad2>] ? system_call_fastpath+0x16/0x1b
> [ 2520.648039] INFO: task kworker/3:0:3139 blocked for more than 120 seconds.
> [ 2520.648076] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 2520.648128] kworker/3:0     D ffff88012fd93780     0  3139      2 0x00000000
> [ 2520.648134]  ffff88012706e300 0000000000000046 0000000700000000
> ffff88012af5cf60
> [ 2520.648139]  0000000000013780 ffff8801275effd8 ffff8801275effd8
> ffff88012706e300
> [ 2520.648144]  ffff880129f5e000 00000001810709c1 0000000000000202
> ffff880129d62e80
> [ 2520.648149] Call Trace:
> [ 2520.648173]  [<ffffffffa0319a43>] ? xlog_wait+0x51/0x67 [xfs]
> [ 2520.648179]  [<ffffffff8103f411>] ? try_to_wake_up+0x197/0x197
> [ 2520.648196]  [<ffffffffa031c1e0>] ? _xfs_log_force_lsn+0x1cd/0x205 [xfs]
> [ 2520.648216]  [<ffffffffa03194b6>] ? xfs_trans_commit+0x10a/0x205 [xfs]
> [ 2520.648226]  [<ffffffffa02e6736>] ? xfs_reclaim_inode+0x22d/0x22d [xfs]
> [ 2520.648236]  [<ffffffffa02e6770>] ? xfs_sync_worker+0x3a/0x6a [xfs]
> [ 2520.648240]  [<ffffffff8105b04d>] ? process_one_work+0x163/0x284
> [ 2520.648243]  [<ffffffff8105c022>] ? worker_thread+0xc2/0x145
> [ 2520.648245]  [<ffffffff8105bf60>] ? manage_workers.isra.25+0x15b/0x15b
> [ 2520.648248]  [<ffffffff8105f155>] ? kthread+0x76/0x7e
> [ 2520.648253]  [<ffffffff81353c34>] ? kernel_thread_helper+0x4/0x10
> [ 2520.648256]  [<ffffffff8105f0df>] ? kthread_worker_fn+0x139/0x139
> [ 2520.648258]  [<ffffffff81353c30>] ? gs_change+0x13/0x13
>
>
> I let dd continue to run and eventually it finished with abysmal
> results (this is the output of the command at the top):
>
>> dd if=/dev/zero of=/mnt/loop0p1/zerofile.tst bs=1k count=1700000
> 1700000+0 records in
> 1700000+0 records out
> 1740800000 bytes (1.7 GB) copied, 1362.68 s, 1.3 MB/s
>
>> ls -lh /mnt/loop0p1/zerofile.tst
> -rw-r--r-- 1 root root 1.7G Dec 10 23:41 /mnt/loop0p1/zerofile.tst
>
> I had already copied the same vmdk to the local server file system and
> repeated the tests.  No errors were logged to dmesg.
>
>> kpartx -av /var/data/FITZVICVDP01_2-flat.vmdk
> add map loop0p1 (254:4): 0 536860169 linear /dev/loop0 1
>> mount /dev/mapper/loop0p1 /mnt/loop0p1
>> dd if=/dev/zero of=/mnt/loop0p1/zerofile.tst bs=1k count=1700000
> 1700000+0 records in
> 1700000+0 records out
> 1740800000 bytes (1.7 GB) copied, 31.867 s, 54.6 MB/s
>
> As a comparison...
>
> Writing direct to the USB disk is about right for USB2.
>
>> dd if=/dev/zero of=/mnt/backup/zerofile.tst bs=1k count=1700000
> 1700000+0 records in
> 1700000+0 records out
> 1740800000 bytes (1.7 GB) copied, 67.1975 s, 25.9 MB/s
>
> And to the servers disk - I should note that this is alot faster than
> the test to the mounted vmdk on the servers disk above - not sure what
> to make of that.
>
>> dd if=/dev/zero of=/var/data/zerofile.tst bs=1k count=1700000
> 1700000+0 records in
> 1700000+0 records out
> 1740800000 bytes (1.7 GB) copied, 5.61148 s, 310 MB/s
>
> Thanks

These are the results with losetup (no errors or hang ups)

> losetup /dev/loop0 /var/data/FITZVICVDP01_2-flat.vmdk
> losetup -o 512 /dev/loop1 /dev/loop0
> mount /dev/loop1 /mnt/loop0p1/
> dd if=/dev/zero of=/mnt/loop0p1/zerofile.tst bs=1k count=1700000
1700000+0 records in
1700000+0 records out
1740800000 bytes (1.7 GB) copied, 25.36 s, 68.6 MB/s

> losetup /dev/loop0 /mnt/backup/FITZVICVDP01/FITZVICVDP01_2-flat.vmdk
> losetup -o 512 /dev/loop1 /dev/loop0
> mount /dev/loop1 /mnt/loop0p1/
> dd if=/dev/zero of=/mnt/loop0p1/zerofile.tst bs=1k count=1700000
1700000+0 records in
1700000+0 records out
1740800000 bytes (1.7 GB) copied, 81.9595 s, 21.2 MB/s

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel