On 11 December 2012 00:09, James Robertson <j@xxxxxxxxxxxxxxxx> wrote: > It would appear related to USB, although tests to local storage were > not that great. As a test I ran dd to create a file on the mounted > partiton on a USB2 attached disk as follows: > >> kpartx -av /mnt/backup/FITZVICVDP01/FITZVICVDP01_2-flat.vmdk > add map loop0p1 (254:4): 0 536860169 linear /dev/loop0 1 >> mount /dev/mapper/loop0p1 /mnt/loop0p1 >> dd if=/dev/zero of=/mnt/loop0p1/zerofile.tst bs=1k count=1700000 > > In another terminal I monitored it's progress, it got to 763M and > simply stopped increasing. dd was still running in the other > terminal. You can see the date is 10 minutes since the last mod time. > >> date > Monday 10 December 23:28:09 EST 2012 >> ls -lh /mnt/loop0p1/zerofile.tst > -rw-r--r-- 1 root root 763M Dec 10 23:18 /mnt/loop0p1/zerofile.tst > > In dmesg these started being logged every 2 minutes. > > [ 1680.648037] INFO: task kworker/3:0:3139 blocked for more than 120 seconds. > [ 1680.648086] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" > disables this message. > [ 1680.648140] kworker/3:0 D ffff88012fd93780 0 3139 2 0x00000000 > [ 1680.648144] ffff88012706e300 0000000000000046 0000000700000000 > ffff88012af5cf60 > [ 1680.648149] 0000000000013780 ffff8801275effd8 ffff8801275effd8 > ffff88012706e300 > [ 1680.648153] ffff880129f5e000 00000001810709c1 0000000000000202 > ffff880129d62e80 > [ 1680.648157] Call Trace: > [ 1680.648180] [<ffffffffa0319a43>] ? xlog_wait+0x51/0x67 [xfs] > [ 1680.648186] [<ffffffff8103f411>] ? try_to_wake_up+0x197/0x197 > [ 1680.648200] [<ffffffffa031c1e0>] ? _xfs_log_force_lsn+0x1cd/0x205 [xfs] > [ 1680.648215] [<ffffffffa03194b6>] ? xfs_trans_commit+0x10a/0x205 [xfs] > [ 1680.648227] [<ffffffffa02e6736>] ? xfs_reclaim_inode+0x22d/0x22d [xfs] > [ 1680.648239] [<ffffffffa02e6770>] ? xfs_sync_worker+0x3a/0x6a [xfs] > [ 1680.648243] [<ffffffff8105b04d>] ? process_one_work+0x163/0x284 > [ 1680.648246] [<ffffffff8105c022>] ? worker_thread+0xc2/0x145 > [ 1680.648249] [<ffffffff8105bf60>] ? manage_workers.isra.25+0x15b/0x15b > [ 1680.648253] [<ffffffff8105f155>] ? kthread+0x76/0x7e > [ 1680.648258] [<ffffffff81353c34>] ? kernel_thread_helper+0x4/0x10 > [ 1680.648261] [<ffffffff8105f0df>] ? kthread_worker_fn+0x139/0x139 > [ 1680.648264] [<ffffffff81353c30>] ? gs_change+0x13/0x13 > > --- > > More info was logged later on before dd finally completed. > > [ 2400.648039] INFO: task kworker/3:0:3139 blocked for more than 120 seconds. > [ 2400.648063] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" > disables this message. > [ 2400.648100] kworker/3:0 D ffff88012fd93780 0 3139 2 0x00000000 > [ 2400.648105] ffff88012706e300 0000000000000046 0000000700000000 > ffff88012af5cf60 > [ 2400.648111] 0000000000013780 ffff8801275effd8 ffff8801275effd8 > ffff88012706e300 > [ 2400.648115] ffff880129f5e000 00000001810709c1 0000000000000202 > ffff880129d62e80 > [ 2400.648120] Call Trace: > [ 2400.648145] [<ffffffffa0319a43>] ? xlog_wait+0x51/0x67 [xfs] > [ 2400.648151] [<ffffffff8103f411>] ? try_to_wake_up+0x197/0x197 > [ 2400.648168] [<ffffffffa031c1e0>] ? _xfs_log_force_lsn+0x1cd/0x205 [xfs] > [ 2400.648185] [<ffffffffa03194b6>] ? xfs_trans_commit+0x10a/0x205 [xfs] > [ 2400.648199] [<ffffffffa02e6736>] ? xfs_reclaim_inode+0x22d/0x22d [xfs] > [ 2400.648213] [<ffffffffa02e6770>] ? xfs_sync_worker+0x3a/0x6a [xfs] > [ 2400.648218] [<ffffffff8105b04d>] ? process_one_work+0x163/0x284 > [ 2400.648221] [<ffffffff8105c022>] ? worker_thread+0xc2/0x145 > [ 2400.648225] [<ffffffff8105bf60>] ? manage_workers.isra.25+0x15b/0x15b > [ 2400.648229] [<ffffffff8105f155>] ? kthread+0x76/0x7e > [ 2400.648234] [<ffffffff81353c34>] ? kernel_thread_helper+0x4/0x10 > [ 2400.648238] [<ffffffff8105f0df>] ? kthread_worker_fn+0x139/0x139 > [ 2400.648241] [<ffffffff81353c30>] ? gs_change+0x13/0x13 > [ 2400.648250] INFO: task dd:3376 blocked for more than 120 seconds. > [ 2400.648282] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" > disables this message. > [ 2400.648334] dd D ffff88012fd93780 0 3376 2715 0x00000000 > [ 2400.648337] ffff88012755aa70 0000000000000082 ffff880100000000 > ffff88012af5cf60 > [ 2400.648342] 0000000000013780 ffff8801290dffd8 ffff8801290dffd8 > ffff88012755aa70 > [ 2400.648346] ffffffff8101361a 0000000181065d11 ffff8801294d5c90 > ffff88012fd93fd0 > [ 2400.648349] Call Trace: > [ 2400.648353] [<ffffffff8101361a>] ? read_tsc+0x5/0x14 > [ 2400.648357] [<ffffffff810b41eb>] ? lock_page+0x20/0x20 > [ 2400.648362] [<ffffffff8134ba54>] ? io_schedule+0x59/0x71 > [ 2400.648365] [<ffffffff810b41f1>] ? sleep_on_page+0x6/0xa > [ 2400.648368] [<ffffffff8134be97>] ? __wait_on_bit+0x3e/0x71 > [ 2400.648371] [<ffffffff810b432d>] ? wait_on_page_bit+0x6e/0x73 > [ 2400.648374] [<ffffffff810363d8>] ? should_resched+0x5/0x23 > [ 2400.648377] [<ffffffff8105f7d1>] ? autoremove_wake_function+0x2a/0x2a > [ 2400.648381] [<ffffffff810b4ba6>] ? grab_cache_page_write_begin+0x9c/0xae > [ 2400.648385] [<ffffffff8111dc1f>] ? block_write_end+0x57/0x62 > [ 2400.648388] [<ffffffff8111e22c>] ? block_write_begin+0x22/0x72 > [ 2400.648397] [<ffffffffa02d79d9>] ? xfs_get_blocks_direct+0xb/0xb [xfs] > [ 2400.648406] [<ffffffffa02d8398>] ? xfs_vm_write_begin+0x30/0x51 [xfs] > [ 2400.648410] [<ffffffff810b467b>] ? generic_file_buffered_write+0x10f/0x259 > [ 2400.648421] [<ffffffffa02dcb67>] ? > xfs_file_buffered_aio_write+0x10b/0x15c [xfs] > [ 2400.648431] [<ffffffffa02dcd28>] ? xfs_file_aio_write+0x170/0x20c [xfs] > [ 2400.648435] [<ffffffff810f9450>] ? do_sync_write+0xb4/0xec > [ 2400.648438] [<ffffffff810363d8>] ? should_resched+0x5/0x23 > [ 2400.648442] [<ffffffff8134b7ff>] ? _cond_resched+0x7/0x1c > [ 2400.648446] [<ffffffff81163719>] ? security_file_permission+0x16/0x2d > [ 2400.648449] [<ffffffff810f9b41>] ? vfs_write+0xa2/0xe9 > [ 2400.648452] [<ffffffff810f9d1e>] ? sys_write+0x45/0x6b > [ 2400.648456] [<ffffffff81351ad2>] ? system_call_fastpath+0x16/0x1b > [ 2520.648039] INFO: task kworker/3:0:3139 blocked for more than 120 seconds. > [ 2520.648076] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" > disables this message. > [ 2520.648128] kworker/3:0 D ffff88012fd93780 0 3139 2 0x00000000 > [ 2520.648134] ffff88012706e300 0000000000000046 0000000700000000 > ffff88012af5cf60 > [ 2520.648139] 0000000000013780 ffff8801275effd8 ffff8801275effd8 > ffff88012706e300 > [ 2520.648144] ffff880129f5e000 00000001810709c1 0000000000000202 > ffff880129d62e80 > [ 2520.648149] Call Trace: > [ 2520.648173] [<ffffffffa0319a43>] ? xlog_wait+0x51/0x67 [xfs] > [ 2520.648179] [<ffffffff8103f411>] ? try_to_wake_up+0x197/0x197 > [ 2520.648196] [<ffffffffa031c1e0>] ? _xfs_log_force_lsn+0x1cd/0x205 [xfs] > [ 2520.648216] [<ffffffffa03194b6>] ? xfs_trans_commit+0x10a/0x205 [xfs] > [ 2520.648226] [<ffffffffa02e6736>] ? xfs_reclaim_inode+0x22d/0x22d [xfs] > [ 2520.648236] [<ffffffffa02e6770>] ? xfs_sync_worker+0x3a/0x6a [xfs] > [ 2520.648240] [<ffffffff8105b04d>] ? process_one_work+0x163/0x284 > [ 2520.648243] [<ffffffff8105c022>] ? worker_thread+0xc2/0x145 > [ 2520.648245] [<ffffffff8105bf60>] ? manage_workers.isra.25+0x15b/0x15b > [ 2520.648248] [<ffffffff8105f155>] ? kthread+0x76/0x7e > [ 2520.648253] [<ffffffff81353c34>] ? kernel_thread_helper+0x4/0x10 > [ 2520.648256] [<ffffffff8105f0df>] ? kthread_worker_fn+0x139/0x139 > [ 2520.648258] [<ffffffff81353c30>] ? gs_change+0x13/0x13 > > > I let dd continue to run and eventually it finished with abysmal > results (this is the output of the command at the top): > >> dd if=/dev/zero of=/mnt/loop0p1/zerofile.tst bs=1k count=1700000 > 1700000+0 records in > 1700000+0 records out > 1740800000 bytes (1.7 GB) copied, 1362.68 s, 1.3 MB/s > >> ls -lh /mnt/loop0p1/zerofile.tst > -rw-r--r-- 1 root root 1.7G Dec 10 23:41 /mnt/loop0p1/zerofile.tst > > I had already copied the same vmdk to the local server file system and > repeated the tests. No errors were logged to dmesg. > >> kpartx -av /var/data/FITZVICVDP01_2-flat.vmdk > add map loop0p1 (254:4): 0 536860169 linear /dev/loop0 1 >> mount /dev/mapper/loop0p1 /mnt/loop0p1 >> dd if=/dev/zero of=/mnt/loop0p1/zerofile.tst bs=1k count=1700000 > 1700000+0 records in > 1700000+0 records out > 1740800000 bytes (1.7 GB) copied, 31.867 s, 54.6 MB/s > > As a comparison... > > Writing direct to the USB disk is about right for USB2. > >> dd if=/dev/zero of=/mnt/backup/zerofile.tst bs=1k count=1700000 > 1700000+0 records in > 1700000+0 records out > 1740800000 bytes (1.7 GB) copied, 67.1975 s, 25.9 MB/s > > And to the servers disk - I should note that this is alot faster than > the test to the mounted vmdk on the servers disk above - not sure what > to make of that. > >> dd if=/dev/zero of=/var/data/zerofile.tst bs=1k count=1700000 > 1700000+0 records in > 1700000+0 records out > 1740800000 bytes (1.7 GB) copied, 5.61148 s, 310 MB/s > > Thanks These are the results with losetup (no errors or hang ups) > losetup /dev/loop0 /var/data/FITZVICVDP01_2-flat.vmdk > losetup -o 512 /dev/loop1 /dev/loop0 > mount /dev/loop1 /mnt/loop0p1/ > dd if=/dev/zero of=/mnt/loop0p1/zerofile.tst bs=1k count=1700000 1700000+0 records in 1700000+0 records out 1740800000 bytes (1.7 GB) copied, 25.36 s, 68.6 MB/s > losetup /dev/loop0 /mnt/backup/FITZVICVDP01/FITZVICVDP01_2-flat.vmdk > losetup -o 512 /dev/loop1 /dev/loop0 > mount /dev/loop1 /mnt/loop0p1/ > dd if=/dev/zero of=/mnt/loop0p1/zerofile.tst bs=1k count=1700000 1700000+0 records in 1700000+0 records out 1740800000 bytes (1.7 GB) copied, 81.9595 s, 21.2 MB/s -- dm-devel mailing list dm-devel@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/dm-devel