Right, I've stopped the tests because it is just getting ridiculous. Without rbd cache enabled, dd tests run extremely slow: dd if=/dev/zero of=/tmp/mount/1G bs=1M count=1000 oflag=direct 230+0 records in 230+0 records out 241172480 bytes (241 MB) copied, 929.71 s, 259 kB/s Any thoughts why I am getting 250kb/s instead of expected 100MB/s+ with large block size? How do I investigate what's causing this crappy performance? Cheers Andrei ----- Original Message ----- > From: "Andrei Mikhailovsky" <andrei at arhont.com> > To: "Micha Krause" <micha at krausam.de> > Cc: ceph-users at lists.ceph.com > Sent: Thursday, 25 September, 2014 10:58:07 AM > Subject: Re: Frequent Crashes on rbd to nfs gateway > Server > Guys, > Have done some testing with 3.16.3-031603-generic downloaded from > Ubuntu utopic branch. The hang task problem is gone when using large > block size (tested with 1M and 4M) and I could no longer preproduce > the hang tasks while doing 100 dd tests in a for loop. > However, I can confirm that I am still getting hang tasks while > working with a 4K block size. The hang tasks start after about an > hour, but they do not cause the server crash. After a while the dd > test times out and continues with the loop. This is what I was > running: > for i in {1..100} ; do time dd if=/dev/zero of=/tmp/mount/1G bs=4K > count=25K oflag=direct ; done > The following test definately produces the hang tasks like these: > [23160.549785] INFO: task dd:2033 blocked for more than 120 seconds. > [23160.588364] Tainted: G OE 3.16.3-031603-generic #201409171435 > [23160.627998] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" > disables this message. > [23160.706856] dd D 000000000000000b 0 2033 23859 0x00000000 > [23160.706861] ffff88011cec78c8 0000000000000082 ffff88011cec78d8 > ffff88011cec7fd8 > [23160.706865] 00000000000143c0 00000000000143c0 ffff88048661bcc0 > ffff880113441440 > [23160.706868] ffff88011cec7898 ffff88067fd54cc0 ffff880113441440 > ffff880113441440 > [23160.706871] Call Trace: > [23160.706883] [<ffffffff81791f69>] schedule+0x29/0x70 > [23160.706887] [<ffffffff8179203f>] io_schedule+0x8f/0xd0 > [23160.706893] [<ffffffff81219e74>] dio_await_completion+0x54/0xd0 > [23160.706897] [<ffffffff8121c6a8>] do_blockdev_direct_IO+0x958/0xcc0 > [23160.706903] [<ffffffff810ba81e>] ? wake_up_bit+0x2e/0x40 > [23160.706908] [<ffffffff812aa865>] ? > jbd2_journal_dirty_metadata+0xc5/0x260 > [23160.706914] [<ffffffff81265320>] ? ext4_get_block_write+0x20/0x20 > [23160.706919] [<ffffffff8121ca5c>] __blockdev_direct_IO+0x4c/0x50 > [23160.706922] [<ffffffff81265320>] ? ext4_get_block_write+0x20/0x20 > [23160.706928] [<ffffffff8129f44e>] ext4_ind_direct_IO+0xce/0x410 > [23160.706931] [<ffffffff81265320>] ? ext4_get_block_write+0x20/0x20 > [23160.706935] [<ffffffff81261fbb>] ext4_ext_direct_IO+0x1bb/0x2a0 > [23160.706938] [<ffffffff81290158>] ? __ext4_journal_stop+0x78/0xa0 > [23160.706942] [<ffffffff812627fc>] ext4_direct_IO+0xec/0x1e0 > [23160.706946] [<ffffffff8120a003>] ? __mark_inode_dirty+0x53/0x2d0 > [23160.706952] [<ffffffff8116d39b>] > generic_file_direct_write+0xbb/0x180 > [23160.706957] [<ffffffff811ffbe2>] ? mnt_clone_write+0x12/0x30 > [23160.706960] [<ffffffff8116d707>] > __generic_file_write_iter+0x2a7/0x350 > [23160.706963] [<ffffffff8125c2b1>] ext4_file_write_iter+0x111/0x3d0 > [23160.706969] [<ffffffff81192fd4>] ? iov_iter_init+0x14/0x40 > [23160.706976] [<ffffffff811e0c8b>] new_sync_write+0x7b/0xb0 > [23160.706978] [<ffffffff811e19a7>] vfs_write+0xc7/0x1f0 > [23160.706980] [<ffffffff811e1eaf>] SyS_write+0x4f/0xb0 > [23160.706985] [<ffffffff81795ded>] system_call_fastpath+0x1a/0x1f > [23280.705400] INFO: task dd:2033 blocked for more than 120 seconds. > [23280.745358] Tainted: G OE 3.16.3-031603-generic #201409171435 > [23280.785069] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" > disables this message. > [23280.864158] dd D 000000000000000b 0 2033 23859 0x00000000 > [23280.864164] ffff88011cec78c8 0000000000000082 ffff88011cec78d8 > ffff88011cec7fd8 > [23280.864167] 00000000000143c0 00000000000143c0 ffff88048661bcc0 > ffff880113441440 > [23280.864170] ffff88011cec7898 ffff88067fd54cc0 ffff880113441440 > ffff880113441440 > [23280.864173] Call Trace: > [23280.864185] [<ffffffff81791f69>] schedule+0x29/0x70 > [23280.864197] [<ffffffff8179203f>] io_schedule+0x8f/0xd0 > [23280.864203] [<ffffffff81219e74>] dio_await_completion+0x54/0xd0 > [23280.864207] [<ffffffff8121c6a8>] do_blockdev_direct_IO+0x958/0xcc0 > [23280.864213] [<ffffffff810ba81e>] ? wake_up_bit+0x2e/0x40 > [23280.864218] [<ffffffff812aa865>] ? > jbd2_journal_dirty_metadata+0xc5/0x260 > [23280.864224] [<ffffffff81265320>] ? ext4_get_block_write+0x20/0x20 > [23280.864229] [<ffffffff8121ca5c>] __blockdev_direct_IO+0x4c/0x50 > [23280.864239] [<ffffffff81265320>] ? ext4_get_block_write+0x20/0x20 > [23280.864244] [<ffffffff8129f44e>] ext4_ind_direct_IO+0xce/0x410 > [23280.864247] [<ffffffff81265320>] ? ext4_get_block_write+0x20/0x20 > [23280.864251] [<ffffffff81261fbb>] ext4_ext_direct_IO+0x1bb/0x2a0 > [23280.864254] [<ffffffff81290158>] ? __ext4_journal_stop+0x78/0xa0 > [23280.864258] [<ffffffff812627fc>] ext4_direct_IO+0xec/0x1e0 > [23280.864263] [<ffffffff8120a003>] ? __mark_inode_dirty+0x53/0x2d0 > [23280.864268] [<ffffffff8116d39b>] > generic_file_direct_write+0xbb/0x180 > [23280.864273] [<ffffffff811ffbe2>] ? mnt_clone_write+0x12/0x30 > [23280.864284] [<ffffffff8116d707>] > __generic_file_write_iter+0x2a7/0x350 > [23280.864289] [<ffffffff8125c2b1>] ext4_file_write_iter+0x111/0x3d0 > [23280.864295] [<ffffffff81192fd4>] ? iov_iter_init+0x14/0x40 > [23280.864300] [<ffffffff811e0c8b>] new_sync_write+0x7b/0xb0 > [23280.864302] [<ffffffff811e19a7>] vfs_write+0xc7/0x1f0 > [23280.864307] [<ffffffff811e1eaf>] SyS_write+0x4f/0xb0 > [23280.864314] [<ffffffff81795ded>] system_call_fastpath+0x1a/0x1f > [23400.861043] INFO: task dd:2033 blocked for more than 120 seconds. > [23400.901529] Tainted: G OE 3.16.3-031603-generic #201409171435 > [23400.942255] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" > disables this message. > [23401.020985] dd D 000000000000000b 0 2033 23859 0x00000000 > [23401.020991] ffff88011cec78c8 0000000000000082 ffff88011cec78d8 > ffff88011cec7fd8 > [23401.020995] 00000000000143c0 00000000000143c0 ffff88048661bcc0 > ffff880113441440 > [23401.020997] ffff88011cec7898 ffff88067fd54cc0 ffff880113441440 > ffff880113441440 > [23401.021001] Call Trace: > [23401.021014] [<ffffffff81791f69>] schedule+0x29/0x70 > [23401.021025] [<ffffffff8179203f>] io_schedule+0x8f/0xd0 > [23401.021031] [<ffffffff81219e74>] dio_await_completion+0x54/0xd0 > [23401.021035] [<ffffffff8121c6a8>] do_blockdev_direct_IO+0x958/0xcc0 > [23401.021041] [<ffffffff810ba81e>] ? wake_up_bit+0x2e/0x40 > [23401.021046] [<ffffffff812aa865>] ? > jbd2_journal_dirty_metadata+0xc5/0x260 > [23401.021052] [<ffffffff81265320>] ? ext4_get_block_write+0x20/0x20 > [23401.021057] [<ffffffff8121ca5c>] __blockdev_direct_IO+0x4c/0x50 > [23401.021067] [<ffffffff81265320>] ? ext4_get_block_write+0x20/0x20 > [23401.021072] [<ffffffff8129f44e>] ext4_ind_direct_IO+0xce/0x410 > [23401.021075] [<ffffffff81265320>] ? ext4_get_block_write+0x20/0x20 > [23401.021079] [<ffffffff81261fbb>] ext4_ext_direct_IO+0x1bb/0x2a0 > [23401.021082] [<ffffffff81290158>] ? __ext4_journal_stop+0x78/0xa0 > [23401.021086] [<ffffffff812627fc>] ext4_direct_IO+0xec/0x1e0 > [23401.021091] [<ffffffff8120a003>] ? __mark_inode_dirty+0x53/0x2d0 > [23401.021096] [<ffffffff8116d39b>] > generic_file_direct_write+0xbb/0x180 > [23401.021101] [<ffffffff811ffbe2>] ? mnt_clone_write+0x12/0x30 > [23401.021112] [<ffffffff8116d707>] > __generic_file_write_iter+0x2a7/0x350 > [23401.021116] [<ffffffff8125c2b1>] ext4_file_write_iter+0x111/0x3d0 > [23401.021122] [<ffffffff81192fd4>] ? iov_iter_init+0x14/0x40 > [23401.021129] [<ffffffff811e0c8b>] new_sync_write+0x7b/0xb0 > [23401.021134] [<ffffffff811e19a7>] vfs_write+0xc7/0x1f0 > [23401.021137] [<ffffffff811e1eaf>] SyS_write+0x4f/0xb0 > [23401.021141] [<ffffffff81795ded>] system_call_fastpath+0x1a/0x1f > [23521.020639] INFO: task dd:2033 blocked for more than 120 seconds. > [23521.060586] Tainted: G OE 3.16.3-031603-generic #201409171435 > [23521.101349] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" > disables this message. > [23521.182678] dd D 000000000000000b 0 2033 23859 0x00000000 > [23521.182684] ffff88011cec78c8 0000000000000082 ffff88011cec78d8 > ffff88011cec7fd8 > [23521.182690] 00000000000143c0 00000000000143c0 ffff88048661bcc0 > ffff880113441440 > [23521.182693] ffff88011cec7898 ffff88067fd54cc0 ffff880113441440 > ffff880113441440 > [23521.182703] Call Trace: > [23521.182715] [<ffffffff81791f69>] schedule+0x29/0x70 > [23521.182718] [<ffffffff8179203f>] io_schedule+0x8f/0xd0 > [23521.182731] [<ffffffff81219e74>] dio_await_completion+0x54/0xd0 > [23521.182734] [<ffffffff8121c6a8>] do_blockdev_direct_IO+0x958/0xcc0 > [23521.182741] [<ffffffff810ba81e>] ? wake_up_bit+0x2e/0x40 > [23521.182749] [<ffffffff812aa865>] ? > jbd2_journal_dirty_metadata+0xc5/0x260 > [23521.182762] [<ffffffff81265320>] ? ext4_get_block_write+0x20/0x20 > [23521.182767] [<ffffffff8121ca5c>] __blockdev_direct_IO+0x4c/0x50 > [23521.182770] [<ffffffff81265320>] ? ext4_get_block_write+0x20/0x20 > [23521.182775] [<ffffffff8129f44e>] ext4_ind_direct_IO+0xce/0x410 > [23521.182778] [<ffffffff81265320>] ? ext4_get_block_write+0x20/0x20 > [23521.182788] [<ffffffff81261fbb>] ext4_ext_direct_IO+0x1bb/0x2a0 > [23521.182791] [<ffffffff81290158>] ? __ext4_journal_stop+0x78/0xa0 > [23521.182794] [<ffffffff812627fc>] ext4_direct_IO+0xec/0x1e0 > [23521.182803] [<ffffffff8120a003>] ? __mark_inode_dirty+0x53/0x2d0 > [23521.182808] [<ffffffff8116d39b>] > generic_file_direct_write+0xbb/0x180 > [23521.182816] [<ffffffff811ffbe2>] ? mnt_clone_write+0x12/0x30 > [23521.182819] [<ffffffff8116d707>] > __generic_file_write_iter+0x2a7/0x350 > [23521.182822] [<ffffffff8125c2b1>] ext4_file_write_iter+0x111/0x3d0 > [23521.182828] [<ffffffff81192fd4>] ? iov_iter_init+0x14/0x40 > [23521.182835] [<ffffffff811e0c8b>] new_sync_write+0x7b/0xb0 > [23521.182838] [<ffffffff811e19a7>] vfs_write+0xc7/0x1f0 > [23521.182840] [<ffffffff811e1eaf>] SyS_write+0x4f/0xb0 > [23521.182844] [<ffffffff81795ded>] system_call_fastpath+0x1a/0x1f > According to syslog, two dd processes produced hang tasks from the > total of 100 runs. Overall, the latest kernels tend to perform very > poorly with rbd mounted volumes. The single dd process with block > size of 1M or 4M was only producing on average 16MB/s, while the 4K > block size was not even getting past 900KB/s. From the client side, > rbd caching was enabled. I will now try to run the same test without > rbd caching to see if the results are different. > But from what i've seen so far, there seems to be a big performance > regression with the latest Ubuntu kernels. > Cheers > Andrei > ----- Original Message ----- > > From: "Micha Krause" <micha at krausam.de> > > > To: "Ilya Dryomov" <ilya.dryomov at inktank.com> > > > Cc: ceph-users at lists.ceph.com > > > Sent: Thursday, 25 September, 2014 9:57:19 AM > > > Subject: Re: Frequent Crashes on rbd to nfs gateway > > Server > > > Hi, > > > > > That's strange. 3.13 is way before any changes that could have > > > > had any > > > >> such effect. Can you by any chance try with older kernels to see > > >> where > > > >> it starts misbehaving for you? 3.12? 3.10? 3.8? > > > > > > > > my crush tunables are set to bobtail, so I can't go bellow 3.9, I > > > will try 3.12 tomorrow and > > > > report back. > > > Ok, I have tested 3.12.9 and it also hangs. > > > I have no other pre-build kernels to test :-(. > > > If I have to compile Kernels anyway I will test 3.16.3 as well :-/. > > > Micha Krause > > > _______________________________________________ > > > ceph-users mailing list > > > ceph-users at lists.ceph.com > > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > > _______________________________________________ > ceph-users mailing list > ceph-users at lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.ceph.com/pipermail/ceph-users-ceph.com/attachments/20140925/8578af1e/attachment.htm>