I think this is related to the sync issues. You could try Josef's git tree:

git://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-work.git

I have been using it in our ceph cluster since yesterday and it seems to do a better job.

Regards,
Christian

2011/10/9 Martin Mailand <martin@xxxxxxxxxxxx>:
> Hi,
> I have high IO-wait on the OSDs (ceph); the OSDs are running a v3.1-rc9
> kernel.
> I also see high I/O rates, around 500 IO/s, reported via iostat.
>
> Device:  rrqm/s   wrqm/s    r/s     w/s   rkB/s    wkB/s  avgrq-sz  avgqu-sz   await  r_await  w_await  svctm  %util
> sda        0.00     0.00   0.00    6.80    0.00    62.40     18.35      0.04    5.29     0.00     5.29   5.29   3.60
> sdb        0.00   249.80   0.40  669.60    1.60  4118.40     12.30     87.47  130.56    15.00   130.63   1.01  67.40
>
> For comparison, the same workload, but with the OSD using ext4 as the backing fs.
>
> Device:  rrqm/s   wrqm/s    r/s     w/s   rkB/s    wkB/s  avgrq-sz  avgqu-sz   await  r_await  w_await  svctm  %util
> sda        0.00     0.00   0.00   10.00    0.00   128.00     25.60      0.03    3.40     0.00     3.40   3.40   3.40
> sdb        0.00    27.80   0.00   48.20    0.00   318.40     13.21      0.43    8.84     0.00     8.84   1.99   9.60
>
> iodump shows similar results; sdb is the data disk, sda7 the journal, and sda5 the root.
>
> btrfs
>
> root@s-brick-003:~# echo 1 > /proc/sys/vm/block_dump
> root@s-brick-003:~# while true; do sleep 1; dmesg -c; done | perl /usr/local/bin/iodump
> ^C# Caught SIGINT.
> TASK              PID    TOTAL   READ  WRITE  DIRTY  DEVICES
> btrfs-submit-0   8321    28040      0  28040      0  sdb
> ceph-osd         8514      158      0    158      0  sda7
> kswapd0            46       81      0     81      0  sda1
> bash            10709       35     35      0      0  sda1
> flush-8:0         962       12      0     12      0  sda5
> kworker/0:1      8897        6      0      6      0  sdb
> kworker/1:1     10354        3      0      3      0  sdb
> kjournald         266        3      0      3      0  sda5
> ceph-osd         8523        2      2      0      0  sda1
> ceph-osd         8531        1      1      0      0  sda1
> dmesg           10712        1      1      0      0  sda5
>
>
> ext4
>
> root@s-brick-002:~# echo 1 > /proc/sys/vm/block_dump
> root@s-brick-002:~# while true; do sleep 1; dmesg -c; done | perl /usr/local/bin/iodump
> ^C# Caught SIGINT.
> TASK              PID    TOTAL   READ  WRITE  DIRTY  DEVICES
> ceph-osd         3115      847      0    847      0  sdb
> jbd2/sdb-8       2897      784      0    784      0  sdb
> ceph-osd         3112      728      0    728      0  sda5, sdb
> ceph-osd         3110      191      0    191      0  sda7
> perl             3628       13     13      0      0  sda5
> flush-8:16       2901        8      0      8      0  sdb
> kjournald         272        3      0      3      0  sda5
> dmesg            3630        1      1      0      0  sda5
> sleep            3629        1      1      0      0  sda5
>
>
> I think this is the same problem as in
> http://marc.info/?l=ceph-devel&m=131158049117139&w=2
>
> I also ran latencytop, as Chris recommended in the above thread.
>
> Best Regards,
> martin
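
For reference, fetching and building that tree looks roughly like this. It is only a sketch: using the default branch, seeding the config from /boot, and the Debian/Ubuntu-style install target are all assumptions, so adjust for your setup.

  git clone git://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-work.git
  cd btrfs-work
  cp /boot/config-$(uname -r) .config
  make oldconfig
  make -j$(nproc)
  make modules_install install

Then reboot into the new kernel on one OSD node and repeat the iostat/iodump comparison there before rolling it out further.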