On Mon, Jul 13, 2009 at 08:12:14PM +0800, Ronald Moesbergen wrote:
> 2009/7/10 Vladislav Bolkhovitin <vst@xxxxxxxx>:
> >
> > Vladislav Bolkhovitin, on 07/10/2009 12:43 PM wrote:
> >>
> >> Ronald Moesbergen, on 07/10/2009 10:32 AM wrote:
> >>>>
> >>>> I've also long ago noticed that reading data from block devices is
> >>>> slower than from files on file systems mounted on those block
> >>>> devices. Can anybody explain it?
> >>>>
> >>>> Looks like this is strangeness #2 which we uncovered in our tests
> >>>> (the first one was earlier in this thread: why the context RA
> >>>> doesn't work with cooperative I/O threads as well as it should).
> >>>>
> >>>> Can you rerun the same 11 tests over a file on the file system,
> >>>> please?
> >>>
> >>> I'll see what I can do. Just to be sure: you want me to run
> >>> blockdev-perftest on a file on the OCFS2 filesystem which is mounted
> >>> on the client over iSCSI, right?
> >>
> >> Yes, please.
> >
> > Forgot to mention that you should also configure your backend storage as a
> > big file on a file system (preferably, XFS) too, not as a direct device, like
> > /dev/vg/db-master.
>
> Ok, here are the results:

Ronald, thanks for the numbers!

> client kernel: 2.6.26-15lenny3 (debian)
> server kernel: 2.6.29.5 with readahead patch

Do you mean the context readahead patch?

> Test done with XFS on both the target and the initiator. This confirms
> your findings: using files instead of block devices is faster, but
> only when using the io_context patch.

It shows that the one that really matters is the io_context patch,
even when context readahead is running. I guess what happened in the
tests is:

- without readahead (or when the readahead algorithm failed to do
  proper sequential readaheads), the SCST processes will be submitting
  small but close to each other IOs. CFQ relies on the io_context
  patch to prevent unnecessary idling.

- with proper readahead, the SCST processes will also be submitting
  close readahead IOs.
  For example, one file's 100-102MB pages are read ahead by process A,
  while its 102-104MB pages may be read ahead by process B. In this
  case CFQ will also idle waiting for process A to submit the next IO,
  but in fact that IO is being submitted by process B. So the
  io_context patch is still necessary even when context readahead is
  working fine.

I guess context readahead does have the added value of possibly
enlarging the IO size (however this benchmark seems not to be very
sensitive to IO size).

Thanks,
Fengguang

> Without io_context patch:
> 1) client: default, server: default
> blocksize        R        R        R   R(avg,    R(std        R
>   (bytes)      (s)      (s)      (s)    MB/s)   ,MB/s)   (IOPS)
>  67108864   18.327   18.327   17.740   56.491    0.872    0.883
>  33554432   18.662   18.311   18.116   55.772    0.683    1.743
>  16777216   18.900   18.421   18.312   55.229    0.754    3.452
>   8388608   18.893   18.533   18.281   55.156    0.743    6.895
>   4194304   18.512   18.097   18.400   55.850    0.536   13.963
>   2097152   18.635   18.313   18.676   55.232    0.486   27.616
>   1048576   18.441   18.264   18.245   55.907    0.267   55.907
>    524288   17.773   18.669   18.459   55.980    1.184  111.960
>    262144   18.580   18.758   17.483   56.091    1.767  224.365
>    131072   17.224   18.333   18.765   56.626    2.067  453.006
>     65536   18.082   19.223   18.238   55.348    1.483  885.567
>     32768   17.719   18.293   18.198   56.680    0.795 1813.766
>     16384   17.872   18.322   17.537   57.192    1.024 3660.273
>
> 2) client: default, server: 64 max_sectors_kb, RA default
> blocksize        R        R        R   R(avg,    R(std        R
>   (bytes)      (s)      (s)      (s)    MB/s)   ,MB/s)   (IOPS)
>  67108864   18.738   18.435   18.400   55.283    0.451    0.864
>  33554432   18.046   18.167   17.572   57.128    0.826    1.785
>  16777216   18.504   18.203   18.377   55.771    0.376    3.486
>   8388608   22.069   18.554   17.825   53.013    4.766    6.627
>   4194304   19.211   18.136   18.083   55.465    1.529   13.866
>   2097152   18.647   17.851   18.511   55.866    1.071   27.933
>   1048576   19.084   18.177   18.194   55.425    1.249   55.425
>    524288   18.999   18.553   18.380   54.934    0.763  109.868
>    262144   18.867   18.273   18.063   55.668    1.020  222.673
>    131072   17.846   18.966   18.193   55.885    1.412  447.081
>     65536   18.195   18.616   18.482   55.564    0.530  889.023
>     32768   17.882   18.841   17.707   56.481    1.525 1807.394
>     16384   17.073   18.278   17.985   57.646    1.689 3689.369
>
> 3) client: default, server: default max_sectors_kb, RA 2MB
> blocksize        R        R        R   R(avg,    R(std        R
>   (bytes)      (s)      (s)      (s)    MB/s)   ,MB/s)   (IOPS)
>  67108864   18.658   17.830   19.258   55.162    1.750    0.862
>  33554432   17.193   18.265   18.517   56.974    1.854    1.780
>  16777216   17.531   17.681   18.776   56.955    1.720    3.560
>   8388608   18.234   17.547   18.201   56.926    1.014    7.116
>   4194304   18.057   17.923   17.901   57.015    0.218   14.254
>   2097152   18.565   17.739   17.658   56.958    1.277   28.479
>   1048576   18.393   17.433   17.314   57.851    1.550   57.851
>    524288   18.939   17.835   18.972   55.152    1.600  110.304
>    262144   18.562   19.005   18.069   55.240    1.141  220.959
>    131072   19.574   17.562   18.251   55.576    2.476  444.611
>     65536   19.117   18.019   17.886   55.882    1.647  894.115
>     32768   18.237   17.415   17.482   57.842    1.200 1850.933
>     16384   17.760   18.444   18.055   56.631    0.876 3624.391
>
> 4) client: default, server: 64 max_sectors_kb, RA 2MB
> blocksize        R        R        R   R(avg,    R(std        R
>   (bytes)      (s)      (s)      (s)    MB/s)   ,MB/s)   (IOPS)
>  67108864   18.368   17.495   18.524   56.520    1.434    0.883
>  33554432   18.209   17.523   19.146   56.052    2.027    1.752
>  16777216   18.765   18.053   18.550   55.497    0.903    3.469
>   8388608   17.878   17.848   18.389   56.778    0.774    7.097
>   4194304   18.058   17.683   18.567   56.589    1.129   14.147
>   2097152   18.896   18.384   18.697   54.888    0.623   27.444
>   1048576   18.505   17.769   17.804   56.826    1.055   56.826
>    524288   18.319   17.689   17.941   56.955    0.816  113.910
>    262144   19.227   17.770   18.212   55.704    1.821  222.815
>    131072   18.738   18.227   17.869   56.044    1.090  448.354
>     65536   19.319   18.525   18.084   54.969    1.494  879.504
>     32768   18.321   17.672   17.870   57.047    0.856 1825.495
>     16384   18.249   17.495   18.146   57.025    1.073 3649.582
>
> With io_context patch:
> 5) client: default, server: default
> blocksize        R        R        R   R(avg,    R(std        R
>   (bytes)      (s)      (s)      (s)    MB/s)   ,MB/s)   (IOPS)
>  67108864   12.393   11.925   12.627   83.196    1.989    1.300
>  33554432   11.844   11.855   12.191   85.610    1.142    2.675
>  16777216   12.729   12.602   12.068   82.187    1.913    5.137
>   8388608   12.245   12.060   14.081   80.419    5.469   10.052
>   4194304   13.224   11.866   12.110   82.763    3.833   20.691
>   2097152   11.585   12.584   11.755   85.623    3.052   42.811
>   1048576   12.166   12.144   12.321   83.867    0.539   83.867
>    524288   12.019   12.148   12.160   84.568    0.448  169.137
>    262144   12.014   12.378   12.074   84.259    1.095  337.036
>    131072   11.840   12.068   11.849   85.921    0.756  687.369
>     65536   12.098   11.803   12.312   84.857    1.470 1357.720
>     32768   11.852   12.635   11.887   84.529    2.465 2704.931
>     16384   12.443   13.110   11.881   82.197    3.299 5260.620
>
> 6) client: default, server: 64 max_sectors_kb, RA default
> blocksize        R        R        R   R(avg,    R(std        R
>   (bytes)      (s)      (s)      (s)    MB/s)   ,MB/s)   (IOPS)
>  67108864   13.033   12.122   11.950   82.911    3.110    1.295
>  33554432   12.386   13.357   12.082   81.364    3.429    2.543
>  16777216   12.102   11.542   12.053   86.096    1.860    5.381
>   8388608   12.240   11.740   11.789   85.917    1.601   10.740
>   4194304   11.824   12.388   12.042   84.768    1.621   21.192
>   2097152   11.962   12.283   11.973   84.832    1.036   42.416
>   1048576   12.639   11.863   12.010   84.197    2.290   84.197
>    524288   11.809   12.919   11.853   84.121    3.439  168.243
>    262144   12.105   12.649   12.779   81.894    1.940  327.577
>    131072   12.441   12.769   12.713   81.017    0.923  648.137
>     65536   12.490   13.308   12.440   80.414    2.457 1286.630
>     32768   13.235   11.917   12.300   82.184    3.576 2629.883
>     16384   12.335   12.394   12.201   83.187    0.549 5323.990
>
> 7) client: default, server: default max_sectors_kb, RA 2MB
> blocksize        R        R        R   R(avg,    R(std        R
>   (bytes)      (s)      (s)      (s)    MB/s)   ,MB/s)   (IOPS)
>  67108864   12.017   12.334   12.151   84.168    0.897    1.315
>  33554432   12.265   12.200   11.976   84.310    0.864    2.635
>  16777216   12.356   11.972   12.292   83.903    1.165    5.244
>   8388608   12.247   12.368   11.769   84.472    1.825   10.559
>   4194304   11.888   11.974   12.144   85.325    0.754   21.331
>   2097152   12.433   10.938   11.669   87.911    4.595   43.956
>   1048576   11.748   12.271   12.498   84.180    2.196   84.180
>    524288   11.726   11.681   12.322   86.031    2.075  172.062
>    262144   12.593   12.263   11.939   83.530    1.817  334.119
>    131072   11.874   12.265   12.441   84.012    1.648  672.093
>     65536   12.119   11.848   12.037   85.330    0.809 1365.277
>     32768   12.549   12.080   12.008   83.882    1.625 2684.238
>     16384   12.369   12.087   12.589   82.949    1.385 5308.766
>
> 8) client: default, server: 64 max_sectors_kb, RA 2MB
> blocksize        R        R        R   R(avg,    R(std        R
>   (bytes)      (s)      (s)      (s)    MB/s)   ,MB/s)   (IOPS)
>  67108864   12.664   11.793   11.963   84.428    2.575    1.319
>  33554432   11.825   12.074   12.442   84.571    1.761    2.643
>  16777216   11.997   11.952   10.905   88.311    3.958    5.519
>   8388608   11.866   12.270   11.796   85.519    1.476   10.690
>   4194304   11.754   12.095   12.539   84.483    2.230   21.121
>   2097152   11.948   11.633   11.886   86.628    1.007   43.314
>   1048576   12.029   12.519   11.701   84.811    2.345   84.811
>    524288   11.928   12.011   12.049   85.363    0.361  170.726
>    262144   12.559   11.827   11.729   85.140    2.566  340.558
>    131072   12.015   12.356   11.587   85.494    2.253  683.952
>     65536   11.741   12.113   11.931   85.861    1.093 1373.770
>     32768   12.655   11.738   12.237   83.945    2.589 2686.246
>     16384   11.928   12.423   11.875   84.834    1.711 5429.381
>
> 9) client: 64 max_sectors_kb, default RA. server: 64 max_sectors_kb, RA 2MB
> blocksize        R        R        R   R(avg,    R(std        R
>   (bytes)      (s)      (s)      (s)    MB/s)   ,MB/s)   (IOPS)
>  67108864   13.570   13.491   14.299   74.326    1.927    1.161
>  33554432   13.238   13.198   13.255   77.398    0.142    2.419
>  16777216   13.851   13.199   13.463   75.857    1.497    4.741
>   8388608   13.339   16.695   13.551   71.223    7.010    8.903
>   4194304   13.689   13.173   14.258   74.787    2.415   18.697
>   2097152   13.518   13.543   13.894   75.021    0.934   37.510
>   1048576   14.119   14.030   13.820   73.202    0.659   73.202
>    524288   13.747   14.781   13.820   72.621    2.369  145.243
>    262144   14.168   13.652   14.165   73.189    1.284  292.757
>    131072   14.112   13.868   14.213   72.817    0.753  582.535
>     65536   14.604   13.762   13.725   73.045    2.071 1168.728
>     32768   14.796   15.356   14.486   68.861    1.653 2203.564
>     16384   13.079   13.525   13.427   76.757    1.111 4912.426
>
> 10) client: default max_sectors_kb, 2MB RA. server: 64 max_sectors_kb, RA 2MB
> blocksize        R        R        R   R(avg,    R(std        R
>   (bytes)      (s)      (s)      (s)    MB/s)   ,MB/s)   (IOPS)
>  67108864   20.372   18.077   17.262   55.411    3.800    0.866
>  33554432   17.287   17.620   17.828   58.263    0.740    1.821
>  16777216   16.802   18.154   17.315   58.831    1.865    3.677
>   8388608   17.510   18.291   17.253   57.939    1.427    7.242
>   4194304   17.059   17.706   17.352   58.958    0.897   14.740
>   2097152   17.252   18.064   17.615   58.059    1.090   29.029
>   1048576   17.082   17.373   17.688   58.927    0.838   58.927
>    524288   17.129   17.271   17.583   59.103    0.644  118.206
>    262144   17.411   17.695   18.048   57.808    0.848  231.231
>    131072   17.937   17.704   18.681   56.581    1.285  452.649
>     65536   17.927   17.465   17.907   57.646    0.698  922.338
>     32768   18.494   17.820   17.719   56.875    1.073 1819.985
>     16384   18.800   17.759   17.575   56.798    1.666 3635.058
>
> 11) client: 64 max_sectors_kb, 2MB RA. server: 64 max_sectors_kb, RA 2MB
> blocksize        R        R        R   R(avg,    R(std        R
>   (bytes)      (s)      (s)      (s)    MB/s)   ,MB/s)   (IOPS)
>  67108864   20.045   21.881   20.018   49.680    2.037    0.776
>  33554432   20.768   20.291   20.464   49.938    0.479    1.561
>  16777216   21.563   20.714   20.429   49.017    1.116    3.064
>   8388608   21.290   21.109   21.308   48.221    0.205    6.028
>   4194304   22.240   20.662   21.088   48.054    1.479   12.013
>   2097152   20.282   21.098   20.580   49.593    0.806   24.796
>   1048576   20.367   19.929   20.252   50.741    0.469   50.741
>    524288   20.885   21.203   20.684   48.945    0.498   97.890
>    262144   19.982   21.375   20.798   49.463    1.373  197.853
>    131072   20.744   21.590   19.698   49.593    1.866  396.740
>     65536   21.586   20.953   21.055   48.314    0.627  773.024
>     32768   21.228   20.307   21.049   49.104    0.950 1571.327
>     16384   21.257   21.209   21.150   48.289    0.100 3090.498
>
> Ronald.
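
For anyone who wants to cross-check the derived columns in the quoted
tables, here is a minimal sketch of how they appear to be computed. The
thread does not state the transfer size or the formulas; the 1 GiB per
run, the averaging of per-run throughput, and the population standard
deviation are all assumptions inferred from the numbers, and
summarize() is a hypothetical helper, not part of blockdev-perftest.

```python
from statistics import mean, pstdev

# Assumption: each run reads 1 GiB (1024 MiB). This is inferred from the
# table values, not stated anywhere in the thread.
TRANSFER_MIB = 1024


def summarize(blocksize_bytes, run_seconds, transfer_mib=TRANSFER_MIB):
    """Return (avg MB/s, std MB/s, IOPS) for one table row.

    Hypothetical helper: throughput is computed per run and then
    averaged; the spread is the population standard deviation of the
    per-run throughputs; IOPS is the average throughput divided by the
    block size expressed in MiB.
    """
    rates = [transfer_mib / t for t in run_seconds]
    avg = mean(rates)
    std = pstdev(rates)
    iops = avg / (blocksize_bytes / 2**20)
    return avg, std, iops


# Row "1048576" of test 1 (without the io_context patch).
avg, std, iops = summarize(1048576, [18.441, 18.264, 18.245])
print(f"avg {avg:.3f} MB/s, std {std:.3f} MB/s, {iops:.3f} IOPS")
```

The results land within rounding distance of the table values (the
listed times are themselves rounded to three decimals), so small
differences in the last digit are expected.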