[linux-lvm] My explanation, correct? Re: Parallel IO on striped logic volume?

Xiaoxiang Liu <xiliu@ncsa.uiuc.edu> · Thu, 27 Sep 2001 15:56:28 -0500 (CDT)

>I build a striped LV on 4 scsi disks. If I do sequential IO on one disk
>with buffer size 4k, the bandwidth is 9MB/s. Then I do sequential IO on
>LV with buffer size 16k. Theoretically I can get almost 4*9=36MB/s
>because LVM stripe 4k IO on every disk but I only get 18MB/s. I don't
>know where I lost so much performance or it is the overhead of LVM?

Later I repeated all of those tests with different benchmarking tools,
this time paying more attention to how the execution time is distributed.
Below are my test results, generated by lmdd from the LMbench suite. I used
the "time" command to get the real time and system time, and the "raw"
command to set up raw IO.
(Here LVMn means a logical volume built on n disks with a 4K stripe size.)

Objects      Chunk_Size  Total_IO_Size  Bandwidth  Real_time  System_time
-------      ----------  -------------  ---------  ---------  -----------
Single disk     4KB          1GB         8.7MB/s    117.6s      37.1s
LVM2            8KB          2GB        12.7MB/s    161.2s      78.1s
LVM3           12KB          3GB        15.5MB/s    197.8s     115.6s
LVM4           16KB          4GB        17.4MB/s    235.1s     154.4s
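
As a quick sanity check on the table, the bandwidth column can be recomputed
as total IO size divided by real time and compared with the ideal n-fold
scaling. This is only a minimal sketch: the names are made up for
illustration, and it assumes 1GB means 1024MB, which is what reproduces the
reported bandwidths.

```python
# Measurements copied from the table: (label, disks, total_MB, real_s, system_s).
# Assumption: 1GB is taken as 1024MB.
RESULTS = [
    ("Single disk", 1, 1 * 1024, 117.6, 37.1),
    ("LVM2", 2, 2 * 1024, 161.2, 78.1),
    ("LVM3", 3, 3 * 1024, 197.8, 115.6),
    ("LVM4", 4, 4 * 1024, 235.1, 154.4),
]


def bandwidth_mb_s(total_mb, real_s):
    """Bandwidth as total data moved divided by wall-clock (real) time."""
    return total_mb / real_s


def scaling_table(results):
    """Return rows of (label, bandwidth, speedup over single disk, ideal speedup)."""
    base = bandwidth_mb_s(results[0][2], results[0][3])
    rows = []
    for label, disks, total_mb, real_s, _system_s in results:
        bw = bandwidth_mb_s(total_mb, real_s)
        rows.append((label, round(bw, 1), round(bw / base, 2), disks))
    return rows


if __name__ == "__main__":
    for label, bw, speedup, ideal in scaling_table(RESULTS):
        print(f"{label:12s} {bw:5.1f} MB/s  speedup {speedup:.2f}x (ideal {ideal}x)")
```

With these inputs the four-disk volume comes out at roughly twice the
single-disk bandwidth, against an ideal of four times.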

I find that only the system time increases linearly with the number of
disks, and that appears to be what costs LVM3 and LVM4 their bandwidth.
Here I make an assumption: IO_time = real_time - system_time.
If this assumption is correct, the IO times in the four tests are almost
the same (about 80 to 83s), which makes sense because in every test each
disk reads 1GB of data. In theory, the time to read nGB from LVMn should
equal the time to read 1GB from a single disk, because LVM stripes the IO
across separate disks. In practice, I think the time to read nGB from LVMn
is closer to
n * system_time_of_single_disk + IO_time_of_single_disk
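
The assumption IO_time = real_time - system_time and the formula above can
be checked against the measured real times with a few lines of Python. This
is a rough sketch using only the numbers from the table; the function names
are invented for illustration.

```python
# disks -> (real_s, system_s), copied from the table.
MEASURED = {
    1: (117.6, 37.1),
    2: (161.2, 78.1),
    3: (197.8, 115.6),
    4: (235.1, 154.4),
}


def io_time(real_s, system_s):
    """The post's assumption: IO_time = real_time - system_time."""
    return real_s - system_s


def predicted_real_time(n, single_real_s, single_system_s):
    """n * system_time_of_single_disk + IO_time_of_single_disk."""
    return n * single_system_s + io_time(single_real_s, single_system_s)


if __name__ == "__main__":
    single_real, single_sys = MEASURED[1]
    for n, (real_s, system_s) in sorted(MEASURED.items()):
        predicted = predicted_real_time(n, single_real, single_sys)
        error_pct = 100.0 * (predicted - real_s) / real_s
        print(f"n={n}: IO_time={io_time(real_s, system_s):6.1f}s  "
              f"predicted={predicted:6.1f}s  measured={real_s:6.1f}s  "
              f"error={error_pct:+.1f}%")
```

For these four runs the prediction stays within about 4% of the measured
real time, slightly on the low side.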

Even though the calculated results match my test results, I am still
confused.
1) Even if the CPU can only serve one disk's requests at a time, there should
   be some overlap between system time and IO time, but I can't find any
   overlap in my results.
2) Why does system time increase linearly? For LVM4, the CPU usage reaches
   65%. Isn't that too high? I thought all the tests should be IO-bound, but
   the LVM3 and LVM4 tests look CPU-bound.
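
To put numbers on both points, the sketch below computes the CPU share
(system time over real time) for each run, and the run time that a
hypothetical perfect overlap of system time and IO time would give. The
overlap model is an idealization introduced here for comparison only, not
something that was measured; names are illustrative and 1GB is taken as
1024MB.

```python
# (label, total_MB, real_s, system_s), copied from the table.
RUNS = [
    ("Single disk", 1024, 117.6, 37.1),
    ("LVM2", 2048, 161.2, 78.1),
    ("LVM3", 3072, 197.8, 115.6),
    ("LVM4", 4096, 235.1, 154.4),
]


def cpu_share(real_s, system_s):
    """Fraction of wall-clock time reported as system time."""
    return system_s / real_s


def overlapped_time(real_s, system_s):
    """Hypothetical best case: CPU work and disk IO overlap perfectly,
    so the run lasts as long as the larger of the two components."""
    return max(system_s, real_s - system_s)


if __name__ == "__main__":
    for label, total_mb, real_s, system_s in RUNS:
        best = overlapped_time(real_s, system_s)
        print(f"{label:12s} cpu={100 * cpu_share(real_s, system_s):4.1f}%  "
              f"measured={real_s:6.1f}s  perfect-overlap={best:6.1f}s  "
              f"({total_mb / best:4.1f} MB/s)")
```

Under this idealization the LVM3 and LVM4 runs would be limited by system
time rather than by the disks, so even perfect overlap would not reach the
36MB/s expected from four disks.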

>My question is why I can't see nearly linear scaling of the bandwidth 
>when the buffer size is small? Does striping LVM do real parallel IO
>similar to software RAID0?

I can answer this question now, because I tried RAID0 and got almost the
same performance as with LVM. So my questions now are:
1) Is my assumption about IO time correct?
2) Is my explanation reasonable?
3) Why do LVM3 and LVM4 use so much system time and show such high CPU usage?

Thanks!

--xiaoxiang