LVM performance anomalies

Hello,

I am seeing some rather odd performance anomalies when using direct I/O against LVM logical volumes.

I have a setup with 10 external SCSI disks attached to the external SCSI port of a Dell PERC 4/DC RAID controller. Each disk is configured as its own logical disk, so no RAID functionality in the adapter is used.

* The disks are pvcreated with "pvcreate /dev/sd[d-m]".
* The VG is created with "vgcreate vg_perc /dev/sd[d-m]".
* The logical volumes are created with: for i in b2 b3 b4 b5 ; do lvcreate -L 50g -n lv_${i} -I 64 -i 10 vg_perc ; done (the full sequence is collected just below).
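
For clarity, the complete sequence as it was run (device names as above):

  pvcreate /dev/sd[d-m]                    # one PV per external disk
  vgcreate vg_perc /dev/sd[d-m]            # a single VG spanning all ten PVs
  for i in b2 b3 b4 b5 ; do
      # 50 GB LVs, striped across all 10 PVs with a 64 kB stripe size
      lvcreate -L 50g -n lv_${i} -I 64 -i 10 vg_perc
  done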

I use fio [1] to generate the workload, with this job file:

[global]
rw=read
size=1280m
blocksize=64k
ioengine=sync
write_lat_log
write_bw_log
direct=1
[job1]
filename=/dev/mapper/vg_perc-lv_b2
[job2]
filename=/dev/mapper/vg_perc-lv_b3
[job3]
filename=/dev/mapper/vg_perc-lv_b4
[job4]
filename=/dev/mapper/vg_perc-lv_b5
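
The job file is run by pointing fio straight at it; roughly like this (the file name is just my choice):

  fio lvm-read.fio    # the job file above, saved as lvm-read.fio
  # write_bw_log / write_lat_log make fio write per-job bandwidth and
  # completion latency logs (e.g. job1_bw.log, job1_clat.log), which is
  # what the plots linked below are generated from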

The normal and expected aggregate throughput is 35-40 MB/s (approx. 8.5 MB/s for each of the four LVs sharing the aggregate bandwidth).

I can run this job repeatedly, back to back (I once ran it continuously for 12 hours), and get the expected results.

Completion latency and bandwidth log plots (from fio) of a job with normal throughput (though with some anomalies at the start) are here:

http://folk.uio.no/jb/strange-lvm/good-clat.png
http://folk.uio.no/jb/strange-lvm/good-bw.png

Suddenly, in the middle of one of the runs, the throughput drops to 128-256 kB/s per LV. If I stop the job and start a job reading directly from the underlying disks, I get 39 MB/s aggregated. If I start the job against the LVs again, I still get at most 512 kB/s.
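
By "reading directly from the underlying disks" I mean a job roughly like this (a sketch; the actual job file may have differed, the point being that the filename= lines point at the raw /dev/sdX devices instead of the LVs):

  [global]
  rw=read
  size=1280m
  blocksize=64k
  ioengine=sync
  direct=1
  [raw1]
  filename=/dev/sdd
  [raw2]
  filename=/dev/sde
  [raw3]
  filename=/dev/sdf
  [raw4]
  filename=/dev/sdg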

Completion latency and bandwidth log plots (from fio) of a job with poor throughput are here:

http://folk.uio.no/jb/strange-lvm/bad-clat.png
http://folk.uio.no/jb/strange-lvm/bad-bw.png

Summary output from fio after the bad condition is here:
http://folk.uio.no/jb/strange-lvm/bad-fio.out

Summary output from fio after the good condition is here:
http://folk.uio.no/jb/strange-lvm/good-fio.out

Complete blktraces of the LVM devices for the bad condition are here:
http://folk.uio.no/jb/strange-lvm/bad-blktraces.tgz

Complete blktraces of the LVM devices for the good condition are here:
http://folk.uio.no/jb/strange-lvm/good-blktraces.tgz
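
The traces were captured with blktrace against the LV device-mapper nodes, roughly like this (the exact invocation may have differed slightly):

  for i in b2 b3 b4 b5 ; do
      blktrace -d /dev/mapper/vg_perc-lv_${i} -o lv_${i} &    # one tracer per LV
  done
  # ... run the fio job, stop the tracers, then inspect with e.g.
  # blkparse -i lv_b2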

I have not been able to find out what is triggering the problem; it happens at random times. If I leave it running in the poor condition, it sometimes recovers again after a while.

If anyone could help me shed some light on what is going on, or suggest how to proceed, it would be much appreciated.

I have also tried doing the reads through an ext3 filesystem instead of directly against the device, and the problem is the same.
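
For that test an LV was formatted and mounted, and the fio job pointed at a file instead of the block device; roughly like this (mount point and file name are arbitrary):

  mkfs.ext3 /dev/mapper/vg_perc-lv_b2
  mkdir -p /mnt/b2 && mount /dev/mapper/vg_perc-lv_b2 /mnt/b2
  # and in the job file, e.g.:
  #   [job1]
  #   filename=/mnt/b2/testfile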

[1] http://freshmeat.net/projects/fio/

Best rgds
Jarle Bjørgeengen


