Hi all:

I am adding pgiosim to our testing for new database hardware, and I am seeing
something I don't quite get. I think it's because I am using pgiosim incorrectly.

Specs:
  OS: CentOS 5.5
  kernel: 2.6.18-194.32.1.el5
  memory: 96GB
  cpu: 2x Intel(R) Xeon(R) X5690 @ 3.47GHz (6 core, HT enabled)
  disks: WD2003FYYS RE4
  raid: LSI 9260-4i with 8 disks in raid 10, 1MB stripe size,
        raid cache enabled w/ BBU, disk caches disabled
  filesystem: ext3 created with -E stride=256

I am seeing really poor iops (about 70) with pgiosim. According to
http://www.tomshardware.com/reviews/2tb-hdd-7200,2430-8.html, in their database
benchmark they are seeing ~170 iops on a single one of these drives. I would
expect an 8 disk raid 10 to do better than 3x the single disk rate (assuming
the data is randomly distributed).

To test I am using 5 100GB files with:

  sudo ~/pgiosim -c -b 100G -v file?

I am using 100GB files to make sure that the data read and the file sizes
exceed the memory size of the system.

However, if I use 5 1GB files (and still 100GB of data read) I see 200+ to
400+ iops by the time 50% of the 100GB has been read, which I assume means the
data is cached in the OS cache and I am not really measuring hard drive/raid
iops. On the other hand, IIUC postgres will never have a table or index file
greater than 1GB in size
(http://www.postgresql.org/docs/8.4/static/storage-file-layout.html) and will
just add 1GB segments, so 1GB files seem more realistic.

So do I want 100 (or probably 2 or 3 times more, say 300) 1GB files to feed
pgiosim? That way I will have enough data that not all of it can be cached in
memory, and the file sizes (and file operations: open/close) more closely
match what postgres is doing with its segment files. (Something like the
sketch at the end of this mail is what I have in mind.)

Also, in the output of pgiosim I see:

  25.17%, 2881 read, 0 written, 2304.56kB/sec 288.07 iops

which I interpret (left to right) as the % of the 100GB that has been read,
the number of reads and writes over some time period, the kB/sec read, and
the io operations/sec. Iops always seems to be about 1/10th of the read
count. Is this expected, and if so does anybody know why? (The numbers would
make sense if each line covers a ~10 second interval of 8kB reads, but I
haven't verified that.)

While this is running, if I also run "iostat -p /dev/sdc 5" I see:

  Device:    tps      Blk_read/s  Blk_wrtn/s  Blk_read  Blk_wrtn
  sdc        166.40   2652.80     4.80        13264     24
  sdc1       2818.80  1.20        999.20      6         4996

which I am interpreting as 2818 io operations/sec to the partition
(corresponding more or less to the read count in the pgiosim output), of
which only about 166 actually go to the drive(?), with the rest handled from
the OS cache. However, the tps isn't increasing when I see pgiosim reporting:

  48.47%, 4610 read, 0 written, 3687.62kB/sec 460.95 iops

An iostat 5 output near the same time reports:

  Device:    tps      Blk_read/s  Blk_wrtn/s  Blk_read  Blk_wrtn
  sdc        165.87   2647.50     4.79        13264     24
  sdc1       2812.97  0.60        995.41      3         4987

so I am not sure there is a correlation between pgiosim's read numbers and
iostat's tps. Also, I am assuming the blocks written are filesystem metadata,
although that seems like a lot of data. If I stop pgiosim, iostat drops to 0
reads and writes as expected.

So does anybody have any comments on how to test with pgiosim and how to
correlate the iostat and pgiosim outputs?

Thanks for your feedback.

--
                                -- rouilj
John Rouillard       System Administrator
Renesys Corporation  603-244-9084 (cell)  603-643-9300 x 111
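
P.S. Here is a rough, untested sketch of the 1GB file setup I am describing
above. The directory name and file count are just examples, and the pgiosim
flags are the same ones used for the runs quoted in this mail:

  #!/bin/sh
  # Build 300 1GB test files (~3x RAM) to mimic postgres' 1GB relation
  # segments, then run pgiosim against all of them.
  TESTDIR=/data/pgiosim-test   # assumed scratch dir on the raid 10 volume
  NFILES=300

  mkdir -p "$TESTDIR"
  i=1
  while [ $i -le $NFILES ]; do
      # one 1GB file per iteration, same size as a postgres segment
      dd if=/dev/zero of="$TESTDIR/seg$i" bs=1M count=1024 2>/dev/null
      i=`expr $i + 1`
  done

  # start cold: flush dirty pages and drop the OS page cache
  sudo sh -c 'sync; echo 3 > /proc/sys/vm/drop_caches'

  # same flags as before: read 100GB total, now spread across many 1GB files
  sudo ~/pgiosim -c -b 100G -v "$TESTDIR"/seg*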