Re: help tuning queries on large database

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Jan 8, 2006, at 1:42 PM, Luke Lonergan wrote:
Have you tested the underlying filesystem for it's performance? Run this:
  time bash -c 'dd if=/dev/zero of=/my_file_system/bigfile bs=8k
count=<your_memory_size_in_GB * 250000> && sync'

This is a 2-disk RAID0

[root@bigboy /opt/alt-2]# time bash -c 'dd if=/dev/zero of=/opt/alt-2/ bigfile bs=8k count=1000000 && sync'
1000000+0 records in
1000000+0 records out

real    1m27.143s
user    0m0.276s
sys     0m37.338s

'iostat -x' showed writes peaking at ~100MB/s


Then run this:
  time dd if=/my_file_system/bigfile bs=8k of=/dev/null

[root@bigboy /opt/alt-2]# time dd if=/opt/alt-2/bigfile bs=8k of=/dev/ null
1000000+0 records in
1000000+0 records out

real    1m9.846s
user    0m0.189s
sys     0m11.099s

'iostat -x' showed reads peaking at ~116MB/s


Again with kernel 2.6.15:

[root@bigboy ~]# time bash -c 'dd if=/dev/zero of=/opt/alt-2/bigfile bs=8k count=1000000 && sync'
1000000+0 records in
1000000+0 records out

real    1m29.144s
user    0m0.204s
sys     0m48.415s

[root@bigboy ~]# time dd if=/opt/alt-2/bigfile bs=8k of=/dev/null
1000000+0 records in
1000000+0 records out

real    1m9.701s
user    0m0.168s
sys     0m11.933s


And report the times here please. With your 8 disks in any of the RAID0 configurations you describe, you should be getting 480MB/s. In the RAID10
configuration you should get 240.

Not anywhere near that. I'm scouring the 'net looking to see what needs to be tuned at the HW level.

You should also experiment with using larger readahead, which you can
implement like this:
  blockdev --setra 16384 /dev/<my_block_device>

E.g. "blockdev --setra 16384 /dev/sda"

wow, this helped nicely. Without using the updated kernel, it took 28% off my testcase time.

From what you describe, one of these is likely:
- hardware isn't configured properly or a driver problem.

Using the latest Areca driver, looking to see if there is some configuration that was missed.

- you need to use xfs and tune your Linux readahead

Will try XFS soon, concentrating on the 'dd' speed issue first.


On Jan 8, 2006, at 4:35 PM, Ron wrote:
Areca ARC-1220 8-port PCI-E controller

Make sure you have 1GB or 2GB of cache. Get the battery backup and set the cache for write back rather than write through.

The card we've got doesn't have a SODIMM socket, since its only an 8- port card. My understanding was that was cache used when writing?

A 2.6.12 or later based Linux distro should have NO problems using more than 4GB or RAM.

Upgraded the kernel to 2.6.15, then we were able to set the BIOS option for the 'Memory Hole' to 'Software' and it saw all 4G (under 2.6.11 we got a kernel panic with that set)

RAID Layout:

4 2-disk RAID0 sets created
You do know that a RAID 0 set provides _worse_ data protection than a single HD? Don't use RAID 0 for any data you want kept reliably.

yup, aware of that. was planning on RAID10 for production, but just broke it out into RAID0 sets for testing (from what I read, I gathered that the read performance of RAID0 and RAID10 were comparable)


thanks for all the suggestions, I'll report back as I continue testing.

-pete

--
(peter.royal|osi)@pobox.com - http://fotap.org/~osi

Attachment: smime.p7s
Description: S/MIME cryptographic signature


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux