hank peng wrote:
I am new to this area, so I'm not quite familiar with some of the terms
you mentioned.
The machine has a SATA controller (the chip is a Marvell 6081) attached
to the PCI-X bus. Five SATA II disks are attached to it.
Each disk has 500GB of space.
The following is my procedure:
#mdadm -C /dev/md0 -l5 -n5 /dev/sd{a,b,c,d,e}
After recovery is done, I do this:
#pvcreate /dev/md0
#vgcreate myvg /dev/md0
#lvcreate -n mylv -L 1000G myvg
#mkfs.xfs /dev/myvg/mylv or #mkfs.reiserfs /dev/myvg/mylv
Then I mount this filesystem and begin to use it.
I mainly want to optimise its sequential write performance; IOPS is not
my concern.
When you create a PV, it usually begins with a 192KiB metadata area
(this can be controlled by the --metadatasize option of pvcreate).
Extents follow (4MiB by default). As far as alignment is concerned, the
best-case scenario is when the extents are aligned with the raid's
stripe. That's possible in your case, but not in general - extents must
be a power of 2.
Try:
pvcreate --metadatasize 250K /dev/md0 (250K will be rounded up properly)
...and verify
pvs /dev/md0 -o+pe_start
...you should get 256.00K under "1st PE". 256K is your stripe size: md
defaults to a 64KiB chunk (which you haven't altered), and with 5 disks
in raid5 that's 4 data disks x 64KiB = 256KiB per full stripe.
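For reference, the chunk size is fixed at array creation time - mdadm
takes it via --chunk, in KiB. Purely as an illustration (recreating the
array is destructive, so this is not a suggestion to do it now), a
larger chunk would scale the alignment math above accordingly, e.g.
4 x 256KiB = 1MiB per stripe:
mdadm -C /dev/md0 -l5 -n5 --chunk=256 /dev/sd{a,b,c,d,e}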
Most filesystems allow setting stripe and chunk parameters - ext{2,3,4}
and xfs, to name a few. These are used to, e.g., lay out internal
structures more optimally and to avoid read-modify-write cycles where
possible. I don't know if reiser has such settings, but xfs certainly
does (look for the su/sw options of mkfs.xfs; see the sketch below).
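For your layout (5-disk raid5, default 64KiB chunk, so 4 data disks per
stripe), the invocations would look roughly like this - a sketch,
assuming 4KiB filesystem blocks for the ext case:
mkfs.xfs -d su=64k,sw=4 /dev/myvg/mylv
mkfs.ext3 -E stride=16,stripe-width=64 /dev/myvg/mylv
(stride = chunk / block size = 64KiB / 4KiB = 16; stripe-width =
stride x data disks = 64; stripe-width requires a reasonably recent
e2fsprogs)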
Note that when you create filesystems on logical volumes, they will not
detect the raid structure underneath the lvm - you have to set it
manually, as above. If the extents are not aligned, then any
stripe-related settings will be meaningless (as the filesystem assumes
it starts at a stripe boundary itself). The chunk size setting may
still be useful, though.
Another easily forgotten parameter is the LV's readahead. If not set
explicitly, it defaults to 256 sectors (128KiB), which is quite a small
value. You can change it with blockdev or lvchange (permanently with
the latter); see below. RA set on md0 directly doesn't matter afaik,
unless you also plan to set up filesystems directly on it.
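For instance - the 8192 sectors (4MiB) here are just an illustration,
tune to your workload:
blockdev --setra 8192 /dev/myvg/mylv (effective immediately, lost on reactivation)
lvchange -r 8192 myvg/mylv (stored in the lvm metadata, so it persists)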
Check out /sys/class/block/md0/md/stripe_cache_size (or /sys/block/.. if
you use the old sysfs layout) and increase it if you have memory to spare.
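The value is in pages per device, so the memory cost is
stripe_cache_size x 4KiB x number of disks. The 8192 below is just an
illustration - with your 5 disks it would pin 8192 x 4KiB x 5 = 160MiB
of RAM:
echo 8192 > /sys/block/md0/md/stripe_cache_size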
Increasing RA and stripe_cache_size can provide a very significant
boost. Forgetting about the former is a common cause of complaints
about lvm performance (when compared to md used directly).
There's definitely more to it (such as filesystem-specific creation and
mount options, or more fundamentally - the choice of filesystem). Best
to wait for David's input.