Re: "Missing" RAID devices

On 5/22/2013 6:26 PM, Phil Turmel wrote:
> On 05/22/2013 06:43 PM, Stan Hoeppner wrote:
>> Sorry for the dup Phil, hit the wrong reply button.
> 
> No worries.
> 
>> On 5/21/2013 7:02 PM, Phil Turmel wrote:
>> ...
>>> ...First is /dev/md1, a small (~500m) n-way
>>> ...as /boot.  The other, /dev/md2, uses
>>> ...raid10,far3 or raid6.
>>>
>>> I put LVM on top of /dev/md2, with LVs for swap, ... /tmp
>>
>> Swap and tmp atop an LV atop RAID6?  The former will always RMW on page
>> writes, the latter quite often will cause RMW.  As you stated your
>> performance requirements are modest.  However, for the archives, putting
>> swap on a parity array, let alone a double parity array, is not good
>> practice.
> 
> Ah, good point.  Hasn't hurt me yet, but it would if I pushed anything
> hard.  I'll have to revise my baseline to always have a small raid10,f3
> to go with the raid6.

Yeah, the kicker here is that swap on a parity array seems to work fine,
right up until the moment it doesn't.  And that's when the kernel goes
into heavy swapping due to any number of causes.  When that happens,
you're heavily into RMW, the disk heads are thrashing, and latency goes
through the roof.  If any programs are trying to access files on the
parity array, say a mildly busy IMAP or FTP server, everything grinds
to a halt.

With your particular setup, instead you might use n additional
partitions, one each across the physical disks that comprise your n-way
RAID1.  You would configure the partition type of each as (82) Linux
swap, and add them all to fstab with equal priority.  The kernel will
interleave the 4KB swap page writes evenly across all of these
partitions, yielding swap performance similar to an n-way RAID0 stripe.
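
For the archives, a minimal sketch of what those fstab entries could look
like.  The partition names (/dev/sda3 and so on) and the priority value
are assumptions for illustration only; the one thing that matters is
that every entry carries the same pri= value so the kernel treats the
partitions as equals.

```python
def swap_fstab_lines(partitions, priority=1):
    """Return one fstab line per swap partition, all with the same priority."""
    return [f"{dev}  none  swap  sw,pri={priority}  0  0" for dev in partitions]


# Hypothetical partition names, one per physical disk; substitute your own.
parts = ["/dev/sda3", "/dev/sdb3", "/dev/sdc3", "/dev/sdd3"]

for line in swap_fstab_lines(parts):
    print(line)
```

Each partition still needs mkswap run on it before the entries are of
any use.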

The downside to this setup is the kernel probably crashes if you lose
one of these disks and thus the swap partition on it.  To avoid that,
you could build md/RAID1 from these n partitions: a single n-way RAID1
if n is an odd number of spindles, or n/2 two-disk RAID1 arrays if n is
even.  Then put one swap partition on each RAID1 device and, in the
even case, interleave swap across the RAID1 pairs as described above
for the non-RAID case.
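
A rough sketch of that grouping rule, printing the mdadm commands it
would lead to.  The partition names and the starting md number
(/dev/md3) are hypothetical, and pairing neighbours in list order is my
assumption; pair them however suits your controller layout.

```python
def plan_swap_mirrors(partitions):
    """Group swap partitions into RAID1 sets: pairs if n is even, one n-way set if odd."""
    n = len(partitions)
    if n < 2:
        raise ValueError("need at least two partitions to mirror")
    if n % 2 == 1:
        return [list(partitions)]
    return [list(partitions[i:i + 2]) for i in range(0, n, 2)]


def mdadm_commands(groups, first_md=3):
    """Build one 'mdadm --create' command per RAID1 set (md numbers are hypothetical)."""
    cmds = []
    for k, members in enumerate(groups):
        cmds.append(
            f"mdadm --create /dev/md{first_md + k} --level=1 "
            f"--raid-devices={len(members)} " + " ".join(members)
        )
    return cmds


# Hypothetical partition names, one per physical disk.
parts = ["/dev/sda3", "/dev/sdb3", "/dev/sdc3", "/dev/sdd3"]
for cmd in mdadm_commands(plan_swap_mirrors(parts)):
    print(cmd)
```

After creating the arrays you would mkswap each md device and list them
in fstab with equal priority, same as in the non-RAID case.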

The reason for this last configuration is simple: more swap throughput
for the same number of physical writes.  With a 4 drive RAID1 and a
single swap partition atop, each 4KB page write to swap generates a 4KB
write to each of the 4 disks, 16KB total.  If you create two RAID1s and
put a swap partition on each and interleave them, each 4KB page write to
swap generates only two 4KB writes, 8KB total.  Here for each 16KB
written you're moving two pages to swap instead of one.  Thus your swap
bandwidth is doubled.  But you still have redundancy and crash avoidance
if one disk fails.  You may be tempted to use md/RAID10 of some layout
to optimize for writes, but you'd gain nothing, and you'd lose some
performance due to overhead.  The partitions you'll be using in this
case are so small that they span only a narrow band of adjacent tracks,
thus very little head movement is required to seek between sectors,
mostly rotation of the platter.
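
The write arithmetic above, as a small sketch.  It assumes 4KB pages and
that every mirror member receives a full copy of each page; the function
names are mine, for illustration.

```python
PAGE_KB = 4  # size of one swap page, as in the discussion above


def physical_kb_per_page(mirror_width):
    """KB physically written when one page goes to a RAID1 of the given width."""
    return PAGE_KB * mirror_width


def pages_per_budget(budget_kb, mirror_width):
    """How many pages reach swap for a given amount of physical writes."""
    return budget_kb // physical_kb_per_page(mirror_width)


for width, label in [(4, "one 4-way RAID1"), (2, "two 2-way RAID1s"), (1, "no mirroring")]:
    print(f"{label}: {physical_kb_per_page(width)}KB written per page, "
          f"{pages_per_budget(16, width)} page(s) per 16KB written")
```

For the same 16KB of physical writes, the 2-way layout moves two pages
where the 4-way layout moves one.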

Another advantage to this hybrid approach is less disk space consumed.
If you need 8GB of swap, a 4-way RAID1 swap partition requires 32GB of
disk space, 8GB per disk.  With the n/2 RAID1 approach and 4 disks it
requires half that, 16GB.  With the no-redundancy interleaved approach
it requires 1/4th, only 2GB per disk, 8GB total.  With today's
mechanical disk capacities this isn't a concern.  But if using SSDs it
can be.
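
The same space comparison as a quick calculation, assuming 4 disks and
8GB of usable swap as in the example above.  A mirror width of 1 stands
for the plain interleaved partitions with no RAID1 underneath.

```python
def raw_space_gb(swap_gb, disks, mirror_width):
    """Return (GB used per disk, GB used in total) for the requested usable swap."""
    arrays = disks // mirror_width   # independent swap devices to interleave across
    per_disk = swap_gb / arrays      # each array holds an equal share of the swap
    return per_disk, per_disk * disks


for width, label in [(4, "4-way RAID1"), (2, "n/2 RAID1 pairs"), (1, "no redundancy")]:
    per_disk, total = raw_space_gb(8, 4, width)
    print(f"{label}: {per_disk:g}GB per disk, {total:g}GB total")
```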

> Meanwhile, I'm applying some of the general ideas I've seen from you:
> I've acquired a pair of Crucial M4 SSDs for my new home media server to
> keep small files and databases away from the bulk storage.  Not in
> service yet, but I'm very pleased so far.

If the two are competing for seeks thus slowing everything down, moving
the random access stuff to SSD should help.

> I'm pretty sure the new kit is way overkill for a media server... :-)

Not so many years ago folks would have said the same about 4TB mech
drives. ;)

-- 
Stan
