On 24/05/13 08:32, keld@xxxxxxxxxx wrote:
> On Thu, May 23, 2013 at 10:45:56PM -0500, Stan Hoeppner wrote:
>> On 5/23/2013 3:30 AM, keld@xxxxxxxxxx wrote:
>>> On Thu, May 23, 2013 at 12:59:39AM -0500, Stan Hoeppner wrote:
>>
>>>> You may be tempted to use md/RAID10 of some layout
>>>> to optimize for writes, but you'd gain nothing, and you'd lose some
>>>> performance due to overhead. The partitions you'll be using in this
>>>> case are so small that they easily fit in a single physical disk track,
>>>> thus no head movement is required to seek between sectors, only rotation
>>>> of the platter.
>> ...
>>> I think a raid10,far3 is a good choice for swap; then you will enjoy
>>> RAID0-like reading speed, good write speed (compared to raid6),
>>> and a chance of surviving live if just one drive keeps functioning.
>>
>> As I mention above, none of the md/RAID10 layouts will yield any added
>> performance benefit for swap partitions. And I state the reason why.
>> If you think about this for a moment you should reach the same conclusion.
>
> I think it is you who are not fully acquainted with Linux MD. Linux
> MD RAID10,far3 offers improved performance for single reads, which is an
> advantage for swap when you are swapping in. Think about it and try it
> out for yourself. This is especially true if we are talking 3 drives
> (far3), but also with more drives and only 2 copies. You don't get raid0
> read performance in Linux on a combination of raid1 and raid0.

I think you are getting a number of things wrong here.

For general usage, especially on a two-disk system, raid10,f2 is very often an excellent choice of setup - it gives you protection (two copies of everything) and fast reads (you get striped read performance, and always from the faster outer half of the disk). You pay a higher write latency compared to plain raid1, but with typical usage figures of 5 reads per write, that's fine. And normally you don't have to wait for writes to finish anyway.
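For reference, the two-disk raid10,f2 setup described above is a one-liner with mdadm. A sketch, assuming /dev/sda2 and /dev/sdb2 are the two partitions (placeholder names, not from the thread - substitute your own):

```shell
# Create a two-disk raid10 array using the "far 2" layout:
# two copies of everything, with striped reads served from
# the faster outer halves of both disks.
mdadm --create /dev/md0 --level=10 --layout=f2 --raid-devices=2 \
      /dev/sda2 /dev/sdb2

# Check the result - the "Layout" line of the output should
# report the far=2 layout rather than the default near layout:
mdadm --detail /dev/md0
```

The `--layout` flag also accepts n2 (near) and o2 (offset) variants; f2 is the one that gives the raid0-like read behaviour discussed here.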
But swap is different in many ways. First, the read/write ratio for swap is much closer to 1 - it can even be lower than 1. (Things like startup code for programs can get pushed to swap and never read again, as can leaked memory from buggy programs.) Secondly, write latency is a big factor - data is pushed to swap to free up memory for other usage, and that has to wait until the write is complete. Thirdly, the kernel will handle striping of multiple swap partitions automatically. And it will do it in a way that is optimal for swap usage, rather than using the chunk sizes of a striped raid system. (More often, the kernel wants parallel access to different parts of swap, rather than single large reads or writes.)

One thing that seems slightly confused in this thread is the mixup between the number of mirror copies and the number of drives in raid10 setups. With md raid, you can have as many mirrors as you like over as many drives as you like, though you need at least as many partitions as mirrors (and it seldom makes sense to have more mirrors than drives). For example, if you have 3 disks, you can use the "far3" layout to get three copies of your data - one copy on each disk. But you can also use "far2", and get two copies of your data. See <http://en.wikipedia.org/wiki/Non-standard_RAID_levels#Linux_MD_RAID_10> for some pictures. With plain raid1, if you use 3 drives you get three copies. It seems unlikely to me that you would need the "safe against two disk failure" protection of 3-way mirrors on swap, but it is possible.

Back to swap. If you don't need protection for your swap (swap should not often be in use, and a dead disk will lead to crashes of swapped-out processes but should not cause more problems than that), put a small partition on each disk, and add them all to swap. The kernel will handle striping of the swap partitions. There is nothing you can do to make it faster. When you want protection, raid1 is your best choice.
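To make the "small partition on each disk" advice concrete, here is a sketch for three disks. The partition names /dev/sda3, /dev/sdb3 and /dev/sdc3 are hypothetical; the key point is that the pri= values must be equal, which is what makes the kernel interleave swap pages across all the partitions:

```shell
# Prepare one small swap partition per disk (placeholder names):
mkswap /dev/sda3
mkswap /dev/sdb3
mkswap /dev/sdc3

# In /etc/fstab, equal priorities tell the kernel to stripe swap
# pages round-robin across all three partitions - no md layer needed:
#   /dev/sda3  none  swap  sw,pri=10  0  0
#   /dev/sdb3  none  swap  sw,pri=10  0  0
#   /dev/sdc3  none  swap  sw,pri=10  0  0

swapon -a
swapon --show    # lists the active swap areas and their priorities
```

If the priorities differ, the kernel fills the highest-priority area first instead of striping, so equal values are essential to this setup.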
Make small partitions on each disk, then pair them up as a number of raid1 pairs, and add each of these as swap. Your system will survive any disk failure, or multiple failures as long as they are from different pairs. Again, there is nothing you can do to make it faster.

The important factor here is to minimise write latency. You do that by keeping the layers as simple as possible - raid1 is simpler and faster than raid10 on two disks. With small partitions, head movement and the bandwidth differences between inner and outer tracks make no difference, so the "far" layout is of no benefit.

Theoretically, a set of raid10,f2 pairs rather than raid1 pairs would allow faster reading of large chunks of swap - assuming, of course, that the rest of the system supports such large I/O bandwidth. But such large streaming reads do not often happen with swap - more commonly, the kernel will jump around in its accesses. Large reads that use all spindles are good for the throughput of large streamed reads, but they also tie up all the disks and increase the latency of random accesses, which are the common case for swap.

I'm a great fan of raid10,f2 - I think it is an optimal choice for many uses, and it shows a power and flexibility in Linux's md system well above what you can get with hardware raid (or software raid on other OSes). But for swap, you want raid1.
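As a concrete sketch of the raid1-pair approach for four disks (again, all device names are placeholders, not taken from the thread - adapt them to your own system):

```shell
# Pair up small partitions from four disks as two simple raid1 mirrors:
mdadm --create /dev/md10 --level=1 --raid-devices=2 /dev/sda3 /dev/sdb3
mdadm --create /dev/md11 --level=1 --raid-devices=2 /dev/sdc3 /dev/sdd3

# Make each mirror a swap device:
mkswap /dev/md10
mkswap /dev/md11

# Enable both at the same priority, so the kernel itself stripes
# swap pages across the two mirrored pairs:
swapon -p 10 /dev/md10
swapon -p 10 /dev/md11
```

This survives any single disk failure (and a double failure, as long as the two dead disks belong to different pairs), while leaving the striping to the kernel rather than to md - which is the whole point of the argument above.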