On 31 October 2014 19:53, Matt Garman <matthew.garman@xxxxxxxxx> wrote:
> In a later post, you said you had a 4-to-1 scheme, but it wasn't clear to me
> if that was 1 drive worth of data, and 4 drives worth of checksum/backup, or
> the other way around.

I was wondering if anybody would catch that slip. I meant 4 data drives to 1
parity drive, which seems about the right mix to me so far, based on my
reading and feel for the probability of drive failure.

> In your proposed scheme, I assume you want your actual data drives to be
> spinning all the time? Otherwise, when you go to read data (play
> music/videos), you have the multi-second spinup delay... or is that OK with
> you?

Well, actually, in my experience with 6-8 drives of 2-4TB each, there is a
lot of music/video content that I don't end up playing that often. Those
drives can easily be spun down (maybe for days on end, and at least all
night), and a small initial (one-time) delay before playing a file whose
drive hasn't been accessed seems like a good trade-off (both for power and
for drive life).

> Some other considerations: modern 5400 RPM drives generally consume less
> than five watts in idle state [1]. Actual AC draw will be higher due to
> power supply inefficiency, so we'll err on the conservative side and say
> each drive requires 10 AC watts of power. My electrical rates in Chicago
> are about average for the USA (11 or 12 cents/kWh), and conveniently it
> roughly works out such that one always-on watt costs about $1/year. So,
> each always-running hard drive will cost about $10/year to run, less with a
> more efficient power supply. I know electricity is substantially more
> expensive in many parts of the world; or maybe you're running off-the-grid
> (e.g. solar) and have a very small power budget?

Besides the cost, there is an environmental aspect. If something is more
efficient and extends the life of the product, isn't it a good thing
wherever we live on the planet? BTW, great calculation, but I moved back
(to India) from San Francisco some time ago :) and the electricity cost
here is quite high (and availability of supply is not 100% yet). I'd like
to maximize my backups, and keeping disks spinning when they go unused for
hours on end sounds bad. Just to add, internet access is metered per GB in
many parts of the world (and in mine, sadly :( ) for high-speed access
(meaning 4-8 MBps), so I have to store content locally (before cloud
suggestions are thrown around).

> On Wed, Oct 29, 2014 at 2:15 AM, Anshuman Aggarwal
> <anshuman.aggarwal@xxxxxxxxx> wrote:
>>
>> - SnapRAID (http://snapraid.sourceforge.net/) which is a snapshot-based
>> scheme (Its advantages are that it's in user space and has cross-platform
>> support, but it has the huge disadvantage of every checksum being done
>> from scratch, slowing the system, causing immense wear and tear on every
>> snapshot, and also losing any information updates up to the snapshot
>> point etc.)
>
> Last time I looked at SnapRAID, it seemed like yours was its target use
> case. The "huge disadvantage of every checksum being done from scratch"
> sounds like a SnapRAID feature enhancement that might be
> simpler/easier/faster-to-get done than a major enhancement to the Linux
> kernel (just speculating though).

SnapRAID can't be enhanced this way without involving the kernel, because a
delta checksum requires knowing which blocks were written to, and only a
kernel-level driver can know that. This is a hard reality, with no way
around it, and that was my reason to propose this. A toy sketch of the idea
follows below.
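To make the idea concrete, here is a toy user-space sketch in Python (all
names are mine and purely illustrative, not a real driver API): writes mark
stripes dirty in a bitmap-like set, and a deferred flush recomputes XOR
parity only for those dirty stripes, instead of rescanning every disk from
scratch the way a snapshot sync does.

    BLOCK_SIZE = 4096

    class DeltaParityArray:
        def __init__(self, data_devs, parity_dev):
            self.data_devs = data_devs    # e.g. 4 data "devices" (plain files here)
            self.parity_dev = parity_dev  # 1 parity device
            self.dirty = set()            # stripes written since the last flush

        def write_block(self, dev_index, stripe_no, data):
            # This interception is the part that must live in the kernel:
            # only a block-level driver sees every write.
            dev = self.data_devs[dev_index]
            dev.seek(stripe_no * BLOCK_SIZE)
            dev.write(data)
            self.dirty.add(stripe_no)

        def flush_parity(self):
            # Deferred flush, run e.g. 2-3 times a day from a timer.
            for stripe_no in sorted(self.dirty):
                parity = bytes(BLOCK_SIZE)
                for dev in self.data_devs:
                    dev.seek(stripe_no * BLOCK_SIZE)
                    chunk = dev.read(BLOCK_SIZE).ljust(BLOCK_SIZE, b"\0")
                    parity = bytes(a ^ b for a, b in zip(parity, chunk))
                self.parity_dev.seek(stripe_no * BLOCK_SIZE)
                self.parity_dev.write(parity)
            self.dirty.clear()

The point is that the dirty set stays tiny for an infrequently written
archive, so each flush touches only a handful of stripes on the parity
drive.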
> But, on the other hand, by your use case description, writes are very
> infrequent, and you're willing to buffer checksum updates for quite a
> while... so what if you had a *monthly* cron job to do parity syncs?
> Schedule it for a time when the system is unlikely to be used to offset
> the increased load. That's only 12 "hard" tasks for the drive per year.
> I'm not an expert, but that doesn't "feel" like a lot of wear and tear.

Well, again, anything between infrequent updates and weekly or monthly
crons sounds like a bad compromise either way, when a better incremental
update could keep the checksums in a buffer and write them out eventually
(2-3 times a day), along the lines of the flush in the sketch above. Almost
always the buffer will get written out, giving us an up-to-date parity with
little to no "extra" wear and tear.

> On the issue of wear and tear, I've mostly given up trying to understand
> what's best for my drives. One school of thought says many spinup-spindown
> cycles are actually harder on the drive than running 24/7. But maybe
> consumer drives actually aren't designed for 24/7 operation, so they're
> better off being cycled up and down. Or consumer drives can't handle the
> vibrations of being in a case with other 24/7 drives. But failure to
> "exercise" the entire drive regularly enough might result in a situation
> where an error has developed but you don't know until it's too late or
> your warranty period has expired.

You are right about consumer drives, where spin-downs are good: a spin-down
timeout of an hour or so should avoid unnecessary spin up/down cycles. Once
spun down, most drives may stay that way for days, which is better for all
of us (energy, wastage of drives, etc.). Spin-down technology is also
progressing faster than block failure is (block density keeps going up,
making media failure, not head failure, the primary cause of drive outage).

The drives can be tested periodically (by a non-destructive badblocks run,
for example; see the sketch at the end of this mail) as a pure testing
exercise to find errors as they develop. There is no need to needlessly
stress the drives by reading/writing all parts of them continuously. Also,
RAID speeds are often no longer required, given the higher R/W throughput
now coming from the drives themselves.

Thanks for reading and writing such a thorough reply.

Neil, would you be willing to assist/guide in designing this, or advise on
the best approach to the same? I would like to avoid the obvious pitfalls
that any new kernel block-level device writer is bound to face.

Regards,
Anshuman

> [1] http://www.silentpcreview.com/article29-page2.html
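PS: By "tested periodically" above, I mean something as simple as the
cron-able Python sketch below. The device names and scheduling are
placeholders of mine, and note that badblocks in -n mode must only be run
on an unmounted device.

    #!/usr/bin/env python3
    import subprocess
    import sys

    DEVICES = ["/dev/sdb", "/dev/sdc"]  # placeholder array members

    def is_spun_up(dev):
        # hdparm -C reports the drive's power state; skip drives in
        # standby so the test itself doesn't cause extra spin-ups.
        out = subprocess.run(["hdparm", "-C", dev],
                             capture_output=True, text=True).stdout
        return "active/idle" in out

    for dev in DEVICES:
        if not is_spun_up(dev):
            print(f"{dev}: in standby, skipping this round")
            continue
        # -n: non-destructive read-write test, -s: show progress
        rc = subprocess.run(["badblocks", "-n", "-s", dev]).returncode
        if rc != 0:
            print(f"{dev}: badblocks reported problems (rc={rc})",
                  file=sys.stderr)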