On Wed, Apr 8, 2015 at 11:15 PM, Shaohua Li <shli@xxxxxx> wrote:
> On Thu, Apr 09, 2015 at 03:04:59PM +1000, NeilBrown wrote:
>> On Wed, 8 Apr 2015 17:43:11 -0700 Shaohua Li <shli@xxxxxx> wrote:
>>
>> > Hi,
>> > This is what I'm working on now, and hopefully I'll have the basic code
>> > running next week. The new design will do caching and fix the write-hole
>> > issue too. Before I post the code, I'd like to check whether the design
>> > has any obvious issues.
>>
>> I can't say I'm excited about it....
>>
>> You still haven't explained why you would ever want to read data from the
>> "cache". Why not just keep everything in the stripe cache until it is safe
>> in the RAID? I asked before and you said:
>>
>> >> I'm not enthusiastic about using the stripe cache though; we can't keep
>> >> all data in the stripe cache. What we really need is an index.
>>
>> which is hardly an answer. Why can't you keep all the data in the stripe
>> cache? How much data is there? How much memory can you afford to dedicate?
>>
>> You must have some very long sustained bursts of writes, much faster than
>> the RAID can accept, in order not to be able to keep everything in memory.
>>
>>
>> Your cache layout seems very rigid. I would much rather have a layout that
>> is very general and flexible. If you want to always allocate a chunk at a
>> time, then fine, but don't force that on the cache layout.
>>
>> The log really should be very simple: a block describing what comes next,
>> then lots of data/parity, then another block and more data, and so on.
>> Each metadata block points to the next one.
>> If you need an index of the cache, you keep that in memory. On restart,
>> you read all of the metadata blocks and build up the index.
>>
>> I think that space in the log should be reclaimed in exactly the order in
>> which it was written, so the active part of the log stays contiguous.
>> Obviously individual blocks become inactive in arbitrary order as they are
>> written to the RAID, but each extent of the log becomes free in order.
>> If you want that to happen out of order, you would need to present a very
>> good reason.
>
> I came to the same idea when I was thinking about a caching layer, but
> memory size is the main blocking issue. If the solution requires a large
> amount of extra memory, it isn't cost-effective, and therefore a hard sell
> to replace hardware RAID with software RAID. The design depends entirely
> on whether we can store all the data in memory. I don't have an answer yet
> for how much memory we should use to make the aggregation efficient. I
> guess only numbers can talk; I'll try to collect some data and get back to
> you.
>

Another consideration to keep in mind is persistent memory. I'm working on
an in-kernel mechanism to claim and map pmem, and a raid-write-cache is an
obvious first application. I'll include you on the initial submission of
that capability.
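
To make the simple log layout described above concrete -- a metadata block
describing the data/parity that follows it, each metadata block pointing to
the next, and the in-memory index rebuilt by replaying the chain on restart --
here is a minimal C sketch. The struct layout, field names, magic value, and
the log_replay() helper are illustrative assumptions for the sake of the
discussion, not the on-disk format of any existing md implementation.

/*
 * A minimal sketch of the log layout described above: a metadata block,
 * then the data/parity blocks it describes, then the next metadata block,
 * and so on. All struct and field names here are illustrative assumptions,
 * not an existing md on-disk format.
 */
#include <stdint.h>

#define LOG_BLOCK_SIZE	4096
#define LOG_MAGIC	0x6c6f6764	/* arbitrary value for this sketch */

/* One entry per data or parity block that follows the metadata block. */
struct log_payload_entry {
	uint64_t raid_sector;	/* where the block belongs in the array    */
	uint32_t sectors;	/* payload length in sectors                */
	uint32_t flags;		/* data vs. parity, etc.                    */
};

/* Metadata block written at the head of each log record. */
struct log_meta_block {
	uint32_t magic;
	uint32_t checksum;	/* checksum of this metadata block          */
	uint64_t seq;		/* monotonically increasing sequence number */
	uint64_t log_sector;	/* where this record sits in the log        */
	uint64_t next_sector;	/* log position of the next metadata block  */
	uint32_t nr_entries;	/* number of payload entries that follow    */
	uint32_t pad;
	struct log_payload_entry entries[];
};

/* Reads LOG_BLOCK_SIZE bytes at @sector into @buf; returns 0 on success. */
typedef int (*log_read_fn)(uint64_t sector, void *buf);

/*
 * Rebuild the in-memory index after a restart: walk the chain of metadata
 * blocks from the log tail until the chain breaks (bad magic or a
 * non-increasing sequence number; checksum verification elided here).
 * Returns the head of the log, i.e. where the next record would be written.
 */
static uint64_t log_replay(uint64_t tail_sector, log_read_fn read_block)
{
	uint8_t buf[LOG_BLOCK_SIZE] __attribute__((aligned(8)));
	struct log_meta_block *mb = (struct log_meta_block *)buf;
	uint64_t sector = tail_sector;
	uint64_t last_seq = 0;

	while (read_block(sector, buf) == 0) {
		if (mb->magic != LOG_MAGIC || mb->seq <= last_seq)
			break;
		/* ...add mb->entries[0..nr_entries) to the in-memory index... */
		last_seq = mb->seq;
		sector = mb->next_sector;
	}
	return sector;
}

Reclaim would then just advance the tail in the same order records were
written, so the active region of the log stays contiguous, as argued above.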