Re: Journal-guided Resynchronization for Software RAID

On Wednesday November 30, tedenehy@xxxxxxxxxxx wrote:
> 
> I will be presenting a paper at the upcoming USENIX FAST conference
> about using the ext3 file system journal to guide the software RAID
> resynchronization process.  I was wondering if you had any opinions
> as to the viability of this approach in Linux.
> 
> Here is a link to the paper:
> 
>     http://www.cs.wisc.edu/adsl/Publications/fast05-journal-guided.pdf
> 

Firstly, a couple of minor points:
   'multimillion dollar price tag' -- maybe a little bit of an
    exaggeration.

    bitmap intent logging for raid1 is now in the mainline kernel (2.6.14)
    and will be for raid4/5/6/10 in 2.6.16. (Your paper says it isn't
    yet, but gives no date or release to give context to your
    statement).


It would be really great if you could do your same tests with
bitmap-based intent logging and see how much it slows writes down.  I
suspect it would be more than with your declared mode, but definite
figures would be great.  (Your comparison on code size is certainly
interesting!).

I agree that closer communication between the filesystem and the
storage system is important to improve raid performance and reliability.
Your 'verified read' sounds like a very appropriate part of that.
There is an awkwardness when a raid array is partitioned as then you
might want the raid system to resync some parts, but leave the
filesystem to resync other parts (And a partition used for swap would
never need syncing at all).  Maybe some very coarse variety of the
bitmap intent log might be useful here (ext3 with declared mode would
tell raid never to set intent bits..).
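
To make the coarse-bitmap idea a little more concrete, here is a
minimal sketch in Python.  It is an illustration only and not md's
actual bitmap code: the class name, the method names, and the rule
that only regions lying wholly inside an exempted range are skipped
are all assumptions made for the example.

```python
class CoarseIntentBitmap:
    """Hypothetical coarse write-intent log: one counter per large region.

    Not md's real bitmap format; names and layout are assumptions.
    """

    def __init__(self, array_sectors, region_sectors):
        self.region_sectors = region_sectors
        # Ceiling division so a partial final region still gets a slot.
        self.nregions = -(-array_sectors // region_sectors)
        self.pending = [0] * self.nregions        # in-flight writes per region
        self.exempted = [False] * self.nregions   # resync is left to someone else

    def _touched(self, start, length):
        """Regions overlapped by the sector range [start, start + length)."""
        first = start // self.region_sectors
        last = (start + length - 1) // self.region_sectors
        return range(first, last + 1)

    def exempt(self, start, length):
        """Never set intent bits here (e.g. swap, or ext3 in declared mode).

        Only regions fully inside the range are exempted, so a region that
        straddles a partition boundary is still tracked by the raid layer.
        """
        first = -(-start // self.region_sectors)
        end = (start + length) // self.region_sectors
        for r in range(first, end):
            self.exempted[r] = True

    def begin_write(self, start, length):
        """Record intent before the write is issued to the member disks."""
        for r in self._touched(start, length):
            if not self.exempted[r]:
                self.pending[r] += 1

    def end_write(self, start, length):
        """Clear intent once the write has reached all member disks."""
        for r in self._touched(start, length):
            if not self.exempted[r]:
                self.pending[r] -= 1

    def regions_to_resync(self):
        """After a crash, only regions with outstanding intent need resync."""
        return [r for r, n in enumerate(self.pending) if n > 0]
```

With 100-sector regions, exempting sectors 200-499 and then starting a
write to sectors 150-249 leaves only region 1 marked for resync; the
exempted region 2 is left for the filesystem (or, for swap, for nobody).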


Adding full journalling to md is something I have considered from time
to time.  It would need NVRAM to be accepted, and at a couple of
thousand for such a board, it isn't common enough for me to justify
the effort....
What I would really like is a cheap (Well, not too expensive) board
that had at least 100Meg of NVRAM which was addressable on the PCI
bus, and an XOR and RAID-6 engine connected to the DMA engine.  Then
we could use the NVRAM as a write-behind cache and offload all the
parity calculation to it, while still having all the flexibility of
software raid...
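
For context, the XOR half of the parity calculation that such an
engine would take over can be sketched as below.  This is only the
standard RAID-4/5 XOR parity written out in Python for illustration;
the function name is made up for the example, and the second (Q)
syndrome that RAID-6 needs is not shown.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together (hypothetical helper)."""
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("all blocks in a stripe must be the same length")
    # zip(*blocks) walks the stripe column by column; each column is XORed.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Parity for one stripe of three data blocks.
data = [b"\x01\x02", b"\x04\x08", b"\x10\x20"]
parity = xor_blocks(data)

# If the disk holding data[1] is lost, XORing the surviving blocks with
# the parity block gives the missing data back.
rebuilt = xor_blocks([data[0], data[2], parity])
```

The same operation serves both for computing parity on a write and for
rebuilding a missing block, which is why a single XOR engine next to
the DMA engine would cover both cases.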

I'd probably be happy to consider the 'verified read' enhancements to
md for inclusion in mainline.

Good work!

NeilBrown