Re: Extra write mode to close RAID5 write hole (kind of)


 



On 28/10/16 14:07, Vojtech Pavlik wrote:
> On Fri, Oct 28, 2016 at 03:52:49AM -0800, Kent Overstreet wrote:
>
> Indeed. However, together with the write intent bitmap, and filesystems
> ensuring consistency through barriers, it's still greatly mitigated.
>
> Mdraid will mark areas of disk dirty in the write intent bitmap before
> writing to them. When the system comes up after a power outage, all
> areas marked dirty are scanned and the xor block written where it
> doesn't match the rest.
>
> Thanks to the strict ordering using barriers, the damage to the
> consistency of the RAID can only be in requests since the last
> successfully written barrier.
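For reference, the write-intent bitmap behaviour described above can be enabled and inspected with mdadm; device names here are hypothetical, and the commands need root:

```shell
# Add an internal write-intent bitmap to an existing array
mdadm --grow --bitmap=internal /dev/md0

# Show the bitmap state recorded in a member device's superblock
mdadm --examine-bitmap /dev/sda1

# /proc/mdstat also reports the bitmap state for each array
cat /proc/mdstat
```

After an unclean shutdown, only the stripes covered by dirty bitmap bits are resynced, rather than the whole array.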

OK, so without my having to post to the mdraid list: you are confident that, assuming the disks (etc.) correctly order writes, RAID5 as implemented by a modern Linux kernel does not suffer from the write hole. That is great news.

I understand that there is still a clear issue in the case of a drive failure, but that is exactly where I think bcache can be of use, because it should be able to mitigate some of this.

I have a feeling I would need to bcache the backing devices, rather than the array itself, to make this work: in the case of a drive failure, the loss of a data stripe (as opposed to a parity one) cannot be ordered to avoid corruption. But a bcache layer on each backing device, assuming of course that the bcache cache device itself is consistent, should provide that level of assurance.
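As a sketch of that layering (hypothetical device names; the cache-set UUID must be filled in from make-bcache's output), each backing device would be formatted for bcache first, and the md array then assembled from the resulting /dev/bcacheN nodes:

```shell
# Format a cache device and three backing devices (hypothetical names)
make-bcache -C /dev/nvme0n1
make-bcache -B /dev/sda /dev/sdb /dev/sdc

# Attach each backing device to the cache set
# (replace <cset-uuid> with the UUID printed by make-bcache -C)
echo <cset-uuid> > /sys/block/bcache0/bcache/attach
echo <cset-uuid> > /sys/block/bcache1/bcache/attach
echo <cset-uuid> > /sys/block/bcache2/bcache/attach

# Build the RAID5 array on top of the bcache devices
mdadm --create /dev/md0 --level=5 --raid-devices=3 \
      --bitmap=internal /dev/bcache0 /dev/bcache1 /dev/bcache2
```

This way the cache sits below the RAID layer, so writes to each member can be staged in the (consistent) cache device before hitting the spinning disk.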

> The only situation where data damage can happen is a power outage that
> comes together with a loss of one of the drives. In such a case, the
> content of any blocks written past the last barrier is undefined. It
> then depends on the filesystem whether it can revert to the last sane
> state. Not sure about others, but btrfs will do so.

Yes, and of course I've mentioned this above. But I feel that this is something bcache could help with, and I also keep several redundant backups, so that in the unlikely event of a drive failure causing corruption, I can easily restore the files in question.

I would still like to understand a little more about how Linux mdraid behaves in this respect, but it sounds like it does a pretty good job, and that my bcache layer and redundant backups provide a good layer of data security.

I am mostly using this to store zbackup repositories, which keep the majority of their data in 256 directories. I currently map these to 16 backing devices, and could of course easily map them to as many as 256. In this use case, with the redundant backups, and some automatic testing and verification of the data, I am fairly confident that I won't lose any backups.
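The mapping I have in mind can be sketched as follows, assuming the 256 directories are named with two hex digits ("00" through "ff") as zbackup does for its bundle directories; the function name and the device naming are purely illustrative:

```shell
# Sketch: map a two-hex-digit directory name onto one of 16 backing
# devices by taking its first hex digit (0..f -> device-0..device-15).
dir_to_device() {
    # Extract the first character of the directory name
    first=$(printf '%.1s' "$1")
    # Interpret it as a hex digit to pick the device index
    printf 'device-%d\n' "$(( 0x$first ))"
}

dir_to_device 00   # -> device-0
dir_to_device 3a   # -> device-3
dir_to_device ff   # -> device-15
```

Mapping to 256 devices instead would simply use both hex digits ($(( 0x$1 ))) as the index.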

James
--
To unsubscribe from this list: send the line "unsubscribe linux-bcache" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


