Re: [PATCH v2 00/12] Partial Parity Log for MD RAID 5

Shaohua Li <shli@xxxxxxxxxx> writes:
> On Thu, Dec 15, 2016 at 12:44:57PM +0100, Artur Paszkiewicz wrote:
>> On 12/14/2016 08:47 PM, Shaohua Li wrote:
>> > For the implementation, I don't understand how the PPL works in much
>> > detail; there aren't many details there. Two things I noted:
>> >
>> > - The code skips the log for full stripe writes. This isn't good. It
>> >   would mean that after an unclean shutdown/recovery, one disk has
>> >   arbitrary data, neither the old data nor the new data. This breaks
>> >   an assumption in filesystems: after a failed write to a sector, the
>> >   sector has either the old or the new data. Think about a write to a
>> >   superblock. The data could be the old or the new superblock, but it
>> >   is still a superblock, not something random.
>> >
>> > - From patches 6 & 10, it looks like PPL only helps recover unwritten
>> >   disks. If one disk of a stripe is dirty (e.g. it was written before
>> >   the unclean shutdown) and it is lost in recovery, what will happen?
>> >   It seems the data of the lost disk will be read as 0? That would
>> >   break the assumption above too. If I understand the code correctly
>> >   (maybe not, clarification needed), this is a design flaw.
>> 
>> PPL is only used to update the parity for a stripe; data chunks are not
>> modified at all during PPL recovery. The assumption was that it would
>> protect only against silent data corruption, i.e. eliminate the cases
>> where data that was not touched by a write request could change. So if
>> a dirty disk is lost, no recovery is performed for that stripe (the
>> parity is not updated). For a full stripe write we only recalculate the
>> parity after a dirty shutdown if all disks are available (like resync).
>> So you are right that it is still possible to have arbitrary data in
>> the written part of a stripe if that disk is lost. In that case the
>> behavior is the same as in plain RAID5.
>
> Ok, this matches my understanding. This isn't a complete solution, but
> it does help a lot. If users want to use this, there is no reason not to
> support it. After you fix the alignment issue and describe the solution
> in detail, I'll look at it again.

Artur,

Did you make any progress getting the alignment issue resolved?

I'd really like to get an mdadm release out the door this week, so
getting this resolved would be awesome. Hint hint ;)

Cheers,
Jes


