Re: [PATCH] raid5: add support for rmw writes in raid6

On 4/30/13 9:01 AM, "Kumar Sundararajan" <kumar@xxxxxx> wrote:

>
>
>On 4/29/13 11:48 PM, "David Brown" <david.brown@xxxxxxxxxxxx> wrote:
>
>>On 30/04/13 02:05, Dan Williams wrote:
>>> On Mon, Apr 29, 2013 at 12:28 PM, David Brown
>>><david.brown@xxxxxxxxxxxx> wrote:
>>>> For each data block you are changing, you will need to remove the
>>>> old g^i * Di_old, then add in the new g^i * Di_new, so you can
>>>> still use this simplification to reduce the number of multiplies.
>>>> If you want to change blocks "i" and "j", you thus do:
>>>>
>>>> Q_new = Q_old + g^i * (Di_old + Di_new) + g^j * (Dj_old + Dj_new)
>>>>
>>>> But as I say, I only know the maths - not the code.
>>> 
>>> The issue is where to store those intermediate Di_old + Di_new results
>>> without doubling the size of the stripe cache.
>>> 
>>
>>(As before, I haven't looked at the code.  I justify my laziness by
>>claiming that I might come up with fresh ideas without knowing how
>>things are implemented.  But only you folks can say what takes more
>>work, and what is riskier in the code.)
>>
>>I don't see that you would need to double the size of the stripe cache.
>> You might need an extra few block spaces, but not double the cache.
>>
>>Also, once you have done this calculation (assuming you did the easy
>>P_new first), you no longer need to keep Di_old lying around - it's
>>going to be replaced with the new stripe data.  So maybe you can do the
>>operation as "Di_old += Di_new" - i.e., in place and without using more
>>memory.  That is going to be faster too, as it is more cache friendly.
>>On the other hand, it might involve more locking or other tracking
>>mechanisms to avoid problems if something else is trying to access the
>>same caches.
>>
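To make the arithmetic concrete, here is a minimal user-space sketch of
the RMW update being discussed, assuming the usual RAID-6 generator
g = 2 over GF(2^8) with polynomial 0x11d.  The function names and the
byte-at-a-time loop are only illustrative, not the actual md code,
which works on whole pages with optimized syndrome routines.  Doing
this once per changed block i and j gives exactly
Q_new = Q_old + g^i * (Di_old + Di_new) + g^j * (Dj_old + Dj_new),
with the delta overwriting Di_old in place as suggested above.

#include <stdint.h>
#include <stddef.h>

/* Multiply by the RAID-6 generator g = 2 in GF(2^8), polynomial 0x11d. */
static uint8_t gf_mul2(uint8_t v)
{
        return (uint8_t)((v << 1) ^ ((v & 0x80) ? 0x1d : 0));
}

/* Multiply v by g^power by repeated doubling (fine for a sketch). */
static uint8_t gf_mul_gpow(uint8_t v, unsigned int power)
{
        while (power--)
                v = gf_mul2(v);
        return v;
}

/*
 * RMW update of P and Q for one changed data block at raid index "idx".
 * The old-data buffer is reused to hold the delta (old ^ new) in place,
 * so no extra scratch page is needed, but the old data is destroyed.
 */
static void rmw_update_pq(uint8_t *p, uint8_t *q,
                          uint8_t *old_now_delta, const uint8_t *new_data,
                          unsigned int idx, size_t len)
{
        size_t i;

        for (i = 0; i < len; i++) {
                uint8_t delta = old_now_delta[i] ^ new_data[i];

                old_now_delta[i] = delta;        /* Di_old ^= Di_new */
                p[i] ^= delta;                   /* P_new = P_old + delta */
                q[i] ^= gf_mul_gpow(delta, idx); /* Q_new = Q_old + g^idx * delta */
        }
}
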
>
>Yes, I had seen your earlier mails on this topic and implemented it this
>way initially.
>However, this required us to either ask for stable pages or allocate an
>extra "spare" page
>for each disk in the array to hold the intermediate results.

Sorry, that should read -- one extra "spare" page per cpu for each disk
in the array.
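
If the pages involved cannot be scribbled on (hence the need for stable
pages), the delta has to be computed into a separate scratch buffer
instead, which is where the per-cpu "spare" page per disk comes in.  A
hedged sketch of that variant, reusing gf_mul_gpow() from the sketch
above; again the names are illustrative and this is not the actual md
implementation:

/*
 * Variant that leaves Di_old untouched by writing the delta into a
 * separate scratch ("spare") page.  Illustrative only; the real code
 * operates on struct page, one scratch page per cpu per disk.
 */
static void rmw_update_pq_scratch(uint8_t *p, uint8_t *q, uint8_t *scratch,
                                  const uint8_t *old_data,
                                  const uint8_t *new_data,
                                  unsigned int idx, size_t len)
{
        size_t i;

        for (i = 0; i < len; i++) {
                scratch[i] = old_data[i] ^ new_data[i];
                p[i] ^= scratch[i];
                q[i] ^= gf_mul_gpow(scratch[i], idx);
        }
}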




