On 21/11/2013 21:31, David Brown wrote:
On 21/11/13 21:05, Piergiorgio Sartor wrote:
Having a multi-parity RAID even allows checking
which disk is at fault.
This would give the user more
comprehensive information.
Of course, since we are there, we can
also offer the option to fix it.
This would be much like an "fsck".
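To illustrate how dual parity can point at a specific disk, here is a
minimal sketch, assuming the usual RAID-6 P/Q construction over GF(2^8)
(polynomial 0x11d, generator 2) and a single corrupted byte per position.
The function names are made up for illustration and this is not the md
driver's code.

```python
# GF(2^8) log/antilog tables, polynomial 0x11d, generator 2 (assumed field).
EXP = [0] * 255
LOG = [0] * 256
_x = 1
for _i in range(255):
    EXP[_i] = _x
    LOG[_x] = _i
    _x <<= 1
    if _x & 0x100:
        _x ^= 0x11D


def gf_mul(a, b):
    """Multiply two field elements using the log tables."""
    if a == 0 or b == 0:
        return 0
    return EXP[(LOG[a] + LOG[b]) % 255]


def syndromes(data, p, q):
    """XOR the stored P and Q with the values recomputed from the data bytes."""
    sp, sq = p, q
    for i, d in enumerate(data):
        sp ^= d
        sq ^= gf_mul(EXP[i], d)
    return sp, sq


def parity(data):
    """Compute the (P, Q) pair for one byte position across the data disks."""
    return syndromes(data, 0, 0)


def locate(data, p, q):
    """Return "ok", "P", "Q", a data disk index, or "multiple"."""
    sp, sq = syndromes(data, p, q)
    if sp == 0 and sq == 0:
        return "ok"
    if sq == 0:
        return "P"          # only the P block disagrees
    if sp == 0:
        return "Q"          # only the Q block disagrees
    # Single bad data disk z with error E: sp = E, sq = g^z * E.
    z = (LOG[sq] - LOG[sp]) % 255
    return z if z < len(data) else "multiple"
```

With single parity only the P syndrome exists, so a mismatch can be
detected but not attributed to a disk; the second syndrome is what makes
the position recoverable.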
If this can all be done to give the user an informed choice, then it
sounds good.
One issue here is whether the check should be done with the filesystem
mounted and in use, or only off-line. If it is off-line then it will
mean a long down-time while the array is checked - but if it is online,
then there is the risk of confusing the filesystem and caches by
changing the data.
That is a non-issue imho: if the stripe is being written, any error
gets overwritten, at least on the data disks (parity can still be
wrong if a shortcut read-modify-write method is used).
So you run fsck on the filesystem and the data comes out good because any
error has already been overwritten, and fsck also returns no error. Not
useful.
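A small sketch of the parenthetical about shortcut RMW, assuming plain
XOR parity and one byte per block to keep it short (the helper names are
hypothetical). The overwrite cures the data block, but the parity
computed from the old on-disk contents inherits the error:

```python
from functools import reduce
from operator import xor


def full_parity(blocks):
    """Parity recomputed from every data block (reconstruct-write)."""
    return reduce(xor, blocks, 0)


def rmw_update(parity, old_block, new_block):
    """Shortcut read-modify-write: patch parity using only the old block."""
    return parity ^ old_block ^ new_block


def overwrite_rmw(on_disk, parity, index, new_block):
    """Overwrite one block via RMW; return the new (blocks, parity)."""
    new_parity = rmw_update(parity, on_disk[index], new_block)
    blocks = list(on_disk)
    blocks[index] = new_block
    return blocks, new_parity


intended = [0x10, 0x20, 0x30]
parity = full_parity(intended)      # parity written when the data was good
on_disk = [0x10, 0x21, 0x30]        # block 1 silently flipped a bit later

blocks, parity_after = overwrite_rmw(on_disk, parity, 1, 0x7F)
mismatch = parity_after ^ full_parity(blocks)   # non-zero: parity is stale
```

A write that recomputes parity from all data blocks would leave no
mismatch, which is why the remark applies only to the shortcut path.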
The only case you have to consider is where the array check is performed
online and the stripe does not change in the meantime, that is, it
stays unchanged long enough for you to complete all the checks.
debugfs can tell you which filesystem element corresponds to a
given block number, and this can be done online, with the filesystem
mounted.
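As a rough sketch of that lookup, assuming an ext2/3/4 filesystem sitting
directly on the array device (no partition table or LVM in between) and a
4096-byte filesystem block size. The device name /dev/md0 and the helper
names are placeholders; "icheck" and "ncheck" are debugfs requests that
map a block to an inode and an inode to a path name.

```python
import subprocess


def fs_block_for_offset(array_byte_offset, fs_block_size=4096):
    """Filesystem block containing a byte offset (assumes fs starts at 0)."""
    return array_byte_offset // fs_block_size


def icheck_cmd(device, block):
    """debugfs request: which inode owns this block?"""
    return ["debugfs", "-R", f"icheck {block}", device]


def ncheck_cmd(device, inode):
    """debugfs request: which path name(s) refer to this inode?"""
    return ["debugfs", "-R", f"ncheck {inode}", device]


def run(cmd):
    """Run a debugfs command and return its text output (needs root)."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


# Example, not executed here:
#   block = fs_block_for_offset(suspect_offset_in_bytes)
#   print(run(icheck_cmd("/dev/md0", block)))
#   print(run(ncheck_cmd("/dev/md0", inode_reported_by_icheck)))
```

Translating a mismatching stripe into a byte offset on the array depends
on the RAID level, chunk size and layout, so that step is left out here.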
I don't understand your point about the caches. Caches are not an
obstacle for the current "check" operation, so they won't be a problem
for the improved check operation you are discussing either.