Re: A sector-of-mismatch warning patch (was Re: Fault tolerance with badblocks)

On 19 May 2017, NeilBrown verbalised:

> On Wed, May 17 2017, Shaohua Li wrote:
>
>> On Tue, May 16, 2017 at 10:46:13PM +0100, Nix wrote:
>>> Doesn't that already mean that someone has explicitly triggered a check
>>> action?
>>
>> So the idea is: run 'check' and report mismatch, userspace (raid6check for
>> example) uses the reported info to fix the mismatch. The pr_warn_ratelimited
>> isn't a good way to communicate the info to userspace. I'm wondering why we
>> don't just run raid6check solely, it can do the job like what kernel does and
>> we avoid the crappy pr_warn_ratelimited.

The ratelimited warning will do when there are only a few inconsistencies
and you don't want to spend days recovering a huge array to fix a small
but nonzero mismatch_cnt, or when you just want reassurance that yes,
these mismatches are in swap and can be ignored. When there are a lot,
enough that a ratelimited warning hits its rate limit, Neil's right: the
array is probably toast. The limit is then important to stop log flooding.
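
To make the two cases concrete, here is a toy model of an interval/burst
rate limiter. It is only a sketch, not the kernel's code: the RateLimit
class, report_mismatches() and seconds_per_sector are made up for
illustration, and the 5-second / 10-message values are assumed to mirror
the usual printk ratelimit defaults.

```python
from dataclasses import dataclass


@dataclass
class RateLimit:
    """Toy interval/burst limiter (illustrative, not the kernel's code)."""
    interval: float = 5.0      # window length in seconds (assumed default)
    burst: int = 10            # messages allowed per window (assumed default)
    window_start: float = 0.0
    printed: int = 0
    missed: int = 0            # total messages suppressed so far

    def allow(self, now: float) -> bool:
        # Open a fresh window once the interval has elapsed.
        if now - self.window_start >= self.interval:
            self.window_start = now
            self.printed = 0
        if self.printed < self.burst:
            self.printed += 1
            return True
        self.missed += 1
        return False


def report_mismatches(sectors, limiter, seconds_per_sector=0.001):
    """Return the mismatched sectors that would actually reach the log.

    seconds_per_sector is a hypothetical spacing between mismatches.
    """
    logged = []
    for i, sector in enumerate(sectors):
        if limiter.allow(now=i * seconds_per_sector):
            logged.append(sector)
    return logged
```

With three mismatches every sector is logged. With a thousand arriving
inside one window only the first ten get through and the rest are counted
as missed, which is the "array is probably toast" case.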

> If we really wanted a seamless "fix the raid6 thing" (which I don't
> think we do),

Oh, I want seamless everything -- the seamlessness and flexibility of md
are its killer features over hardware RAID in my eyes -- but I'm
convinced that this is probably too hard to test and simply too
disruptive to bother with for a likely vanishingly rare failure mode all
entangled with fairly hot paths.

>               we'd probably make the list of inconsistencies appear in a
> sysfs file.  That would be less 'crappy'.  But as I say, I don't think
> we really want to do that.

Aren't sysfs files in effect length-limited to one page (or at least
length-limited by virtue of being stored in memory)? It seems to me this
would just bring the same problem ratelimit is solving right back again,
except that a sysfs file, unlike the kernel log, has no logging daemon
constantly sucking the contents out, so you can't overwrite your old
output without worrying. (And there is no other daemon running to do
that, except mdadm in monitor mode, which might not be running, and
really this job feels out of scope for it anyway.)
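
A rough sketch of how quickly one page fills, assuming the one-page limit
holds. The 4096-byte page size is an assumption, and the one-sector-per-line
format and the fill_sysfs_page() helper are hypothetical; md has no such
attribute.

```python
PAGE_SIZE = 4096  # assumed page size; common, but not universal


def fill_sysfs_page(sectors, page_size=PAGE_SIZE):
    """Pack one decimal sector number per line into a single page.

    Returns (text, dropped): the text that fits, and how many entries
    had to be left out because the page was full.
    """
    lines = []
    used = 0
    for count, sector in enumerate(sectors):
        line = f"{sector}\n"
        if used + len(line) > page_size:
            # Page is full: everything from this entry onward is lost.
            return "".join(lines), len(sectors) - count
        lines.append(line)
        used += len(line)
    return "".join(lines), 0
```

With ten-digit sector numbers each line costs 11 bytes, so a few hundred
entries fill the page and the rest are dropped, much as a ratelimited
warning would drop them.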