On Tue, Aug 20, 2019 at 7:30 AM Nigel Croxon <ncroxon@xxxxxxxxxx> wrote:
>
>
> On 8/16/19 7:52 PM, Song Liu wrote:
> > On Fri, Aug 16, 2019 at 10:02 AM Nigel Croxon <ncroxon@xxxxxxxxxx> wrote:
> > [...]
> >> [ +0.000008] md/raid:md127: 793 read_errors, > 781 stripes
> >> [ +0.000001] md/raid:md127: Too many read errors, failing device dm-0.
> >> [ +0.000018] md/raid:md127: 794 read_errors, > 781 stripes
> >> [ +0.000000] md/raid:md127: Too many read errors, failing device dm-0.
> >> [ +0.000009] md/raid:md127: 795 read_errors, > 781 stripes
> >> [ +0.000001] md/raid:md127: Too many read errors, failing device dm-0.
> >> [ +0.000008] md/raid:md127: 796 read_errors, > 781 stripes
> >> [ +0.000000] md/raid:md127: Too many read errors, failing device dm-0.
> >> [ +0.000018] md/raid:md127: 797 read_errors, > 781 stripes
> >> [ +0.000001] md/raid:md127: Too many read errors, failing device dm-0.
> >> [ +0.000008] md/raid:md127: 798 read_errors, > 781 stripes
> >> [ +0.000001] md/raid:md127: Too many read errors, failing device dm-0.
> >> [ +0.000017] md/raid:md127: 799 read_errors, > 781 stripes
> >> [ +0.000001] md/raid:md127: Too many read errors, failing device dm-0.
> >> [ +0.000008] md/raid:md127: 800 read_errors, > 781 stripes
> >> [ +0.000001] md/raid:md127: Too many read errors, failing device dm-0.
> >> [ +0.000008] md/raid:md127: 801 read_errors, > 781 stripes
> >> [ +0.000000] md/raid:md127: Too many read errors, failing device dm-0.
> >> [ +0.000021] md/raid:md127: 802 read_errors, > 781 stripes
> >> [ +0.000000] md/raid:md127: Too many read errors, failing device dm-0.
> >> [ +0.000009] md/raid:md127: 803 read_errors, > 781 stripes
> >> [ +0.000000] md/raid:md127: Too many read errors, failing device dm-0.
> >> [ +0.000009] md/raid:md127: 804 read_errors, > 781 stripes
> >> [ +0.000000] md/raid:md127: Too many read errors, failing device dm-0.
> >> [ +0.000008] md/raid:md127: 805 read_errors, > 781 stripes
> >> [ +0.000001] md/raid:md127: Too many read errors, failing device dm-0.
> >> [ +0.928614] md: md127: requested-resync interrupted.
> >>
> > This is a little too noisy. How about we only pr_warn() for
> > test_bit(Faulty) == 0?
> > (This is not directly related to this patch, but since we are at it).
> >
> > Thanks,
> > Song
> From: Nigel Croxon <ncroxon@xxxxxxxxxx>
> Date: Mon, 19 Aug 2019 16:01:04 -0400
> Subject: [PATCH] raid5 improve too many read errors msg by adding limits
>
> Often limits can be changed by admin. When discussing such things
> it helps if you can provide "self-sustained" facts. Also
> sometimes the admin thinks he changed a limit, but it did not
> take effect for some reason or he changed the wrong thing.
>
> V3: Only pr_warn when Faulty is 0.
> V2: Add read_errors value to pr_warn.
>
> Signed-off-by: Nigel Croxon <ncroxon@xxxxxxxxxx>
> ---
>  drivers/md/raid5.c | 13 +++++++++----
>  1 file changed, 9 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index 7fde645d2e90..6812cefea308 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -2557,10 +2557,15 @@ static void raid5_end_read_request(struct bio * bi)
>  				(unsigned long long)s,
>  				bdn);
>  		} else if (atomic_read(&rdev->read_errors)
> -			 > conf->max_nr_stripes)
> -			pr_warn("md/raid:%s: Too many read errors, failing device
> %s.\n",
> -			       mdname(conf->mddev), bdn);
> -		else
> +			 > conf->max_nr_stripes) {
> +			if (!test_bit(Faulty, &rdev->flags)) {
> +				pr_warn("md/raid:%s: %d read_errors, > %d stripes\n",
> +				    mdname(conf->mddev), atomic_read(&rdev->read_errors),
> +				    conf->max_nr_stripes);
> +				pr_warn("md/raid:%s: Too many read errors, failing
> device %s.\n",
> +				    mdname(conf->mddev), bdn);
> +			}
> +		} else
>  			retry = 1;
>  		if (set_bad && test_bit(In_sync, &rdev->flags)
>  		    && !test_bit(R5_ReadNoMerge, &sh->dev[i].flags))
> --

This looks good, but I have got some git issue applying the
patch. Please double check with ./scripts/checkpatch.pl and resend
with git-send-email.

Thanks,
Song
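
For reference, here is a rough user-space sketch of what the V3 guard is meant to achieve: once the device is already marked Faulty, further read errors over the limit should no longer print anything. This is not the kernel code. struct demo_rdev, demo_note_read_error() and demo_count_warnings() are made-up names for illustration, and setting the faulty flag inside the helper is a simplification standing in for whatever marks the device Faulty in raid5.c.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for per-device state; NOT the kernel's struct md_rdev. */
struct demo_rdev {
	int read_errors;	/* plays the role of rdev->read_errors */
	bool faulty;		/* plays the role of the Faulty bit in rdev->flags */
};

/*
 * Account for one read error against the limit max_nr_stripes.
 * Returns true only when the warnings were actually printed, which
 * happens at most once: the first time the limit is exceeded while
 * the device is not yet marked faulty.
 */
static bool demo_note_read_error(struct demo_rdev *rdev, int max_nr_stripes)
{
	bool warned = false;

	rdev->read_errors++;
	if (rdev->read_errors > max_nr_stripes) {
		if (!rdev->faulty) {
			printf("demo: %d read_errors, > %d stripes\n",
			       rdev->read_errors, max_nr_stripes);
			printf("demo: Too many read errors, failing device.\n");
			warned = true;
		}
		/* Simplification (assumption): mark the device faulty right here. */
		rdev->faulty = true;
	}
	return warned;
}

/* Feed n read errors to the device and count how many times we warned. */
static int demo_count_warnings(struct demo_rdev *rdev, int max_nr_stripes, int n)
{
	int warnings = 0;

	for (int i = 0; i < n; i++)
		if (demo_note_read_error(rdev, max_nr_stripes))
			warnings++;
	return warnings;
}
```

Starting from a zeroed demo_rdev, feeding 805 errors against a limit of 781 should print the pair of messages once, at error 782, instead of once per error as in the log quoted above.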