We hit a RAID 5 issue during failure testing that caused a flood of kernel warnings and minor but problematic data corruption.
Setup: RAID 5 with 7 drives + 1 hot spare
OS: RHEL 7.5
Kernel: linux-3.10.0-862.3.3.el7
Scenario: We pulled a single data drive and the array automatically began rebuilding onto the hot spare. The kernel immediately flooded the log with the messages below, which made the system nearly unusable. Setting kernel.printk="2 4 1 7" did nothing to stop them. Once the rebuild completed the machine was usable again, and we then discovered some minor data corruption.
We have not been able to reproduce the issue since.
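For reference, these are the printk knobs we experimented with. A possible reason the kernel.printk change had no visible effect is that systemd-journald reads /dev/kmsg directly, so console_loglevel only silences the console, not the journal or the printk path itself (this is our interpretation, not confirmed). A sketch of the sysctl settings, assuming a stock RHEL 7 setup:

```
# /etc/sysctl.d/90-printk.conf (sketch; values are what we tried)
# First field is console_loglevel: 2 means only EMERG/ALERT reach the
# console. It does NOT throttle printk itself or /dev/kmsg readers
# such as systemd-journald.
kernel.printk = 2 4 1 7
# Generic printk rate limiting. Note this only applies to call sites
# that use printk_ratelimit(); plain WARN_ON() stack dumps like the
# ones below are not rate limited, so this would not have helped here.
kernel.printk_ratelimit = 5
kernel.printk_ratelimit_burst = 10
```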
[258091.028244] Workqueue: raid5wq raid5_do_work [raid456]
[258091.028245] Call Trace:
[258091.028248] [<ffffffff9fb0e78e>] dump_stack+0x19/0x1b
[258091.028250] [<ffffffff9f491998>] __warn+0xd8/0x100
[258091.028253] [<ffffffff9f491add>] warn_slowpath_null+0x1d/0x20
[258091.028257] [<ffffffffc0834677>] handle_stripe+0x2367/0x23f0 [raid456]
[258091.028258] systemd-journald[166084]: /dev/kmsg buffer overrun, some messages lost.
[258091.028262] [<ffffffff9f72bad1>] ? blk_mq_sched_dispatch_requests+0x181/0x1c0
[258091.028266] [<ffffffffc0834aad>] handle_active_stripes.isra.55+0x3ad/0x530 [raid456]
[258091.028271] [<ffffffffc08354bf>] raid5_do_work+0x9f/0x150 [raid456]
[258091.028271] systemd-journald[166084]: /dev/kmsg buffer overrun, some messages lost.
[258091.028274] [<ffffffff9f4b312f>] process_one_work+0x17f/0x440
[258091.028276] [<ffffffff9f4b3df6>] worker_thread+0x126/0x3c0
[258091.028279] [<ffffffff9f4b3cd0>] ? manage_workers.isra.24+0x2a0/0x2a0
[258091.028280] [<ffffffff9f4bb161>] kthread+0xd1/0xe0
[258091.028281] systemd-journald[166084]: /dev/kmsg buffer overrun, some messages lost.
[258091.028284] [<ffffffff9f4bb090>] ? insert_kthread_work+0x40/0x40
[258091.028287] [<ffffffff9fb2065d>] ret_from_fork_nospec_begin+0x7/0x21
[258091.028290] [<ffffffff9f4bb090>] ? insert_kthread_work+0x40/0x40
[258091.028291] systemd-journald[166084]: /dev/kmsg buffer overrun, some messages lost.
[258091.028292] ---[ end trace 3232975a123b52bf ]---
[258091.028297] ------------[ cut here ]------------
[258091.028301] WARNING: CPU: 4 PID: 155463 at drivers/md/raid5.c:4672 handle_stripe+0x2367/0x23f0 [raid456]
# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [raid0]
md126 : active raid0 md127[0]
      4657247232 blocks super 1.2 512k chunks

md127 : active raid5 dm-7[7](R) dm-6[6](F) dm-5[5] dm-4[4] dm-3[3] dm-2[2] dm-1[1] dm-0[0]
      4657379328 blocks super 1.2 level 5, 16k chunk, algorithm 2 [7/6] [UUUUUU_]
      [=>...................] recovery = 9.6% (75100712/776229888) finish=601.8min speed=19414K/sec
      bitmap: 1/6 pages [4KB], 65536KB chunk

unused devices: <none>
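As a sanity check on the mdstat numbers above: the finish estimate is just the remaining 1K blocks divided by the reported speed (mdstat reports per-device recovery progress in 1K blocks and speed in K/sec). A quick back-of-the-envelope:

```python
# Sanity-check the md recovery ETA from the /proc/mdstat line above.
done, total = 75_100_712, 776_229_888   # recovery = 9.6% (done/total), 1K blocks
speed_k_per_sec = 19_414                # speed=19414K/sec

pct = 100 * done / total
eta_min = (total - done) / speed_k_per_sec / 60

print(f"{pct:.1f}% complete, ~{eta_min:.1f} min remaining")
# -> 9.7% complete, ~601.9 min remaining
#    (mdstat shows 9.6% / 601.8min; it truncates rather than rounds)
```

So the kernel's own estimate is internally consistent; the rebuild genuinely ran for roughly ten hours at ~19 MB/s.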
drivers/md/raid5.c:4672

	/* maybe we need to check and possibly fix the parity for this stripe
	 * Any reads will already have been scheduled, so we just see if enough
	 * data is available. The parity check is held off while parity
	 * dependent operations are in flight.
	 */
	if (sh->check_state ||
	    (s.syncing && s.locked == 0 &&
	     !test_bit(STRIPE_COMPUTE_RUN, &sh->state) &&
	     !test_bit(STRIPE_INSYNC, &sh->state))) {
		if (conf->level == 6)
			handle_parity_checks6(conf, sh, &s, disks);
		else
			handle_parity_checks5(conf, sh, &s, disks);
	}

	if ((s.replacing || s.syncing) && s.locked == 0
	    && !test_bit(STRIPE_COMPUTE_RUN, &sh->state)
	    && !test_bit(STRIPE_REPLACED, &sh->state)) {
		/* Write out to replacement devices where possible */
		for (i = 0; i < conf->raid_disks; i++)
			if (test_bit(R5_NeedReplace, &sh->dev[i].flags)) {
				WARN_ON(!test_bit(R5_UPTODATE, &sh->dev[i].flags));
				set_bit(R5_WantReplace, &sh->dev[i].flags);
				set_bit(R5_LOCKED, &sh->dev[i].flags);
				s.locked++;
			}
		if (s.replacing)
			set_bit(STRIPE_INSYNC, &sh->state);
		set_bit(STRIPE_REPLACED, &sh->state);
	}
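To restate the condition behind the WARN_ON in plainer terms, here is a minimal Python model (illustrative only, not kernel code; the dev-flag sets are hypothetical): the warning fires when a stripe member is flagged for writing to the replacement device (R5_NeedReplace) but its data is not marked valid (R5_UPTODATE), i.e. the rebuild is about to copy from data it does not actually have.

```python
# Illustrative model of the check guarded by WARN_ON at
# drivers/md/raid5.c:4672. Flag names mirror the kernel's R5_* bits.

R5_NEED_REPLACE = "NeedReplace"
R5_UPTODATE = "UpToDate"

def replacement_write_warns(stripe_devs):
    """Return indices of devices that would trip the WARN_ON:
    flagged for replacement writing without up-to-date data."""
    return [i for i, flags in enumerate(stripe_devs)
            if R5_NEED_REPLACE in flags and R5_UPTODATE not in flags]

# A 3-device stripe: dev 1 needs replacing but its data was never
# read or computed, so it would warn; dev 2 is fine.
devs = [{R5_UPTODATE},
        {R5_NEED_REPLACE},
        {R5_NEED_REPLACE, R5_UPTODATE}]
print(replacement_write_warns(devs))  # -> [1]
```

That the warning flooded during recovery suggests this not-up-to-date case was hit repeatedly, which would also be consistent with the corruption we found afterwards, though we cannot prove the connection.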
--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel