Re: Stopping raid6 (with journal) hangs w/ 100%CPU

After a hard reboot, this instance (stripe_cache_active: 2) assembled just fine on boot. The next time I encountered the hang, the array was 'inactive' on boot. There was an initial flurry of I/O, which normally indicates journal replay followed by the array becoming 'active', but this time the I/O ceased and the array never became active.

This time...

stripe_cache_active: 2376

md125 : inactive md127p4[9](J) sdk1[2] sdl1[3] sdn1[5] sdo1[6] sdm1[4] sdj1[1] sdq1[8] sdp1[7]
      31258219068 blocks super 1.2
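
For anyone who wants to check for this state from a script, here is a minimal Python sketch that splits an mdstat array line like the one above into its state, RAID level, and member devices, and picks out the journal device by its (J) flag. The function name parse_mdstat_line and the regular expression are my own illustration, not part of mdadm or the kernel, and the sketch assumes the line layout shown in this thread (name, " : ", state, optional level, then members as dev[slot] with optional single-letter flags).

```python
import re

# One member token from /proc/mdstat, e.g. "sdk1[2]" or "md127p4[9](J)".
# Group 3 collects zero or more single-letter flags such as (J), (F) or (S).
MEMBER_RE = re.compile(r"^([^\[\s]+)\[(\d+)\]((?:\([A-Z]\))*)$")


def parse_mdstat_line(line):
    """Split an mdstat array line into name, state, level and members."""
    name, _, rest = line.partition(" : ")
    tokens = rest.split()
    state = tokens[0] if tokens else None
    level = None
    members = []
    for tok in tokens[1:]:
        m = MEMBER_RE.match(tok)
        if m:
            flags = set(re.findall(r"[A-Z]", m.group(3)))
            members.append((m.group(1), int(m.group(2)), flags))
        elif level is None and not tok.startswith("("):
            # e.g. "raid6"; an inactive array prints no level at all
            level = tok
    return {"name": name.strip(), "state": state, "level": level, "members": members}


def journal_devices(info):
    """Return the member devices carrying the (J) journal flag."""
    return [dev for dev, _slot, flags in info["members"] if "J" in flags]


if __name__ == "__main__":
    sample = ("md125 : inactive md127p4[9](J) sdk1[2] sdl1[3] sdn1[5] "
              "sdo1[6] sdm1[4] sdj1[1] sdq1[8] sdp1[7]")
    info = parse_mdstat_line(sample)
    print(info["name"], info["state"], info["level"], journal_devices(info))
```

An array that reports 'inactive' with no level while all eight data members and the journal are listed matches the situation described here.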


# mdadm -D /dev/md125
/dev/md125:
           Version : 1.2
     Creation Time : Thu Oct 19 10:11:35 2017
        Raid Level : raid6
     Used Dev Size : 18446744073709551615
      Raid Devices : 8
     Total Devices : 9
       Persistence : Superblock is persistent

       Update Time : Fri Nov 24 13:41:38 2017
             State : active, FAILED, Not Started
    Active Devices : 8
   Working Devices : 9
    Failed Devices : 0
     Spare Devices : 0

            Layout : left-symmetric
        Chunk Size : 64K

Consistency Policy : journal

              Name : ########:3
              UUID : de6a2ce0:1a4c510f:d7c89da4:1215a312
            Events : 156844

    Number   Major   Minor   RaidDevice State
       -       0        0        0      removed
       -       0        0        1      removed
       -       0        0        2      removed
       -       0        0        3      removed
       -       0        0        4      removed
       -       0        0        5      removed
       -       0        0        6      removed
       -       0        0        7      removed

       -     259        3        -      spare   /dev/md127p4
       -       8      225        5      sync   /dev/sdo1
       -       8      209        4      sync   /dev/sdn1
       -       8      193        3      sync   /dev/sdm1
       -       8      177        2      sync   /dev/sdl1
       -       8      161        1      sync   /dev/sdk1
       -       8      145        0      sync   /dev/sdj1
       -      65        1        7      sync   /dev/sdq1
       -       8      241        6      sync   /dev/sdp1

--Larkin

On 11/23/2017 1:22 PM, Larkin Lowrey wrote:
Sometimes stopping a raid6 array (with journal) hangs: the mdX_raid6 process pegs at 100% CPU and there is no I/O. It looks like it's stuck in an infinite loop.

Kernel: 4.13.13-200.fc26.x86_64

The stack trace (echo l > /proc/sysrq-trigger) is always the same:

handle_stripe+0x10c/0x2140 [raid456]
? pick_next_task_fair+0x491/0x550
handle_active_stripes.isra.60+0x3e5/0x5a0 [raid456]
raid5d+0x42e/0x630 [raid456]
? prepare_to_wait_event+0x79/0x160
md_thread+0x125/0x170
? md_thread+0x125/0x170
? finish_wait+0x80/0x80
kthread+0x125/0x140
? state_show+0x2f0/0x2f0
? kthread_park+0x60/0x60
? do_syscall_64+0x67/0x140
ret_from_fork+0x25/0x30

The array is healthy, has a journal, and writes were idle for several minutes prior to running 'mdadm --stop'.

md124 : active raid6 sdt1[6] sds1[5] sdw1[1] sdx1[2] sdy1[3] sdu1[7] sdv1[8] sdz1[4] md125p4[9](J)
      23442092928 blocks super 1.2 level 6, 64k chunk, algorithm 2 [8/8] [UUUUUUUU]

stripe_cache_active: 2
stripe_cache_size: 32768
array_state: write-pending
journal_mode: write-through [write-back]
consistency_policy: journal
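
The values above come from the array's md sysfs directory. As a rough sketch of how they could be captured in one go before running 'mdadm --stop', the Python below reads the same five attributes from a directory such as /sys/block/md124/md. The helper names read_md_attrs and selected_journal_mode are hypothetical, and the second helper assumes that the bracketed entry in journal_mode (as in "write-through [write-back]") marks the mode currently in effect.

```python
import re
from pathlib import Path

# The attributes quoted in this report; all are files in the md sysfs directory.
ATTRS = (
    "stripe_cache_active",
    "stripe_cache_size",
    "array_state",
    "journal_mode",
    "consistency_policy",
)


def read_md_attrs(md_dir):
    """Read each attribute file under md_dir; missing or unreadable ones map to None."""
    values = {}
    for name in ATTRS:
        try:
            values[name] = (Path(md_dir) / name).read_text().strip()
        except OSError:
            values[name] = None
    return values


def selected_journal_mode(raw):
    """Return the bracketed entry of a journal_mode string, or None if there is none."""
    match = re.search(r"\[([^\]]+)\]", raw or "")
    return match.group(1) if match else None


if __name__ == "__main__":
    # md124 is the array from this report; substitute your own device.
    attrs = read_md_attrs("/sys/block/md124/md")
    for key, value in attrs.items():
        print(f"{key}: {value}")
    print("selected journal mode:", selected_journal_mode(attrs["journal_mode"]))
```

Logging this snapshot just before the stop makes it easier to compare a hang (for example stripe_cache_active stuck at a non-zero value) against a clean stop.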

--Larkin

--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
