Broken raid6 continually resyncing

I was trying to add an extra device to grow my RAID6. As noted at
https://raid.wiki.kernel.org/index.php/Growing, I got a "resource
busy" message, so I tried disabling the internal bitmap with mdadm
--grow --bitmap=none /dev/md0. This command hung, and I eventually
restarted the machine. The array then caused the boot to fail, so I
had to remove it from fstab and try again.

I've now created overlay devices as suggested by
https://raid.wiki.kernel.org/index.php/Recovering_a_damaged_RAID so I
can fix it without worrying about corrupting it further.

When the array is assembled, this is what /proc/mdstat shows:

Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10]
md0 : active raid6 dm-8[8] dm-6[9](S) dm-7[5] dm-5[6] dm-4[3] dm-3[7]
      0 blocks super 1.2 level 6, 512k chunk, algorithm 2 [5/5] [UUUUU]
       resync=PENDING
      bitmap: 0/0 pages [0KB], 65536KB chunk

unused devices: <none>

I tried mdadm --readwrite /dev/md0, but it fails with "Device or
resource busy". Looking at dmesg I see this:

[40705.795439] md: md0 stopped.
[40705.797433] md/raid:md0: not clean -- starting background reconstruction
[40705.797451] md/raid:md0: device dm-8 operational as raid disk 0
[40705.797452] md/raid:md0: device dm-7 operational as raid disk 4
[40705.797453] md/raid:md0: device dm-5 operational as raid disk 3
[40705.797454] md/raid:md0: device dm-4 operational as raid disk 2
[40705.797454] md/raid:md0: device dm-3 operational as raid disk 1
[40705.798141] md/raid:md0: raid level 6 active with 5 out of 5 devices, algorithm 2
[40705.809168] md: resync of RAID array md0
[40705.809175] md: md0: resync done.
[40705.809185] ------------[ cut here ]------------
[40705.809189] WARNING: CPU: 6 PID: 28586 at /build/linux-KPInqg/linux-4.13.0/drivers/md/md.c:7582 md_seq_show+0x799/0x7b0
[40705.809189] Modules linked in: dm_snapshot dm_bufio rfcomm
nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype
xt_conntrack br_netfilter aufs xt_CHECKSUM iptable_mangle xt_tcpudp
ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_comment iptable_nat
nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack
bridge stp llc iptable_filter pci_stub vboxpci(OE) vboxnetadp(OE)
vboxnetflt(OE) vboxdrv(OE) cmac bnep binfmt_misc nls_iso8859_1 mxm_wmi
nvidia_uvm(POE) snd_hda_codec_realtek snd_hda_codec_generic intel_rapl
snd_hda_codec_hdmi x86_pkg_temp_thermal intel_powerclamp coretemp
snd_hda_intel kvm_intel snd_usb_audio snd_hda_codec kvm
snd_usbmidi_lib input_leds snd_hda_core joydev snd_hwdep snd_pcm btusb
btrtl snd_seq_midi snd_seq_midi_event irqbypass intel_cstate
intel_rapl_perf snd_rawmidi snd_seq
[40705.809215]  snd_seq_device snd_timer snd serio_raw soundcore
shpchp mei_me mei hci_uart btbcm serdev btqca btintel bluetooth wmi
ecdh_generic intel_lpss_acpi acpi_als intel_lpss kfifo_buf
industrialio mac_hid acpi_pad parport_pc ppdev lp parport
binder_linux(OE) ashmem_linux(OE) ip_tables x_tables autofs4
algif_skcipher af_alg dm_crypt raid10 raid1 raid0 multipath linear
raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor
hid_generic usbhid raid6_pq libcrc32c amdkfd amd_iommu_v2
crct10dif_pclmul nvidia_drm(POE) crc32_pclmul nvidia_modeset(POE)
ghash_clmulni_intel radeon pcbc nvidia(POE) aesni_intel i2c_algo_bit
aes_x86_64 crypto_simd ttm glue_helper cryptd drm_kms_helper
syscopyarea e1000e sysfillrect sysimgblt fb_sys_fops ptp pps_core ahci
drm libahci video i2c_hid hid
[40705.809250] CPU: 6 PID: 28586 Comm: mdadm Tainted: P        W  OE 4.13.0-46-generic #51-Ubuntu
[40705.809250] Hardware name: MSI MS-7A72/B250 PC MATE (MS-7A72), BIOS 3.40 04/07/2017
[40705.809251] task: ffff972b09b98000 task.stack: ffffa8b6644a4000
[40705.809253] RIP: 0010:md_seq_show+0x799/0x7b0
[40705.809253] RSP: 0018:ffffa8b6644a7d98 EFLAGS: 00010246
[40705.809255] RAX: 0000000000000003 RBX: ffff972b04285580 RCX: 0000000000000007
[40705.809255] RDX: 0000000000000007 RSI: ffffffffaa149440 RDI: 0000000000000000
[40705.809256] RBP: ffffa8b6644a7e08 R08: 0000000000001000 R09: ffff97288cec20e2
[40705.809256] R10: 0000000000000001 R11: ffff97288cec20db R12: ffff972a86eda800
[40705.809257] R13: 0000000000000000 R14: 0000000000000003 R15: ffff972a86eda818
[40705.809258] FS:  00007fb96aac4740(0000) GS:ffff972b2ed80000(0000) knlGS:0000000000000000
[40705.809259] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[40705.809259] CR2: 0000556563359888 CR3: 000000066c12c004 CR4: 00000000003606e0
[40705.809260] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[40705.809261] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[40705.809261] Call Trace:
[40705.809264]  ? __kmalloc_node+0x1f8/0x2a0
[40705.809267]  seq_read+0x316/0x420
[40705.809269]  proc_reg_read+0x45/0x70
[40705.809271]  __vfs_read+0x1b/0x40
[40705.809272]  vfs_read+0x8e/0x130
[40705.809274]  SyS_read+0x55/0xc0
[40705.809276]  do_syscall_64+0x67/0x130
[40705.809278]  entry_SYSCALL64_slow_path+0x25/0x25
[40705.809279] RIP: 0033:0x7fb96a3ef021
[40705.809280] RSP: 002b:00007ffc3a40b798 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[40705.809281] RAX: ffffffffffffffda RBX: 000055569b995290 RCX: 00007fb96a3ef021
[40705.809282] RDX: 0000000000000400 RSI: 000055569b9a1310 RDI: 0000000000000004
[40705.809282] RBP: 0000000000000d68 R08: 0000000000000000 R09: 000055569b9a1310
[40705.809283] R10: ffffffffffffffb0 R11: 0000000000000246 R12: 00007fb96a6c23e0
[40705.809284] R13: 00007fb96a6c18a0 R14: 000055569b995290 R15: 0000000000000003
[40705.809284] Code: f9 ff ff 48 c7 c6 51 94 14 aa 48 89 df e8 50 6d
b3 ff e9 38 fb ff ff 48 c7 c6 9e 94 14 aa 48 89 df e8 3c 6d b3 ff e9
33 fb ff ff <0f> ff e9 97 fc ff ff e8 fb eb 93 ff 90 66 2e 0f 1f 84 00
00 00
[40705.809304] ---[ end trace 026f8eea016638a2 ]---
[40705.827321] md: resync of RAID array md0
[40705.827326] md: md0: resync done.
[40705.840846] md: resync of RAID array md0
[40705.840851] md: md0: resync done.
[40705.853497] md: resync of RAID array md0
[40705.853502] md: md0: resync done.
[40705.861679] md: resync of RAID array md0
[40705.861685] md: md0: resync done.
[40705.871071] md: resync of RAID array md0
[40705.871076] md: md0: resync done.
[40705.880970] md: resync of RAID array md0
[40705.880975] md: md0: resync done.
[40705.887774] md: resync of RAID array md0
[40705.887778] md: md0: resync done.

The last two lines repeat many times every second until I stop the array.
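
To put a number on how fast the resync loop is spinning, something like
the following could be run over the saved dmesg text. This is only a
minimal sketch: the function name resync_starts_per_second is my own, and
it assumes the log lines look exactly like the ones quoted above
("[seconds.micros] md: resync of RAID array md0").

```python
import re
from collections import Counter

# Matches kernel log lines such as
# "[40705.827321] md: resync of RAID array md0"
RESYNC_RE = re.compile(r"^\[\s*(\d+)\.\d+\]\s+md: resync of RAID array (\S+)")


def resync_starts_per_second(log_text):
    """Count 'resync of RAID array' messages per (array, whole second).

    Hypothetical helper for illustration; it only understands the
    timestamp format shown in the dmesg excerpt above.
    """
    counts = Counter()
    for line in log_text.splitlines():
        match = RESYNC_RE.match(line)
        if match:
            second = int(match.group(1))
            array = match.group(2)
            counts[(array, second)] += 1
    return counts
```

Passing it the contents of a file holding the dmesg output returns a
Counter keyed by (array name, second), so a healthy array should show at
most a single entry while a looping one shows many starts per second.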

Output of mdadm --examine:

/dev/mapper/sdb1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : 6e18cb72:9920468c:1ae4b919:70c02b22
           Name : REDACTED:0  (local to host REDACTED)
  Creation Time : Thu Aug 21 20:49:41 2014
     Raid Level : raid6
   Raid Devices : 5

 Avail Dev Size : 5860268032 (2794.39 GiB 3000.46 GB)
     Array Size : 0
  Used Dev Size : 0
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
   Unused Space : before=262056 sectors, after=5860268032 sectors
          State : active
    Device UUID : 17a702bd:e669a341:4f0149b4:c82aed47

Internal Bitmap : 8 sectors from superblock
    Update Time : Mon Aug  6 10:51:23 2018
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : ba2b5fb0 - correct
         Events : 1368393

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 0
   Array State : AAAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/mapper/sdc1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : 6e18cb72:9920468c:1ae4b919:70c02b22
           Name : REDACTED:0  (local to host REDACTED)
  Creation Time : Thu Aug 21 20:49:41 2014
     Raid Level : raid6
   Raid Devices : 5

 Avail Dev Size : 5860268032 (2794.39 GiB 3000.46 GB)
     Array Size : 0
  Used Dev Size : 0
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
   Unused Space : before=262056 sectors, after=5860268032 sectors
          State : active
    Device UUID : 283e0ef6:c7cbda49:8a8ed139:bf15f83e

Internal Bitmap : 8 sectors from superblock
    Update Time : Mon Aug  6 10:51:23 2018
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : 7801d0d3 - correct
         Events : 1368393

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 1
   Array State : AAAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/mapper/sdd1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : 6e18cb72:9920468c:1ae4b919:70c02b22
           Name : REDACTED:0  (local to host REDACTED)
  Creation Time : Thu Aug 21 20:49:41 2014
     Raid Level : raid6
   Raid Devices : 5

 Avail Dev Size : 5860268032 (2794.39 GiB 3000.46 GB)
     Array Size : 0
  Used Dev Size : 0
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
   Unused Space : before=262064 sectors, after=5860268032 sectors
          State : active
    Device UUID : a49db844:7b082f6c:5f01f998:e3167bc9

Internal Bitmap : 8 sectors from superblock
    Update Time : Mon Aug  6 10:51:23 2018
       Checksum : d2a2e0b0 - correct
         Events : 1368393

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 2
   Array State : AAAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/mapper/sde1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : 6e18cb72:9920468c:1ae4b919:70c02b22
           Name : REDACTED:0  (local to host REDACTED)
  Creation Time : Thu Aug 21 20:49:41 2014
     Raid Level : raid6
   Raid Devices : 5

 Avail Dev Size : 5860268032 (2794.39 GiB 3000.46 GB)
     Array Size : 0
  Used Dev Size : 0
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
   Unused Space : before=262056 sectors, after=5860268032 sectors
          State : active
    Device UUID : 64a73690:1375d9f4:a558ece4:c0f1f6e1

Internal Bitmap : 8 sectors from superblock
    Update Time : Mon Aug  6 10:51:23 2018
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : b428978 - correct
         Events : 1368393

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 3
   Array State : AAAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/mapper/sdf1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : 6e18cb72:9920468c:1ae4b919:70c02b22
           Name : REDACTED:0  (local to host REDACTED)
  Creation Time : Thu Aug 21 20:49:41 2014
     Raid Level : raid6
   Raid Devices : 5

 Avail Dev Size : 5860268032 (2794.39 GiB 3000.46 GB)
     Array Size : 0
  Used Dev Size : 0
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
   Unused Space : before=262056 sectors, after=5860268032 sectors
          State : active
    Device UUID : 2a40a3e3:a9ed7d48:12ed27a8:c9390cb7

Internal Bitmap : 8 sectors from superblock
    Update Time : Mon Aug  6 10:51:23 2018
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : 4aa47748 - correct
         Events : 1368393

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 4
   Array State : AAAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/mapper/sdg1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : 6e18cb72:9920468c:1ae4b919:70c02b22
           Name : REDACTED:0  (local to host REDACTED)
  Creation Time : Thu Aug 21 20:49:41 2014
     Raid Level : raid6
   Raid Devices : 5

 Avail Dev Size : 5860268032 (2794.39 GiB 3000.46 GB)
     Array Size : 0
  Used Dev Size : 0
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
   Unused Space : before=262064 sectors, after=5860268032 sectors
          State : active
    Device UUID : 008f705a:a56e7072:d2bded70:5f5669cb

Internal Bitmap : 8 sectors from superblock
    Update Time : Mon Aug  6 10:51:23 2018
  Bad Block Log : 512 entries available at offset 24 sectors
       Checksum : c8873443 - correct
         Events : 1368393

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : spare
   Array State : AAAAA ('A' == active, '.' == missing, 'R' == replacing)

Output of mdadm --detail:

/dev/md0:
        Version : 1.2
  Creation Time : Thu Aug 21 20:49:41 2014
     Raid Level : raid6
  Used Dev Size : unknown
   Raid Devices : 5
  Total Devices : 6
    Persistence : Superblock is persistent

  Intent Bitmap : Internal

    Update Time : Mon Aug  6 10:41:10 2018
          State : clean, resyncing, Not Started (PENDING)
 Active Devices : 5
Working Devices : 6
 Failed Devices : 0
  Spare Devices : 1

         Layout : left-symmetric
     Chunk Size : 512K

           Name : REDACTED:0  (local to host REDACTED)
           UUID : 6e18cb72:9920468c:1ae4b919:70c02b22
         Events : 1230728

    Number   Major   Minor   RaidDevice State
       8     253        8        0      active sync   /dev/dm-8
       7     253        3        1      active sync   /dev/dm-3
       3     253        4        2      active sync   /dev/dm-4
       6     253        5        3      active sync   /dev/dm-5
       5     253        7        4      active sync   /dev/dm-7

       9     253        6        -      spare   /dev/dm-6

Kernel 4.13.0-46-generic, mdadm 4.0.

Since dmesg shows resyncs starting and finishing many times every
second, I suspect there is a bug somewhere rather than just something
wrong with my array. Any ideas on what I can do to fix this?

Thanks,
Zach