Re: failed reshape!

On Fri, 9 Dec 2011 08:53:42 -0500 Gavin  Peters (蓋文彼德斯) <gavin@xxxxxx>
wrote:

> I tried to reshape today, a raid6 array from seven devices up to
> eight.  I ran mdadm 3.2.2, something like
> 
> # mdadm /dev/md2 --grow -n 8 --layout=preserve
> 
> and then, blammo!
> 
> Dec  8 22:30:10 avclub kernel: [  527.094708] RAID5 conf printout:
> Dec  8 22:30:10 avclub kernel: [  527.094712]  --- rd:8 wd:8
> Dec  8 22:30:10 avclub kernel: [  527.094714]  disk 0, o:1, dev:sdc6
> Dec  8 22:30:10 avclub kernel: [  527.094715]  disk 1, o:1, dev:sdf6
> Dec  8 22:30:10 avclub kernel: [  527.094717]  disk 2, o:1, dev:sda6
> Dec  8 22:30:10 avclub kernel: [  527.094718]  disk 3, o:1, dev:sdd6
> Dec  8 22:30:10 avclub kernel: [  527.094719]  disk 4, o:1, dev:sdb6
> Dec  8 22:30:10 avclub kernel: [  527.094720]  disk 5, o:1, dev:sde6
> Dec  8 22:30:10 avclub kernel: [  527.094721]  disk 6, o:1, dev:sdg6
> Dec  8 22:30:10 avclub kernel: [  527.094722]  disk 7, o:1, dev:sdh6
> Dec  8 22:30:10 avclub kernel: [  527.094876] md: reshape of RAID array md2
> Dec  8 22:30:10 avclub kernel: [  527.094886] md: minimum _guaranteed_
>  speed: 40000 KB/sec/disk.
> Dec  8 22:30:10 avclub kernel: [  527.094892] md: using maximum
> available idle IO bandwidth (but not more than 200000 KB/sec) for
> reshape.
> Dec  8 22:30:10 avclub kernel: [  527.094912] md: using 128k window,
> over a total of 1371476928 blocks.
> Dec  8 22:30:11 avclub mdadm[2959]: RebuildStarted event detected on
> md device /dev/md2
> Dec  8 22:30:11 avclub kernel: [  527.515359] general protection
> fault: 0000 [#1] SMP
> Dec  8 22:30:11 avclub kernel: [  527.515370] last sysfs file:
> /sys/devices/virtual/block/md2/md/sync_speed
> Dec  8 22:30:11 avclub kernel: [  527.515376] CPU 5
> Dec  8 22:30:11 avclub kernel: [  527.515381] Modules linked in:
> binfmt_misc nfsd exportfs nfs lockd nfs_acl auth_rpcgss sunrpc
> snd_usb_audio snd_usb_lib snd_hda_codec_atihdmi fbcon tileblit font
> bitblit softcursor
>  vga16fb vgastate snd_hda_codec_via snd_hda_intel snd_pcm_oss
> snd_hda_codec snd_mixer_oss snd_hwdep snd_pcm snd_seq_dummy
> snd_seq_oss snd_seq_midi
> snd_rawmidi snd_seq_midi_event snd_seq snd_timer snd_seq_device radeon
> ttm asus_atk0110 drm_kms_helper ppdev snd drm i2c_algo_bit parport_pc
> edac_core edac_mce_amd gspca_zc3xx gspca_main videodev v4l1_compat
> v4l2_compat_ioctl32 soundcore snd_page_alloc i2c_piix4 shpchp lp
> parport tcp_vegas raid10 raid456 async_pq async_xor xor async_memcpy
> usbhid async_raid6_recov hid raid6_pq async_tx raid1 raid0 pata_atiixp
> r8169 mii multipath ahci linear [last unloaded: kvm]
> Dec  8 22:30:11 avclub kernel: [  527.515500] Pid: 528, comm:
> md2_raid6 Not tainted 2.6.32-32-generic #62-Ubuntu System Product Name
> Dec  8 22:30:11 avclub kernel: [  527.515507] RIP:
> 0010:[<ffffffff812be15b>]  [<ffffffff812be15b>] memcpy_c+0xb/0x20
> Dec  8 22:30:11 avclub kernel: [  527.515526] RSP:
> 0018:ffff880408985c18  EFLAGS: 00010246
> Dec  8 22:30:11 avclub kernel: [  527.515531] RAX: db73880000000000
> RBX: ffff880408984000 RCX: 0000000000000200
> Dec  8 22:30:11 avclub kernel: [  527.515537] RDX: 0000000000000000
> RSI: ffff880369717000 RDI: db73880000000000
> Dec  8 22:30:11 avclub kernel: [  527.515543] RBP: ffff880408985c80
> R08: 0000000000001000 R09: ffff880408985ca0
> Dec  8 22:30:11 avclub kernel: [  527.515548] R10: 0000000000000000
> R11: 0000000000000000 R12: ffff880408985ca0
> Dec  8 22:30:11 avclub kernel: [  527.515553] R13: ffff880369741290
> R14: 0000000000000000 R15: 0000000000000000
> Dec  8 22:30:11 avclub kernel: [  527.515560] FS:
> 00007f465923d7a0(0000) GS:ffff880028340000(0000)
> knlGS:00000000f6990760
> Dec  8 22:30:11 avclub kernel: [  527.515566] CS:  0010 DS: 0018 ES:
> 0018 CR0: 000000008005003b
> Dec  8 22:30:11 avclub kernel: [  527.515571] CR2: 00007fe6aaf92000
> CR3: 00000003c3a5e000 CR4: 00000000000006e0
> Dec  8 22:30:11 avclub kernel: [  527.515576] DR0: 0000000000000000
> DR1: 0000000000000000 DR2: 0000000000000000
> Dec  8 22:30:11 avclub kernel: [  527.515582] DR3: 0000000000000000
> DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Dec  8 22:30:11 avclub kernel: [  527.515589] Process md2_raid6 (pid:
> 528, threadinfo ffff880408984000, task ffff88040b210000)
> Dec  8 22:30:11 avclub kernel: [  527.515593] Stack:
> Dec  8 22:30:11 avclub kernel: [  527.515596]  ffffffffa004a0e7
> ffff880408985c50 0000000000000000 0000000000000000
> Dec  8 22:30:11 avclub kernel: [  527.515604] <0> ffffea000bf10d08
> 0000000000000000 0000000000001000 ffff880408985c80
> Dec  8 22:30:11 avclub kernel: [  527.515614] <0> 0000000000000000
> ffff8803696a6930 ffff880369741290 ffff880408985d70
> Dec  8 22:30:11 avclub kernel: [  527.515624] Call Trace:
> Dec  8 22:30:11 avclub kernel: [  527.515639]  [<ffffffffa004a0e7>] ?
> async_memcpy+0xe7/0x25c [async_memcpy]
> Dec  8 22:30:11 avclub kernel: [  527.515654]  [<ffffffffa00aaabb>]
> handle_stripe_expansion+0x14b/0x1e0 [raid456]
> Dec  8 22:30:11 avclub kernel: [  527.515668]  [<ffffffffa00ab113>]
> handle_stripe6+0x5c3/0xb40 [raid456]
> Dec  8 22:30:11 avclub kernel: [  527.515680]  [<ffffffffa00a794c>] ?
> __release_stripe+0xcc/0x1c0 [raid456]
> Dec  8 22:30:11 avclub kernel: [  527.515692]  [<ffffffffa00ac055>]
> handle_stripe+0x25/0x30 [raid456]
> Dec  8 22:30:11 avclub kernel: [  527.515703]  [<ffffffffa00ac452>]
> raid5d+0x202/0x320 [raid456]
> Dec  8 22:30:11 avclub kernel: [  527.515716]  [<ffffffff815416b9>] ?
> _spin_unlock_irqrestore+0x19/0x30
> Dec  8 22:30:11 avclub kernel: [  527.515725]  [<ffffffff8141704c>]
> md_thread+0x5c/0x130
> Dec  8 22:30:11 avclub kernel: [  527.515735]  [<ffffffff81084cb0>] ?
> autoremove_wake_function+0x0/0x40
> Dec  8 22:30:11 avclub kernel: [  527.515743]  [<ffffffff81416ff0>] ?
> md_thread+0x0/0x130
> Dec  8 22:30:11 avclub kernel: [  527.515750]  [<ffffffff81084936>]
> kthread+0x96/0xa0
> Dec  8 22:30:11 avclub kernel: [  527.515758]  [<ffffffff810131ea>]
> child_rip+0xa/0x20
> Dec  8 22:30:11 avclub kernel: [  527.515766]  [<ffffffff810848a0>] ?
> kthread+0x0/0xa0
> Dec  8 22:30:11 avclub kernel: [  527.515772]  [<ffffffff810131e0>] ?
> child_rip+0x0/0x20
> Dec  8 22:30:11 avclub kernel: [  527.515776] Code: 81 ea d8 1f 00 00
> 48 3b 42 20 73 07 48 8b 50 f9 31 c0 c3 31 d2 48 c7 c0 f2 ff ff ff c3
> 90 90 90 48 89 f8 89 d1 c1 e9 03 83 e2 07 <f3> 48 a5 89 d1 f3 a4 c3 66
> 66 66 66 2e 0f 1f 84 00 00 00 00 00
> Dec  8 22:30:11 avclub kernel: [  527.515842] RIP
> [<ffffffff812be15b>] memcpy_c+0xb/0x20
> Dec  8 22:30:11 avclub kernel: [  527.515850]  RSP <ffff880408985c18>
> Dec  8 22:30:11 avclub kernel: [  527.515857] ---[ end trace
> 5146b1cc8ebe8dc1 ]---
> Dec  8 22:30:11 avclub kernel: [  527.515865] note: md2_raid6[528]
> exited with preempt_count 2
> Dec  8 22:32:52 avclub kernel: Kernel logging (proc) stopped.
> 
> I believe that last line shows me giving up.  I am sad.
> Thankfully, after rebooting into single user mode, I was able to mdadm
> --assemble the array, and it appears to be working.  Boy that was a
> rush!
> $ uname -a
> Linux avclub 2.6.32-32-generic #62-Ubuntu SMP Wed Apr 20
> 21:52:38 UTC 2011 x86_64 GNU/Linux
> Let me know if I can provide any other information.
> 

Thanks for the report.

It seems that, as part of the reshape, md is trying to copy to an invalid
memory address.
It copies from 0xffff880369717000 (RSI) to 0xdb73880000000000 (RDI).
The latter is clearly not a valid kernel address.
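
For anyone following the register dump, here is a rough sketch of why the
destination can be called invalid on sight. It assumes the usual x86_64
layout with 48 implemented virtual address bits, where bits 63..47 of a
valid ("canonical") address must all be equal. The helper name is_canonical
is made up for this illustration and is not a kernel function. A
non-canonical access raises a general protection fault rather than a page
fault, which is consistent with the oops above.

```python
# Illustration only: check whether a 64-bit value is a canonical x86_64
# virtual address, assuming 48 implemented address bits (an assumption
# about this machine, not something the oops states).

VA_BITS = 48


def is_canonical(addr: int, va_bits: int = VA_BITS) -> bool:
    """Return True if bits 63..(va_bits - 1) of addr are all 0 or all 1."""
    top = (addr & 0xFFFFFFFFFFFFFFFF) >> (va_bits - 1)
    all_ones = (1 << (64 - va_bits + 1)) - 1
    return top == 0 or top == all_ones


# Register values copied from the oops.
RSI = 0xffff880369717000  # source
RDI = 0xdb73880000000000  # destination
RAX = 0xdb73880000000000  # the Code: bytes start with "mov %rdi,%rax"
RCX = 0x200               # remaining quadwords for "rep movsq"

if __name__ == "__main__":
    print("source canonical:     ", is_canonical(RSI))
    print("destination canonical:", is_canonical(RDI))
    print("bytes left to copy:   ", RCX * 8)
    print("any progress made:    ", RDI != RAX)
```

RCX * 8 comes to 4096, the same value as R08 (0x1000), and RDI still equals
RAX, so the fault appears to hit on the very first quadword of a one-page
copy. That suggests the destination pointer was already bad when memcpy was
called, rather than the copy running off the end of a buffer.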

I have no idea how this might be happening. My best guess is that 'ddidx' in
handle_stripe_expansion is getting a bad value but I cannot see how that
would happen.

If you have reasonable backups you could try again and see if it still fails.
Maybe it was a one-off.

Not sure what else to suggest.  It might be fixed in a newer kernel, or it
might not...

NeilBrown

