> -----Original Message-----
> From: NeilBrown [mailto:neilb@xxxxxxx]
> Sent: Thursday, December 08, 2011 5:02 AM
> To: Kwolek, Adam
> Cc: linux-raid@xxxxxxxxxxxxxxx; Ciechanowski, Ed; Labun, Marcin; Williams, Dan J
> Subject: Re: [PATCH] md: Add ability for disable bad block management
>
> On Wed, 7 Dec 2011 11:10:06 +0000 "Kwolek, Adam" <adam.kwolek@xxxxxxxxx>
> wrote:
>
> > > -----Original Message-----
> > > From: NeilBrown [mailto:neilb@xxxxxxx]
> > >
> > > I cannot reproduce this.
> > > I didn't physically remove devices, but I used
> > >    echo 1 > /sys/block/sdc/device/delete
> > > which should be nearly identical from the perspective of md and mdadm.
> >
> > I've checked that when I'm deleting device using sysfs everything works perfect.
> > When when device is pulled out, reshape stops in md/mdstat.
> >
> > > If you could give me the exact set of steps that you follow to
> > > produce the problem that would help - maybe a script? Just a description is OK.
> >
> > #used disks sdb, sdc, sdd, sde
> > export IMSM_NO_PLATFORM=1
> > #create container
> > mdadm -C /dev/md/imsm0 -amd -e imsm -n 3 /dev/sdb /dev/sdc /dev/sde -R
> > #create vol
> > mdadm -C /dev/md/raid5vol_0 -amd -l 5 --chunk 32 --size 104850 -n 3 /dev/sdb /dev/sdc /dev/sde -R
> > #add spare
> > mdadm --add /dev/md/imsm0 /dev/sdd
> > #run OLCE
> > mdadm --grow /dev/md/imsm0 --raid-devices 4
> > #when reshape starts, I'm (physically) pulling device out
> >
> > > Also you say it is blocking in md_do_sync. Is that at the
> > >
> > >    wait_event(mddev->recovery_wait, !atomic_read(&mddev->recovery_active));
> > >
> > > call just after the "out:" label?
> >
> > None of those 2 places.
> > It enters sync_request() function. Md_error() is called.
> > More is visible on thread stack information below (md_wait_for_blocked_rdev()).
> >
> > > What is the raid thread doing at this point?
> > >    cat /proc/PID/stack
> > > might help.
> >
> > [md126_raid5]
> > [<ffffffff8121d843>] md_wait_for_blocked_rdev+0xbc/0x10f
> > [<ffffffffa01d87ce>] handle_stripe+0x1c5c/0x2c99 [raid456]
> > [<ffffffffa01d9d0d>] raid5d+0x502/0x564 [raid456]
> > [<ffffffff8121eca5>] md_thread+0x101/0x11f
> > [<ffffffff81049e0e>] kthread+0x81/0x89
> > [<ffffffff812cc4f4>] kernel_thread_helper+0x4/0x10
> > [<ffffffffffffffff>] 0xffffffffffffffff
> >
> > [md126_reshape]
> > [<ffffffffa02455a2>] sync_request+0x90a/0xbfb [raid456]
> > [<ffffffff8121e151>] md_do_sync+0x7aa/0xc40
> > [<ffffffff8121ecb3>] md_thread+0x101/0x11f
> > [<ffffffff81049e0e>] kthread+0x81/0x89
> > [<ffffffff812cc4f4>] kernel_thread_helper+0x4/0x10
> > [<ffffffffffffffff>] 0xffffffffffffffff
> >
> > > What are the contents of all the sysfs files?
> > >    grep . /sys/block/mdXXX/md/*
> >
> > array_state ->active
> > degraded ->1
> > max_read_errors ->20
> > reshape_position ->12288
> > resync_start ->none
> > sync_completed ->4096 / 209664
> >
> > >    grep . /sys/block/mdXXX/md/dev-*/*
> >
> > When removed is sdd /sys/block/mdXXX/md/dev-sdd/*
> > bad_blocks ->4096 512
> >            ->4608 128
> >            ->4736 384
> > block ->MISSING link is not valid
> > errors ->0
> > offset ->0
> > recovery_start ->4096
> > size ->104832
> > slot ->3
> > state ->faulty,write_error
> > unacknowledged_bad_blocks ->4096 512
> >                           ->4608 128
> >                           ->4736 384
> >
> > I hope this helps.
>
> Yes it does, thanks.
>
> Can you try with this patch as well please.
>
> Thanks,
> NeilBrown
>
>
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index ea6dce9..6cf0f6a 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -3175,6 +3175,8 @@ static void analyse_stripe(struct stripe_head *sh, struct stripe_head_state *s)
>  			rdev = rcu_dereference(conf->disks[i].rdev);
>  			clear_bit(R5_ReadRepl, &dev->flags);
>  		}
> +		if (rdev && test_bit(Faulty, &rdev->flags))
> +			rdev = NULL;
>  		if (rdev) {
>  			is_bad = is_badblock(rdev, sh->sector, STRIPE_SECTORS,
>  					     &first_bad, &bad_sectors);

This patch alone did not help, but when I switched to the newest md from
today's neil_for-linus branch, things went better.
Migration now seems to be OK.
Problems appear when an additional disk fails (physical pull) during
rebuild/resync.
The metadata handlers (mdadm/mdmon) react correctly, but md stops again.
This time:

[md126_resync]
[<ffffffffa027037d>] get_active_stripe+0x295/0x598 [raid456]
[<ffffffffa02757da>] sync_request+0xb1c/0xba7 [raid456]
[<ffffffff8121e656>] md_do_sync+0x772/0xbc4
[<ffffffff8121f174>] md_thread+0x101/0x11f
[<ffffffff81049ebe>] kthread+0x81/0x89
[<ffffffff812cc934>] kernel_thread_helper+0x4/0x10
[<ffffffffffffffff>] 0xffffffffffffffff

The [md126_raid5] thread is missing, although mdstat still shows the raid5
resync/rebuild.

During initialization it ran correctly one time; the second time it stopped
in get_active_stripe(), exactly as the rebuild did, and the [md126_raid5]
thread was missing as well.
Any 'mdadm -Ss' then hangs the system (not very surprising without the raid5
thread).

In /var/log/messages we have:

Dec 8 12:39:49 gklab-128-013 kernel: Modules linked in: raid456 async_pq async_xor xor async_memcpy async_raid6_recov raid6_pq async_tx ext2 nvidia(P) snd_pcm_oss snd_mixer_oss snd_seq snd_seq_device ipv6 af_packet cpufreq_conservative cpufreq_userspace cpufreq_powersave acpi_cpufreq mperf microcode fuse loop dm_mod snd_hda_codec_hdmi snd_hda_codec_idt snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_timer snd intel_agp iTCO_wdt tpm_tis tpm soundcore e100 pcspkr mii tpm_bios snd_page_alloc sr_mod cdrom serio_raw i2c_i801 i2c_core iTCO_vendor_support sg intel_gtt button agpgart usbhid hid uhci_hcd sd_mod crc_t10dif ehci_hcd usbcore usb_common edd ext3 mbcache jbd fan processor ide_pci_generic ide_core ata_generic ahci libahci pata_marvell libata scsi_mod thermal thermal_sys hwmon
Dec 8 12:39:49 gklab-128-013 kernel:
Dec 8 12:39:49 gklab-128-013 kernel: Pid: 4584, comm: md126_raid5 Tainted: P 3.2.0-rc1-SLE11_BRANCH_ADK #10 /DP35DP
Dec 8 12:39:49 gklab-128-013 kernel: RIP: 0010:[<ffffffffa0280e67>] [<ffffffffa0280e67>] handle_stripe+0x2f5/0x2cbf [raid456]
Dec 8 12:39:49 gklab-128-013 kernel: RSP: 0018:ffff8800d61cdb80 EFLAGS: 00010002
Dec 8 12:39:49 gklab-128-013 kernel: RAX: 0000000000008001 RBX: 0000000000000000 RCX: 0000000000000002
Dec 8 12:39:49 gklab-128-013 kernel: RDX: 0000000000000000 RSI: ffff880114462800 RDI: ffff8801144629a8
Dec 8 12:39:49 gklab-128-013 kernel: RBP: ffff8800d61cdd40 R08: ffff8800379256c0 R09: 0000000300000000
Dec 8 12:39:49 gklab-128-013 kernel: R10: ffff88010e5bfa00 R11: 0000000100000001 R12: ffff8800372602c8
Dec 8 12:39:49 gklab-128-013 kernel: R13: ffff880037260048 R14: ffff8800372602d0 R15: ffff8801144638b0
Dec 8 12:39:49 gklab-128-013 kernel: FS: 0000000000000000(0000) GS:ffff88011bc00000(0000) knlGS:0000000000000000
Dec 8 12:39:49 gklab-128-013 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Dec 8 12:39:49 gklab-128-013 kernel: CR2: 00000000000000b0 CR3: 00000000379b3000 CR4: 00000000000006f0
Dec 8 12:39:49 gklab-128-013 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Dec 8 12:39:49 gklab-128-013 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Dec 8 12:39:49 gklab-128-013 kernel: Process md126_raid5 (pid: 4584, threadinfo ffff8800d61cc000, task ffff88003715a7c0)
Dec 8 12:39:49 gklab-128-013 kernel: Stack:
Dec 8 12:39:49 gklab-128-013 kernel: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
Dec 8 12:39:49 gklab-128-013 kernel: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
Dec 8 12:39:49 gklab-128-013 kernel: 0000000000000400 0000000000000400 0000000300000000 ffff88010e749280
Dec 8 12:39:49 gklab-128-013 kernel: Call Trace:
Dec 8 12:39:49 gklab-128-013 kernel: [<ffffffff81221fd4>] ? md_check_recovery+0x60d/0x630
Dec 8 12:39:49 gklab-128-013 kernel: [<ffffffffa027ef28>] ? __release_stripe+0x174/0x18f [raid456]
Dec 8 12:39:49 gklab-128-013 kernel: [<ffffffffa0283d33>] raid5d+0x502/0x564 [raid456]
Dec 8 12:39:49 gklab-128-013 kernel: [<ffffffff812c3e6c>] ? schedule_timeout+0x35/0x1e8
Dec 8 12:39:49 gklab-128-013 kernel: [<ffffffff8121f174>] md_thread+0x101/0x11f
Dec 8 12:39:49 gklab-128-013 kernel: [<ffffffff8104a2ad>] ? wake_up_bit+0x23/0x23
Dec 8 12:39:49 gklab-128-013 kernel: [<ffffffff8121f073>] ? md_register_thread+0xd6/0xd6
Dec 8 12:39:50 gklab-128-013 kernel: [<ffffffff81049ebe>] kthread+0x81/0x89
Dec 8 12:39:50 gklab-128-013 kernel: [<ffffffff812cc934>] kernel_thread_helper+0x4/0x10
Dec 8 12:39:50 gklab-128-013 kernel: [<ffffffff81049e3d>] ? kthread_worker_fn+0x145/0x145
Dec 8 12:39:50 gklab-128-013 kernel: [<ffffffff812cc930>] ? gs_change+0xb/0xb
Dec 8 12:39:50 gklab-128-013 kernel: Code: 75 11 49 8b 45 30 48 83 c0 08 48 3b 83 e0 00 00 00 77 07 f0 41 80 4c 24 08 08 49 8b 44 24 08 66 85 c0 79 2c f0 41 80 64 24 08 f7
Dec 8 12:39:50 gklab-128-013 kernel: <48> 8b 83 b0 00 00 00 a8 02 75 10 c7 45 80 01 00 00 00 f0 ff 83
Dec 8 12:39:50 gklab-128-013 kernel: RIP [<ffffffffa0280e67>] handle_stripe+0x2f5/0x2cbf [raid456]
Dec 8 12:39:50 gklab-128-013 kernel: RSP <ffff8800d61cdb80>
Dec 8 12:39:50 gklab-128-013 kernel: CR2: 00000000000000b0

The problem is caused by an access to rdev, which has just been cleared (set
to NULL), a few lines below in raid5.c.
The following patch corrects it.

From fbaa3fdff634721e5c2c09e07b8429385494ee02 Mon Sep 17 00:00:00 2001
From: Adam Kwolek <adam.kwolek@xxxxxxxxx>
Date: Thu, 8 Dec 2011 15:34:09 +0100
Subject: [PATCH] md: raid5 crash during degradation

NULL pointer access causes crash in raid5 module.

Signed-off-by: Adam Kwolek <adam.kwolek@xxxxxxxxx>
---
 drivers/md/raid5.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index b0dec01..da4997c 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -3070,7 +3070,7 @@ static void analyse_stripe(struct stripe_head *sh, struct stripe_head_state *s)
 			if (sh->sector + STRIPE_SECTORS <= rdev->recovery_offset)
 				set_bit(R5_Insync, &dev->flags);
 		}
-		if (test_bit(R5_WriteError, &dev->flags)) {
+		if (test_bit(R5_WriteError, &dev->flags) && rdev) {
 			clear_bit(R5_Insync, &dev->flags);
 			if (!test_bit(Faulty, &rdev->flags)) {
 				s->handle_bad_blocks = 1;
@@ -3078,7 +3078,7 @@ static void analyse_stripe(struct stripe_head *sh, struct stripe_head_state *s)
 			} else
 				clear_bit(R5_WriteError, &dev->flags);
 		}
-		if (test_bit(R5_MadeGood, &dev->flags)) {
+		if (test_bit(R5_MadeGood, &dev->flags) && rdev) {
 			if (!test_bit(Faulty, &rdev->flags)) {
 				s->handle_bad_blocks = 1;
 				atomic_inc(&rdev->nr_pending);
-- 
1.6.0.2

You may have to add something on top of my simple patch, which only blocks
the access (some flags logic, perhaps).

BR
Adam
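
For illustration only, here is a minimal userspace sketch of the guard that the patch above adds. It is not kernel code: `struct rdev_model`, `struct dev_model`, the `RDEV_*`/`DEV_*` constants, `analyse_dev()` and `check_guard()` are simplified stand-ins invented for this sketch, and only the control flow follows the two patched tests in analyse_stripe(). In the oops above, CR2 is 0xb0 and RBX is 0, which looks like a field read through a NULL pointer; the model shows the guarded path skipping that read. It also shows that, with a NULL rdev, the write-error flag simply stays set, which may be the kind of flags logic that still needs a decision.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel flag bits (values are arbitrary). */
enum { RDEV_FAULTY = 1u << 0 };
enum { DEV_INSYNC = 1u << 0, DEV_WRITE_ERROR = 1u << 1, DEV_MADE_GOOD = 1u << 2 };

struct rdev_model { unsigned flags; int nr_pending; };
struct dev_model  { unsigned flags; };

/* Returns true when bad-block handling would be requested. */
bool analyse_dev(struct dev_model *dev, struct rdev_model *rdev)
{
	bool handle_bad_blocks = false;

	/* Mirrors the earlier patch: a Faulty device is treated as absent. */
	if (rdev && (rdev->flags & RDEV_FAULTY))
		rdev = NULL;

	/* Guarded test: rdev is dereferenced only when it is non-NULL. */
	if ((dev->flags & DEV_WRITE_ERROR) && rdev) {
		dev->flags &= ~DEV_INSYNC;
		if (!(rdev->flags & RDEV_FAULTY)) {
			handle_bad_blocks = true;
			rdev->nr_pending++;
		} else
			dev->flags &= ~DEV_WRITE_ERROR;
	}
	/* Same guard for the made-good case (only the shown lines are modelled). */
	if ((dev->flags & DEV_MADE_GOOD) && rdev) {
		if (!(rdev->flags & RDEV_FAULTY)) {
			handle_bad_blocks = true;
			rdev->nr_pending++;
		}
	}
	return handle_bad_blocks;
}

/* Exercises a healthy, a faulty and an absent device; returns 0 on success. */
int check_guard(void)
{
	struct rdev_model healthy = { 0, 0 }, faulty = { RDEV_FAULTY, 0 };
	struct dev_model d1 = { DEV_INSYNC | DEV_WRITE_ERROR };
	struct dev_model d2 = { DEV_INSYNC | DEV_WRITE_ERROR };
	struct dev_model d3 = { DEV_MADE_GOOD };

	assert(analyse_dev(&d1, &healthy));   /* healthy: handled, reference taken */
	assert(healthy.nr_pending == 1);
	assert(!analyse_dev(&d2, &faulty));   /* faulty: skipped, no dereference */
	assert(d2.flags & DEV_WRITE_ERROR);   /* flag is left set in this case */
	assert(!analyse_dev(&d3, NULL));      /* absent device: no crash */
	return 0;
}
```

Calling `check_guard()` from a small `main()` is enough to run the three cases.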