Re: [BUG] Raid5 trouble

BERTRAND Joël wrote:
    Hello,

I run the 2.6.23 Linux kernel on two T1000 (sparc64) servers. Each server has a partitionable RAID5 array (/dev/md/d0), and I have to keep both RAID5 volumes synchronized with RAID1. I therefore tried to build a RAID1 volume from /dev/md/d0p1 and /dev/sdi1 (exported over iSCSI from the second server), and I get a BUG:

Root gershwin:[/usr/scripts] > mdadm -C /dev/md7 -l1 -n2 /dev/md/d0p1 /dev/sdi1
...

	Hello,

I have fixed iscsi-target and tested it; it now works without any trouble. The patches were posted on the iscsi-target mailing list. When I use iSCSI to access the foreign RAID5 volume, it works fine: I can format the foreign volume, copy large files to it, and so on. But when I try to create a new RAID1 volume from a local RAID5 volume and a foreign RAID5 volume, I get my well-known Oops. Here is my dmesg after the Oops:

md: md_d0 stopped.
md: bind<sdd1>
md: bind<sde1>
md: bind<sdf1>
md: bind<sdg1>
md: bind<sdh1>

md: bind<sdc1>
raid5: device sdc1 operational as raid disk 0
raid5: device sdh1 operational as raid disk 5
raid5: device sdg1 operational as raid disk 4
raid5: device sdf1 operational as raid disk 3
raid5: device sde1 operational as raid disk 2
raid5: device sdd1 operational as raid disk 1
raid5: allocated 12518kB for md_d0
raid5: raid level 5 set md_d0 active with 6 out of 6 devices, algorithm 2
RAID5 conf printout:
 --- rd:6 wd:6
 disk 0, o:1, dev:sdc1
 disk 1, o:1, dev:sdd1
 disk 2, o:1, dev:sde1
 disk 3, o:1, dev:sdf1
 disk 4, o:1, dev:sdg1
 disk 5, o:1, dev:sdh1
 md_d0: p1
scsi3 : iSCSI Initiator over TCP/IP
scsi 3:0:0:0: Direct-Access     IET      VIRTUAL-DISK     0    PQ: 0 ANSI: 4
sd 3:0:0:0: [sdi] 2929451520 512-byte hardware sectors (1499879 MB)
sd 3:0:0:0: [sdi] Write Protect is off
sd 3:0:0:0: [sdi] Mode Sense: 77 00 00 08
sd 3:0:0:0: [sdi] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
sd 3:0:0:0: [sdi] 2929451520 512-byte hardware sectors (1499879 MB)
sd 3:0:0:0: [sdi] Write Protect is off
sd 3:0:0:0: [sdi] Mode Sense: 77 00 00 08
sd 3:0:0:0: [sdi] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
 sdi: sdi1
sd 3:0:0:0: [sdi] Attached SCSI disk
md: bind<md_d0p1>
md: bind<sdi1>
md: md7: raid array is not clean -- starting background reconstruction
raid1: raid set md7 active with 2 out of 2 mirrors
md: resync of RAID array md7
md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for resync.
md: using 256k window, over a total of 1464725632 blocks.
kernel BUG at drivers/md/raid5.c:380!
              \|/ ____ \|/
              "@'/ .. \`@"
              /_| \__/ |_\
                 \__U_/
md7_resync(4929): Kernel bad sw trap 5 [#1]
TSTATE: 0000000080001606 TPC: 00000000005ed50c TNPC: 00000000005ed510 Y: 00000000 Not tainted
TPC: <get_stripe_work+0x1f4/0x200>
g0: 0000000000000005 g1: 00000000007c0400 g2: 0000000000000001 g3: 0000000000748400
g4: fffff800feeb6880 g5: fffff80002080000 g6: fffff800e7598000 g7: 0000000000748528
o0: 0000000000000029 o1: 0000000000715798 o2: 000000000000017c o3: 0000000000000005
o4: 0000000000000006 o5: fffff800e8f0a060 sp: fffff800e759ad81 ret_pc: 00000000005ed504
RPC: <get_stripe_work+0x1ec/0x200>
l0: 0000000000000002 l1: ffffffffffffffff l2: fffff800e8f0a0a0 l3: fffff800e8f09fe8
l4: fffff800e8f0a088 l5: fffffffffffffff8 l6: 0000000000000005 l7: fffff800e8374000
i0: fffff800e8f0a028 i1: 0000000000000000 i2: 0000000000000004 i3: fffff800e759b720
i4: 0000000000000080 i5: 0000000000000080 i6: fffff800e759ae51 i7: 00000000005f0274
I7: <handle_stripe5+0x4fc/0x1340>
Caller[00000000005f0274]: handle_stripe5+0x4fc/0x1340
Caller[00000000005f211c]: handle_stripe+0x24/0x13e0
Caller[00000000005f4450]: make_request+0x358/0x600
Caller[0000000000542890]: generic_make_request+0x198/0x220
Caller[00000000005eb240]: sync_request+0x608/0x640
Caller[00000000005fef7c]: md_do_sync+0x384/0x920
Caller[00000000005ff8f0]: md_thread+0x38/0x140
Caller[0000000000478b40]: kthread+0x48/0x80
Caller[00000000004273d0]: kernel_thread+0x38/0x60
Caller[0000000000478de0]: kthreadd+0x148/0x1c0
Instruction DUMP: 9210217c 7ff8f57f 90122398 <91d02005> 30680004 01000000 01000000 01000000 9de3bf00

	I suspect a major bug in the raid5 code, but I don't know how to debug it...

md7 was created by mdadm -C /dev/md7 -l1 -n2 /dev/md/d0 /dev/sdi1. /dev/md/d0 is a RAID5 volume, and sdi is an iSCSI disk.
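
As a sanity check on the sizes in the dmesg above, here is a minimal sketch (Python, not part of the kernel or mdadm) that converts the sector count reported for sdi into decimal megabytes and compares it with the resync total reported for md7. It assumes that the "blocks" in the md resync line are 1 KiB units; the function names are only illustrative.

```python
# Figures copied from the dmesg output above.
SECTOR_BYTES = 512          # "512-byte hardware sectors"
MD_BLOCK_BYTES = 1024       # assumption: md reports resync size in 1 KiB blocks

SDI_SECTORS = 2929451520    # sd 3:0:0:0: [sdi] 2929451520 512-byte hardware sectors
RESYNC_BLOCKS = 1464725632  # md: using 256k window, over a total of 1464725632 blocks


def sectors_to_mb(sectors):
    """Convert a sector count to decimal megabytes (10**6 bytes), rounded down."""
    return sectors * SECTOR_BYTES // 10**6


def gap_kib(sectors, blocks):
    """Space on the disk not covered by the resync, in KiB."""
    return (sectors * SECTOR_BYTES - blocks * MD_BLOCK_BYTES) // 1024


if __name__ == "__main__":
    print("sdi size (MB):", sectors_to_mb(SDI_SECTORS))
    print("not resynced (KiB):", gap_kib(SDI_SECTORS, RESYNC_BLOCKS))
```

Under that assumption the disk size matches the 1499879 MB printed by the sd driver, and the resync covers all but 128 KiB of sdi, so the mirror spans essentially the whole iSCSI disk.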

	Regards,

	JKB