RE: lpfc RAID1 device panics when one device goes away


Mark,

I may be wrong here, and maybe someone out there knows better, but I don't think this will work without PowerPath.  PowerPath lets the OS treat both your HBAs as one path and load-balances across them.  Without it you have two independent connections to two LUNs, and that is what is causing the panic.  You need something that presents both connections as one.  Even if both HBAs can talk to both LUNs, the OS is not going to fail over to the one that is working without some sort of go-between; the kernel does not know it can reach the LUNs via either HBA.  It just knows that it had two connections to the RAID and one of them is gone, so the RAID is no longer available.  At least that is the way it would seem to work to me.
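One quick way to see how the kernel views the two paths (just a diagnostic sketch; the host and device numbers will differ on your box):

```shell
# On a 2.4 kernel, list every SCSI device the kernel knows about.
# If sdc and sde hang off different scsi host numbers (the two
# Emulex HBAs), the kernel really is treating them as two completely
# independent connections, with no idea they reach the same storage.
cat /proc/scsi/scsi

# The lpfc driver also keeps per-adapter state under /proc/scsi/lpfc
# if you want to check link state per HBA (exact layout depends on
# the driver version):
ls /proc/scsi/lpfc
```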

My 2 cents.  Let me know if you find out something different though.
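P.S. For reference, here is roughly what I'd expect your interim md setup to look like -- a sketch only, using the device names from your mail and assuming mdadm is installed (raidtools syntax would differ); I haven't tried this on AS 3.0 myself:

```shell
# Mark the partition on each SAN LUN as type "fd" (Linux raid
# autodetect) so the kernel assembles the mirror at boot.
sfdisk --id /dev/sdc 1 fd
sfdisk --id /dev/sde 1 fd

# Mirror the two single-path LUNs into /dev/md10.
mdadm --create /dev/md10 --level=1 --raid-devices=2 /dev/sdc1 /dev/sde1

# To exercise the failover path without pulling a cable, fail one
# member by hand and watch the array degrade cleanly:
mdadm --manage /dev/md10 --fail /dev/sde1
cat /proc/mdstat
```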

Drew

-----Original Message-----
From: Bruen, Mark [mailto:mbruen@xxxxxxxxxxxxxx]
Sent: Friday, January 30, 2004 8:54 AM
To: redhat-list@xxxxxxxxxx
Subject: Re: lpfc RAID1 device panics when one device goes away


No, it worked once but then panicked again on the next test. I'll keep
looking.
        -Mark

Hamilton Andrew wrote:
> Did that fix it?  I have an EMC CX600 configured much the same way, but
> I'm using RHEL 2.1AS instead of 3.0.  I'm sure there are a ton of
> differences between the two distros.
>
> -----Original Message-----
> From: Bruen, Mark [mailto:mbruen@xxxxxxxxxxxxxx]
> Sent: Wednesday, January 28, 2004 7:09 PM
> To: redhat-list@xxxxxxxxxx
> Subject: Re: lpfc RAID1 device panics when one device goes away
>
>
> I think I have fixed this by changing the partition type on each LUN's
> (disk) partition to "fd" (Linux raid autodetect).
>
> Bruen, Mark wrote:
>  > That will be the config once Veritas and/or EMC support HBA path
>  > failover on Red Hat AS 3.0. Veritas will support it with DMP in version 4,
>  > due in Q2/04; EMC has not committed to a date yet for PowerPath. In the
>  > interim I'm trying to provide path failover using software RAID1 of two
>  > hardware RAID5 LUNs, one on each path (two switches connected to two
>  > storage processors connected to two HBAs per server).
>  >     -Mark
>  >
>  > Hamilton Andrew wrote:
>  >
>  >> What's your SAN?  Why don't you configure your raid1 on the SAN and
>  >> let it publish that raid group as 1 LUN?  Are you using any kind of
>  >> fibre switch between your cards and your SAN?
>  >>
>  >> Drew
>  >>
>  >> -----Original Message-----
>  >> From: Bruen, Mark [mailto:mbruen@xxxxxxxxxxxxxx]
>  >> Sent: Wednesday, January 28, 2004 3:28 PM
>  >> To: redhat-list@xxxxxxxxxx
>  >> Subject: lpfc RAID1 device panics when one device goes away
>  >>
>  >>
>  >> I'm running RedHat AS 3.0 kernel 2.4.21-4.ELsmp on a Dell 1750 with 2
>  >> Emulex
>  >> LP9002DC-E HBAs. I've configured a RAID1 device called /dev/md10 from
>  >> 2 SAN
>  >> based LUNs /dev/sdc and /dev/sde. Everything works fine until I
>  >> disable one of
>  >> the HBA paths to the disk. Here's the console output:
>  >> [root@reacher root]# !lpfc1:1306:LKe:Link Down Event received Data: x2
>  >> x2 x0 x20
>  >>   I/O error: dev 08:40, sector 69792
>  >> raid1: Disk failure on sde, disabling device.
>  >>          Operation continuing on 1 devices
>  >> md10: no spare disk to reconstruct array! -- continuing in degraded mode
>  >>   I/O error: dev 08:40, sector 70288
>  >>   I/O error: dev 08:40, sector 70536
>  >>   I/O error: dev 08:40, sector 70784
>  >>   I/O error: dev 08:40, sector 71032
>  >>   I/O error: dev 08:40, sector 71280
>  >>   [... dozens more "I/O error: dev 08:40, sector NNNNN" lines, the
>  >>   sector numbers stepping by 248 up through 84056; the console output
>  >>   was badly garbled here by interleaved printk messages ...]
>  >> Unable to handle kernel paging request at virtual address a0fb8488
>  >>   printing eip:
>  >> c011f694
>  >> *pde = 00000000
>  >> Oops: 0000
>  >> lp parport autofs tg3 floppy microcode keybdev mousedev hid input
>  >> usb-ohci
>  >> usbcore ext3 jbd raid1 raid0 lpfcdd mptscsih mptbase sd_mod scsi_mod
>  >> CPU:    -1041286984
>  >> EIP:    0060:[<c011f694>]    Not tainted
>  >> EFLAGS: 00010087
>  >>
>  >> EIP is at do_page_fault [kernel] 0x54 (2.4.21-4.ELsmp)
>  >> eax: f55ac544   ebx: f55ac544   ecx: a0fb8488   edx: e0b3c000
>  >> esi: c1ef4000   edi: c011f640   ebp: 000000f0   esp: c1ef40c0
>  >> ds: 0068   es: 0068   ss: 0068
>  >> Process Dmu (pid: 0, stackpage=c1ef3000)
>  >> Stack: 00000000 00000002 022c1008 c1eeee4c c1eff274 00000000 00000000
>  >> a0fb8488
>  >>         c17c4520 f58903f4 00000000 c1efd764 c1eee5fc f7fe53c4 00030001
>  >> 00000000
>  >>         00000002 022c100c c1efd780 c1eeba44 00000000 00000000 00000003
>  >> c1b968ec
>  >> Call Trace:   [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef4178)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef419c)
>  >> [<c011f694>] do_page_fault [kernel] 0x54 (0xc1ef41b4)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef4278)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef429c)
>  >> [<c011f694>] do_page_fault [kernel] 0x54 (0xc1ef42b4)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef4378)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef439c)
>  >> [<c011f694>] do_page_fault [kernel] 0x54 (0xc1ef43b4)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef4478)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef449c)
>  >> [<c011f694>] do_page_fault [kernel] 0x54 (0xc1ef44b4)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef4578)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef459c)
>  >> [<c011f694>] do_page_fault [kernel] 0x54 (0xc1ef45b4)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef4678)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef469c)
>  >> [<c011f694>] do_page_fault [kernel] 0x54 (0xc1ef46b4)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef4778)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef479c)
>  >> [<c011f694>] do_page_fault [kernel] 0x54 (0xc1ef47b4)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef4878)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef489c)
>  >> [<c011f694>] do_page_fault [kernel] 0x54 (0xc1ef48b4)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef4978)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef499c)
>  >> [<c011f694>] do_page_fault [kernel] 0x54 (0xc1ef49b4)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef4a78)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef4a9c)
>  >> [<c011f694>] do_page_fault [kernel] 0x54 (0xc1ef4ab4)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef4b78)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef4b9c)
>  >> [<c011f694>] do_page_fault [kernel] 0x54 (0xc1ef4bb4)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef4c78)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef4c9c)
>  >> [<c011f694>] do_page_fault [kernel] 0x54 (0xc1ef4cb4)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef4d78)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef4d9c)
>  >> [<c011f694>] do_page_fault [kernel] 0x54 (0xc1ef4db4)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef4e78)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef4e9c)
>  >> [<c011f694>] do_page_fault [kernel] 0x54 (0xc1ef4eb4)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef4f78)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef4f9c)
>  >> [<c011f694>] do_page_fault [kernel] 0x54 (0xc1ef4fb4)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef5078)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef509c)
>  >> [<c011f694>] do_page_fault [kernel] 0x54 (0xc1ef50b4)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef5178)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef519c)
>  >> [<c011f694>] do_page_fault [kernel] 0x54 (0xc1ef51b4)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef5278)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef529c)
>  >> [<c011f694>] do_page_fault [kernel] 0x54 (0xc1ef52b4)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef5378)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef539c)
>  >> [<c011f694>] do_page_fault [kernel] 0x54 (0xc1ef53b4)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef5478)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef549c)
>  >> [<c011f694>] do_page_fault [kernel] 0x54 (0xc1ef54b4)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef5578)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef559c)
>  >> [<c011f694>] do_page_fault [kernel] 0x54 (0xc1ef55b4)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef5678)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef569c)
>  >> [<c011f694>] do_page_fault [kernel] 0x54 (0xc1ef56b4)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef5778)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef579c)
>  >> [<c011f694>] do_page_fault [kernel] 0x54 (0xc1ef57b4)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef5878)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef589c)
>  >> [<c011f694>] do_page_fault [kernel] 0x54 (0xc1ef58b4)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef5978)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef599c)
>  >> [<c011f694>] do_page_fault [kernel] 0x54 (0xc1ef59b4)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef5a78)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef5a9c)
>  >> [<c011f694>] do_page_fault [kernel] 0x54 (0xc1ef5ab4)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef5b78)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef5b9c)
>  >> [<c011f694>] do_page_fault [kernel] 0x54 (0xc1ef5bb4)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef5c78)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef5c9c)
>  >> [<c011f694>] do_page_fault [kernel] 0x54 (0xc1ef5cb4)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef5d78)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef5d9c)
>  >> [<c011f694>] do_page_fault [kernel] 0x54 (0xc1ef5db4)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef5e78)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef5e9c)
>  >> [<c011f694>] do_page_fault [kernel] 0x54 (0xc1ef5eb4)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef5f78)
>  >> [<c011f640>] do_page_fault [kernel] 0x0 (0xc1ef5f9c)
>  >> [<c011f694>] do_page_fault [kernel] 0x54 (0xc1ef5fb4)
>  >>
>  >> Code: 8b 82 88 c4 47 c0 8b ba 84 c4 47 c0 01 f8 85 c0 0f 85 46 01
>  >>
>  >> Kernel panic: Fatal exception
>  >>
>  >> Any ideas?
>  >> Thanks.
>  >>         -Mark
>  >>
>  >>
>  >> --
>  >> redhat-list mailing list
>  >> unsubscribe mailto:redhat-list-request@xxxxxxxxxx?subject=unsubscribe
>  >> https://www.redhat.com/mailman/listinfo/redhat-list
>  >>
>  >
>
> --
> What information consumes is rather obvious: it consumes the attention of
> its recipients. Hence a wealth of information creates a poverty of
> attention.
> -Herbert Simon, Computer Scientist, Psychologist, Nobel Laureate
> in Economics, 1978
>
> The information in this electronic mail message is Trilegiant
> Confidential and may be legally privileged. It is intended solely for
> the addressee(s). Access to this Internet electronic mail message by
> anyone else is unauthorized. If you are not the intended recipient, any
> disclosure, copying, distribution or action taken or omitted to be
> taken in reliance on it is prohibited and may be unlawful.
>
> The sender believes that this E-mail and any attachments were free of
> any virus, worm, Trojan horse, and/or malicious code when sent. This
> message and its attachments could have been infected during transmission.
> By reading the message and opening any attachments, the recipient accepts
> full responsibility for taking protective and remedial action about
> viruses and other defects. Trilegiant Corporation is not liable for any
> loss or damage arising in any way from this message or its attachments.
>
>
>
