Re: Raid-6 won't boot

Yes, I had added a drive, and the array was busy copying data to the new
drive when the reshape gradually slowed down and the system eventually
locked up.  I didn't change raid configurations or anything like that - I
just added a drive.  I didn't use any external files, so I'm not sure I'd be
able to recover any... I suspect not...

thanks,
allie
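
The option being asked about below is most likely `--backup-file` on `mdadm --assemble`, which hands mdadm the file that was given to the original grow command. A minimal sketch, assuming bash: the function name `print_assemble_cmd`, the array name, the backup path and the device names are all hypothetical placeholders, and the function only prints the command instead of running it. It is relevant only when a backup file was actually used for the reshape, which was not the case here.

```shell
#!/usr/bin/env bash
# print_assemble_cmd ARRAY BACKUP_FILE DEVICE...
# Prints (does not run) an mdadm command that assembles ARRAY from the
# given member devices and passes the reshape backup file to mdadm.
print_assemble_cmd() {
    local array=$1 backup=$2
    shift 2
    printf 'mdadm --assemble %s --backup-file=%s' "$array" "$backup"
    printf ' %s' "$@"
    printf '\n'
}

# Hypothetical names, for illustration only:
#   print_assemble_cmd /dev/md126 /root/reshape.bak /dev/sda3 /dev/sdb3
```

Printing the command first makes it easy to review the device list before running anything against a half-reshaped array.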

On 3/31/2020 5:16 PM, Roger Heflin wrote:
> Were you doing a reshape when it was rebooted?  And if so, did you
> have to use an external file when doing the reshape, and where was that
> file?  I think there is a command to restart a reshape using an
> external file.
> 
> On Tue, Mar 31, 2020 at 11:13 AM Alexander Shenkin <al@xxxxxxxxxxx> wrote:
>>
>> quick followup: trying a stop and assemble results in the message that
>> it "Failed to restore critical section for reshape, sorry".
>>
>> On 3/31/2020 11:08 AM, Alexander Shenkin wrote:
>>> Thanks Roger,
>>>
>>> It seems only the Raid1 module is loaded.  I didn't find a
>>> straightforward way to get that module loaded... any suggestions?  Or,
>>> will I have to find another livecd that contains raid456?
>>>
>>> Thanks,
>>> Allie
>>>
>>> On 3/30/2020 9:45 PM, Roger Heflin wrote:
>>>> They all seem to be there, all seem to report all 7 disks active, so
>>>> it does not appear to be degraded. All event counters are the same.
>>>> Something has to be causing them to not be scanned and assembled at
>>>> all.
>>>>
>>>> Is the rescue disk a similar OS to what you have installed?  If it is,
>>>> you might try a different livecd (say, Fedora) and see if it acts any
>>>> differently.
>>>>
>>>> what does fdisk -l /dev/sda look like?
>>>>
>>>> Is the raid456 module loaded (lsmod | grep raid)?
>>>>
>>>> what does cat /proc/cmdline look like?
>>>>
>>>> you might also run this:
>>>> file -s /dev/sd*3
>>>> But I think it is going to show us the same thing as what the mdadm
>>>> --examine is reporting.
>>>>
>>>> On Mon, Mar 30, 2020 at 3:05 PM Alexander Shenkin <al@xxxxxxxxxxx> wrote:
>>>>>
>>>>> See attached.  I should mention that the last drive i added is on a new
>>>>> controller that is separate from the other drives, but seemed to work
>>>>> fine for a bit, so kinda doubt that's the issue...
>>>>>
>>>>> thanks,
>>>>>
>>>>> allie
>>>>>
>>>>> On 3/30/2020 6:21 PM, Roger Heflin wrote:
>>>>>> do this against each partition that had it:
>>>>>>
>>>>>>  mdadm --examine /dev/sd***
>>>>>>
>>>>>> It seems like it is not seeing it as a md-raid.
>>>>>>
>>>>>> On Mon, Mar 30, 2020 at 11:13 AM Alexander Shenkin <al@xxxxxxxxxxx> wrote:
>>>>>>> Thanks Roger,
>>>>>>>
>>>>>>> The only line that isn't commented out in /etc/mdadm.conf is "DEVICE
>>>>>>> partitions"...
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Allie
>>>>>>>
>>>>>>> On 3/30/2020 4:53 PM, Roger Heflin wrote:
>>>>>>>> That seems really odd.  Is the raid456 module loaded?
>>>>>>>>
>>>>>>>> On mine I see messages like this for each disk it scanned and
>>>>>>>> considered as a possible array member:
>>>>>>>>  kernel: [   83.468700] md/raid:md13: device sdi3 operational as raid disk 5
>>>>>>>> and messages like this:
>>>>>>>>  md/raid:md14: not clean -- starting background reconstruction
>>>>>>>>
>>>>>>>> You might look at /etc/mdadm.conf on the rescue cd and see if it has a
>>>>>>>> DEVICE line that limits what is being scanned.
>>>>>>>>
>>>>>>>> On Mon, Mar 30, 2020 at 10:13 AM Alexander Shenkin <al@xxxxxxxxxxx> wrote:
>>>>>>>>> Thanks Roger,
>>>>>>>>>
>>>>>>>>> that grep just returns the detection of the raid1 (md127).  See dmesg
>>>>>>>>> and mdadm --detail results attached.
>>>>>>>>>
>>>>>>>>> Many thanks,
>>>>>>>>> allie
>>>>>>>>>
>>>>>>>>> On 3/28/2020 1:36 PM, Roger Heflin wrote:
>>>>>>>>>> Try this grep:
>>>>>>>>>> dmesg | grep "md/raid".  If that returns nothing, can you just send
>>>>>>>>>> the entire dmesg?
>>>>>>>>>>
>>>>>>>>>> On Sat, Mar 28, 2020 at 2:47 AM Alexander Shenkin <al@xxxxxxxxxxx> wrote:
>>>>>>>>>>> Thanks Roger.  dmesg has nothing in it referring to md126 or md127....
>>>>>>>>>>> any other thoughts on how to investigate?
>>>>>>>>>>>
>>>>>>>>>>> thanks,
>>>>>>>>>>> allie
>>>>>>>>>>>
>>>>>>>>>>> On 3/27/2020 3:55 PM, Roger Heflin wrote:
>>>>>>>>>>>> A non-assembled array always reports raid1.
>>>>>>>>>>>>
>>>>>>>>>>>> I would run "dmesg | grep md126" to start with and see what it reports it saw.
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Mar 27, 2020 at 10:29 AM Alexander Shenkin <al@xxxxxxxxxxx> wrote:
>>>>>>>>>>>>> Thanks Wol,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Booting in SystemRescueCD and looking in /proc/mdstat, two arrays are
>>>>>>>>>>>>> reported.  The first (md126) is reported as inactive with all 7 disks
>>>>>>>>>>>>> listed as spares.  The second (md127) is reported as active
>>>>>>>>>>>>> auto-read-only with all 7 disks operational.  Also, the only
>>>>>>>>>>>>> "personality" reported is Raid1.  I could go ahead with your suggestion
>>>>>>>>>>>>> of mdadm --stop array and then mdadm --assemble, but I thought the
>>>>>>>>>>>>> reporting of just the Raid1 personality was a bit strange, so wanted to
>>>>>>>>>>>>> check in before doing that...
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Allie
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 3/26/2020 10:00 PM, antlists wrote:
>>>>>>>>>>>>>> On 26/03/2020 17:07, Alexander Shenkin wrote:
>>>>>>>>>>>>>>> I surely need to boot with a rescue disk of some sort, but from there,
>>>>>>>>>>>>>>> I'm not sure exactly what I should do.  Any suggestions are very welcome!
>>>>>>>>>>>>>> Okay. Find a liveCD that supports raid (hopefully something like
>>>>>>>>>>>>>> SystemRescueCD). Make sure it has a very recent kernel and the latest
>>>>>>>>>>>>>> mdadm.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> All being well, the resync will restart, and when it's finished your
>>>>>>>>>>>>>> system will be fine. If it doesn't restart on its own, do an "mdadm
>>>>>>>>>>>>>> --stop array", followed by an "mdadm --assemble"
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> If that doesn't work, then
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> https://raid.wiki.kernel.org/index.php/Linux_Raid#When_Things_Go_Wrogn
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>> Wol
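
The personality check that comes up repeatedly in this thread can be sketched as a small helper. This is a minimal sketch, assuming bash on the rescue system. The function name `has_personality` and its optional file argument are hypothetical, added so the check can also be run against a saved copy of /proc/mdstat. It reports whether a given personality (for example raid6) appears on the Personalities line. If it does not, loading the module as root with `modprobe raid456` would be the usual next step, provided the rescue image ships that module.

```shell
#!/usr/bin/env bash
# has_personality NAME [FILE]
# Succeeds if NAME (e.g. raid6) is listed on the "Personalities" line.
# FILE defaults to /proc/mdstat; the parameter is a hypothetical convenience
# so that a saved copy can be checked as well.
has_personality() {
    local name=$1 file=${2:-/proc/mdstat}
    # The line looks like: Personalities : [raid1] [raid6] [raid5] [raid4]
    grep '^Personalities' "$file" | grep -q -F "[$name]"
}

# Example use on a live system:
#   has_personality raid6 || echo "raid6 not available; try: modprobe raid456"
```

The check matters because a raid6 array cannot be started while only the raid1 personality is registered, which would be consistent with the symptoms reported above.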


