RE: F19 raid 1 stuck read only after reboot

>>>>> # cat /proc/mdstat
>>>>> Personalities : [raid0] [raid1]
>>>>> md124 : active raid1 sdb[1] sdc[0]
>>>>>       293032960 blocks super external:/md125/0 [2/2] [UU]
>>>>>
>>>>> md125 : inactive sdb[1](S) sdc[0](S)
>>>>>       6184 blocks super external:imsm
>>>>>
>>>>> md126 : active raid0 sdd[1] sde[0]
>>>>>       156296192 blocks super external:/md127/0 128k chunks
>>>>>
>>>>> md127 : inactive sde[1](S) sdd[0](S)
>>>>>       5032 blocks super external:imsm
>>>>
>>>> Super something on external devices which are inactive?
>>> They are all internal drives. That is the normal good state
>>> for the arrays. After a reboot md124 would show auto-read-only
>>> instead of active. When in that state I cannot mount the filesystem.
>> /
>> You mean the whole array md124 goes read-only?
> Yes, the entire array.
>> Or are the partitions on
>> it mounted ro?
> Unable to mount any partitions when the array is in auto-read-only state.

That's what I'd expect because when you mount them, changes would be
made, even if only the mount counter is updated.  I'm not sure if that
counter is updated when you mount a partition read only --- you could
try to do that.

I'll have to try that next time I reboot.
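
For reference, this is roughly what I plan to try (just a sketch; md124p1
and /mnt/test are placeholders for whatever partition and mount point
apply here):

# mdadm --detail /dev/md124 | grep -i state    # does the array itself report read-only?
# mkdir -p /mnt/test
# mount -o ro /dev/md124p1 /mnt/test           # attempt a read-only mount of one partition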

>> Why do you need to rewrite the partition tables?
> Well, that's the gist of this post. F19 fails to detect something needed
> for the array to work properly. Rewriting the partition table just gets
> things working again.

IIUC, rewriting the partition tables in your case means that you are
writing to the RAID volumes created by the on-board fakeraid while the
software RAID is using them.  Leaving aside that I wouldn't do such a
thing, it seems as if this somehow makes the volumes writable, the
software then reconfigures itself, and the software RAID becomes
writable again.

Perhaps there is some sort of timing problem.  You could look at the
service files that are involved with bringing up the RAID and see if
you can add or change waiting times.
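
If it comes to that, something along these lines might be a way to find
the units involved and to test a delay; "md-assemble.service" below is
only a placeholder, whatever unit the first command turns up would go
there:

# systemctl list-unit-files | grep -i -e mdadm -e mdmonitor -e raid
# mkdir -p /etc/systemd/system/md-assemble.service.d
# cat > /etc/systemd/system/md-assemble.service.d/delay.conf <<'EOF'
[Service]
# crude 10 second pause before the unit's own start commands run
ExecStartPre=/usr/bin/sleep 10
EOF
# systemctl daemon-reload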

>> How do you know when any of the physical disks go bad?
> Hopefully SMART will pick up if there is a disk problem.
> So far it reports no errors for either disk member of the array.

"SMART is not a reliable warning system for impending failure
Maybe not, but when I had one of these drives going bad about a year ago.
It did a great job of warning that I had an impending drive failure.
The relocated sector count would jump in small blocks over time.

The intel storage manager I use under Windows currently does not
have any warnings about the drives.

The bios screen that displays array health when the system comes up
shows both arrays as healthy.

Smart,Intel and the bios  were all in agreement when I had a failing disk in the past.

detection."[1]

I wouldn't rely on SMART to decide whether a disk has failed or not.  A
disk either works correctly or it doesn't (leaving connection problems
aside).  When it doesn't, just replace it.  SMART is rather irrelevant
for that.

You seem to have been using these disks for a couple of years, so some
kind of failure is possible.  It may be a coincidence that a failure is
discovered after you performed the upgrade, and it is also possible that
there already was a failure before you upgraded which is only being
detected now.

Do you see any messages in /var/log/messages that could be relevant?
Since you're using the on-board fakeraid, I'm not sure what you would
see or whether you'd see anything at all.  Maybe the BIOS tells you, or
it has an option to test the disks for failures.

These are the only messages that seem relevant from journalctl -xb:

Jul 19 07:35:26 knucklehead.xxxx.com kernel: device-mapper: table: 253:4: linear: dm-linear: Device lookup failed
Jul 19 07:35:26 knucklehead.xxxx.com kernel: device-mapper: ioctl: error adding target to table

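For completeness, roughly what I ran to dig for more; the grep pattern
is just a guess at what might be relevant:

# journalctl -b | grep -i -e md12 -e raid -e device-mapper
# dmsetup ls       # list device-mapper devices (major:minor, as in the 253:4 above)
# dmsetup table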

Do you have good backups?
Yes ........


[1]: http://www.tomshardware.com/reviews/ssd-reliability-failure-rate,2923-9.html

Alan

-- 
users mailing list
users@xxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe or change subscription options:
https://admin.fedoraproject.org/mailman/listinfo/users
Guidelines: http://fedoraproject.org/wiki/Mailing_list_guidelines
Have a question? Ask away: http://ask.fedoraproject.org