Re: Problem with softwareraid




18. Aug 2017 13:35 by euroregistrar@xxxxxxxxx:


> Hello all,
>
> i have already had a discussion on the software raid mailinglist and i
> want to switch to this one :)
>
> I am having a really strange problem with my md0 device running
> centos7. after a new start of my server the md0 was gone. now after
> trying to find the problem i detected the following:
>
> Booting any installed kernel gives me NO md0 device. (ls /dev/md*
> doesnt give anything). a 'cat /proc/partitions show me now
> /dev/sd[a-d]1 partition. partprobe and a mdadm assemble gives me "disk
> busy"
>
> [root@quad live]# cat mdstat
> Personalities : [raid6] [raid5] [raid4] [raid10]
> unused devices: <none>
>
> [root@quad ~]# partprobe
> device-mapper: remove ioctl on WDC_WD20EFRX-68AX9N0_WD-WMC301255087p1
> failed: Device or resource busy
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>snip




Are you definitely using cables rated for SATA III?  Have you checked the power connections?  Have you checked the power supply voltages during spin-up and afterwards?
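One rough way to look for cable trouble from software, assuming a Linux kernel log and smartmontools are available: the kernel prints the negotiated link speed for each port, and many drives count interface CRC errors in the SMART attribute UDMA_CRC_Error_Count. The function names and /dev/sda below are placeholders of mine, not anything from the original post.

```shell
# Print each ATA port whose link came up, with its negotiated speed.
# Reads kernel log text on stdin, e.g. a line such as
#   ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
sata_link_speeds() {
    sed -n 's/.*\(ata[0-9][0-9]*\): SATA link up \([0-9.]* Gbps\).*/\1 \2/p'
}

# Print the raw UDMA_CRC_Error_Count from `smartctl -A` output on stdin.
# The raw value is the 10th column of the attribute table.
smart_crc_errors() {
    awk '$2 == "UDMA_CRC_Error_Count" { print $10 }'
}

# Usage, as root (device name is a placeholder):
#   dmesg | sata_link_speeds
#   smartctl -A /dev/sda | smart_crc_errors
```

A port that negotiates below 6.0 Gbps on a SATA III controller and drive, or a CRC count that keeps rising, would point towards the cable or connector rather than the disk itself.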





Is there tension or a major twisting force on the SATA cables?  I've seen this cause intermittent problems, and it was solved by using a longer cable that reduced the stress at the connector.





Are the drives getting hot?  (Your model shouldn't have a heat issue under normal conditions.)  Are the drives bolted into a system?  Drives can be sensitive to vibration, and identical, unmounted drives will tend to shake each other and can produce rotational torque as well (especially when they are the same model, as they'll all have the same resonances in that case).  Either can cause problems with keeping the heads over the track reliably.




I'd definitely run all the SMART tests.  Start with the conveyance test, then the short self-test, and possibly the long test.  Do check the drive temperatures immediately after each test to make sure they aren't getting too hot.
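The test sequence above could be sketched with smartctl from smartmontools, assuming it is installed and the drive supports each test type (`smartctl -c` lists what it supports). The helper names and /dev/sda are placeholders I made up for illustration.

```shell
# Start one self-test on a drive; $1 = device, $2 = conveyance | short | long.
# The test runs inside the drive, so this returns immediately.
smart_start_test() {
    smartctl -t "$2" "$1"
}

# Print the raw value of the Temperature_Celsius attribute from
# `smartctl -A` output on stdin (the raw value is the 10th column).
smart_temp() {
    awk '$2 == "Temperature_Celsius" { print $10 }'
}

# Usage, as root, with /dev/sda as a placeholder device:
#   smartctl -c /dev/sda                  # supported tests and estimated run times
#   smart_start_test /dev/sda conveyance
#   smartctl -l selftest /dev/sda         # self-test log; check the test has finished
#   smartctl -A /dev/sda | smart_temp     # temperature right after the test
#   ...then repeat with "short" and, if wanted, "long"
```

Let each test finish before starting the next one, and repeat for every member disk.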





I assume you've done an fsck on the file systems?  If not, it might be good to check.




Are you using the motherboard's SATA interfaces or an add-on card?  If using a card, I'd check the firmware version on the card and what the manufacturer is offering for updates.




Are the drives still under warranty?  If so, try WD tech support.  Also check that all the RAID tools are properly installed with their dependencies met.  Other hardware or drivers could be interfering.  You might reset the BIOS to "optimized settings".  Which software RAID package are you using?
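A quick way to check that the tools are present, assuming the array was built with mdadm (the usual md tool on CentOS 7, though the original post does not say which package is in use). The function name `have_tool` is a placeholder of mine.

```shell
# Report whether a command is found on PATH.
# Prints "ok NAME" and returns 0, or "MISSING NAME" and returns 1.
have_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "ok $1"
    else
        echo "MISSING $1"
        return 1
    fi
}

# Usage (tool list is an assumption about what this setup needs):
#   for t in mdadm smartctl partprobe; do have_tool "$t"; done
# On an RPM-based system the package itself can also be checked:
#   rpm -q mdadm      # is the package installed, and which version
#   rpm -V mdadm      # verify installed files against the package database
```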





Other than that, I'd possibly suspect a software problem, though I'm not familiar with software RAID myself (I haven't used one, but I know what they are).  Or possibly a problem with the drive that is intermittent or complex in how it fails.


_______________________________________________
CentOS mailing list
CentOS@xxxxxxxxxx
https://lists.centos.org/mailman/listinfo/centos



