Re: sdc1 does not have a valid v0.90 superblock, not importing!

On Wed, 11 Aug 2010 04:19:07 -0700 (PDT)
Jon Hardcastle <jd_hardcastle@xxxxxxxxx> wrote:

> 
> --- On Wed, 11/8/10, Neil Brown <neilb@xxxxxxx> wrote:
> 
> > From: Neil Brown <neilb@xxxxxxx>
> > Subject: Re:  sdc1 does not have a valid v0.90 superblock, not importing!
> > To: Jon@xxxxxxxxxxxxxxx
> > Cc: jd_hardcastle@xxxxxxxxx, linux-raid@xxxxxxxxxxxxxxx
> > Date: Wednesday, 11 August, 2010, 12:06
> > On Wed, 11 Aug 2010 02:55:44 -0700 (PDT)
> > Jon Hardcastle <jd_hardcastle@xxxxxxxxx> wrote:
> > 
> > > (my first attempt appears to have been bounced as the spam checker
> > > thought it had HTML in it?!)
> > 
> > odd... came through ok for me the first time.
> > 
> > > 
> > > Help!
> > > 
> > > Long story short - I was watching a movie off my RAID6 array. Got a
> > > SMART error warning
> > 
> > > Aug 10 22:00:07 mangalore kernel: raid5: cannot start dirty degraded array for md4
> > 
> > This is the current problem.  The array is dirty and degraded so there
> > could theoretically be undetectable corruption.  The chance is quite low
> > but it is there, so md won't start without you acknowledging the risk by
> > giving the --force flag to mdadm --assemble.
> > Only do that if you are confident that your hardware is working correctly.
> 
> Well, I am reasonably sure the controller came adrift the first time. When I reseated it I stopped getting hundreds of errors, and it has survived 1.5 badblocks checks. It is being held in place by one of those bars you press down (it secures all the expansion cards in one go), except I don't think it is very good. I will screw it down.
> 
> > 
> > > It appears sdc has an invalid superblock?
> > > 
> > > This is the 'examine' from sdc1 (note the checksum)
> > > 
> > > /dev/sdc1:
> > .....
> > >       Checksum : b335b4e3 - expected b735b4e3
> > 
> > Single bit error.  That isn't good as it means some bit of memory or some
> > bit on some bus somewhere cannot be trusted.
> > It could be a transient thing and will never happen again.  Or maybe not.
> > Given the SMART errors and the fact that you have had problems with the
> > drive before, it seems very likely that the problem is in that drive.  I
> > suggest unplugging it and leaving it unplugged.  Some memory buffer in the
> > drive is probably marginal.  I don't think they use ECC memory.
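
The "single bit error" diagnosis can be checked by XOR-ing the two checksum values quoted above. What follows is a minimal sketch in Python; the helper name differing_bits is made up for illustration, and only the two hex values come from the --examine output in this thread.

```python
def differing_bits(a, b):
    """Return the bit positions (0 = least significant) where a and b differ."""
    x = a ^ b
    return [i for i in range(x.bit_length()) if (x >> i) & 1]

found = 0xb335b4e3      # checksum stored in the sdc1 superblock
expected = 0xb735b4e3   # checksum mdadm computed from the superblock contents

bits = differing_bits(found, expected)
print(f"xor = {found ^ expected:#010x}, differing bit positions = {bits}")
```

A result with exactly one differing position is what a single flipped bit looks like; damage from a torn or partial write would normally differ in many positions.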
> 
> Could this be a result of me forcing a power off when the drive was causing problems?

Probably not.  Forcing a power off may well have left the array 'dirty' so
that it wouldn't assemble, but is fairly unlikely to corrupt data within a
block.

> 
> What are the dangers of removing it, zeroing the superblock and re-adding it? Is it MORE dangerous than leaving a RAID6 degraded for a few days?

In general, I would say the chance of a known-bad drive causing problems is
greater than the chance of fewer known-good drives causing problems.
But then you seem to think it isn't the drive, it was the controller and that
is fixed...

This is really about your level of trust in the hardware.
If you trust sdc as much as the others, include it in the array.
If you don't, then don't.

NeilBrown



> 
> > 
> > > 
> > > Anyways... I am ASSUMING mdadm has not assembled the array to be on the
> > > safe side? I have not done anything.. no force... no assume clean.. I
> > > wanted to be sure?
> > 
> > You assume correctly.
> > 
> > > 
> > > Should I remove sdc1 from the array? It should then assemble? I have 2
> > > spare drives that I am getting around to using to replace this drive
> > > and the other 500GB.. so should I remove sdc1... and try to re-add it,
> > > or just put the new drive in?
> > > 
> > > atm I have 'stop'ped the array and got badblocks running....
> > > 
> > 
> > Remove sdc and assemble the array with --force, and get a new device to
> > replace /dev/sdc as soon as possible.
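
As a rough illustration of that advice, the sketch below only builds and prints the forced-assemble command line; it never runs mdadm. The array name md4 and the excluded /dev/sdc1 come from this thread. Every other device name is a hypothetical placeholder, since the thread does not list the array's members, and the helper forced_assemble_cmd is made up for the example.

```python
def forced_assemble_cmd(array, members, exclude):
    """Build (but do not run) an mdadm command that force-assembles
    `array` from `members`, leaving out every device in `exclude`."""
    kept = [dev for dev in members if dev not in exclude]
    return ["mdadm", "--assemble", "--force", array] + kept

# ASSUMPTION: hypothetical member list. Only /dev/sdc1 and md4 appear in the
# thread; substitute the real members of your own array.
members = ["/dev/sda1", "/dev/sdb1", "/dev/sdc1", "/dev/sdd1", "/dev/sde1"]

cmd = forced_assemble_cmd("/dev/md4", members, exclude={"/dev/sdc1"})
print(" ".join(cmd))
```

Printing the command first gives a chance to confirm that the suspect device really is absent from the list before anything is run as root.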
> 
> Thanks Neil - I panicked, as previously it has mounted the array in a degraded state... but previously the drive had disappeared completely, whereas in this case it is present... but wrong!
> 
> > 
> > NeilBrown
> > --
> > To unsubscribe from this list: send the line "unsubscribe
> > linux-raid" in
> > the body of a message to majordomo@xxxxxxxxxxxxxxx
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > 
> 
> 
>       
