Re: Problem Syncing raid1

Mathias Burén <mathias.buren@xxxxxxxxx> · Thu, 3 Apr 2014 20:58:10 +0100

On 3 April 2014 20:54, Christian Schmitz <schnet@xxxxxxxxxxx> wrote:
> Well, I ran
> dd if=server.img of=/dev/sde2
> where sde is the new disk for the server data.
> dd aborted with an input/output error
> at 750 MB (0.7G of the 550G partition).
> So, as you suggested, I ran
> smartctl -t long /dev/sde2
> It took 150 minutes.
> After that I ran:
> dd if=server.img of=/dev/sde2
> It has been running for 6 hours, so the error at 750 MB seems to be solved.
>
> I am now starting smartctl -t long on all the hard disks, hehe.
>
> Best regards
> Christian
>
> be free, be linux
>
>
> ----------------------------------------
>> From: schnet@xxxxxxxxxxx
>> To: linux-raid@xxxxxxxxxxxxxxx
>> Subject: RE: Problem Syncing raid1
>> Date: Wed, 2 Apr 2014 22:18:10 +0000
>>
>> Thanks, I will run smartctl as soon as I finish recovering the server data.
>>
>> I will report back tomorrow.
>>
>> Best regards.
>> Christian
>>
>>
>> be free, be linux
>>
>>
>> ----------------------------------------
>>> Date: Wed, 2 Apr 2014 18:08:31 -0400
>>> Subject: Re: Problem Syncing raid1
>>> From: sdvileskis@xxxxxxxxx
>>> To: schnet@xxxxxxxxxxx
>>> CC: mathias.buren@xxxxxxxxx; linux-raid@xxxxxxxxxxxxxxx
>>>
>>> The good news is your realloc_sector_cnt is 0.
>>>
>>> However, run:
>>> smartctl -t long /dev/sda
>>> smartctl -t long /dev/sdb
>>>
>>> if you haven't done it in a while.
>>> It will tell the controller on the disk to check all the sectors and remap any bad ones.
>>>
>>> It will generally take several hours to complete, but if SMART detects
>>> any bad sectors it will attempt to remap them. (The OS and badblocks
>>> won't know the difference)
>>> You can run the smart test while the disk is online/mounted without
>>> problems, but the more you lay off the disks, the faster it will run.
>>>
>>> On Wed, Apr 2, 2014 at 5:56 PM, Christian Schmitz <schnet@xxxxxxxxxxx> wrote:
>>>> The system freezes. I suspect a bad sector, because it always happens at the same point of the sync.
>>>>
>>>> I did a low-level format and ran all the tests on SDB, so the problem is in SDA.
>>>>
>>>> Like déjà vu, my server had exactly the same problem; I am now running ddrescue on the server disk and it found bad blocks. This reinforces my suspicion of a bad block.
>>>>
>>>> #smartctl -a /dev/sda
>>>> SMART overall-health self-assessment test result: PASSED
>>>> ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
>>>>   1 Raw_Read_Error_Rate     0x002f   200   193   051    Pre-fail  Always       -       25
>>>>   3 Spin_Up_Time            0x0027   201   196   021    Pre-fail  Always       -       916
>>>>   4 Start_Stop_Count        0x0032   099   099   000    Old_age   Always       -       1032
>>>>   5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
>>>>
>>>> I think the bad block was caused by a power failure, so it is a logical bad block and not a physical one. But to low-level format the disk and run the manufacturer's tools, I first need to sync the RAID to move the data to the SDB disk.
>>>>
>>>> How can I use badblocks on a RAID device?
>>>> badblocks /dev/md0?
>>>> badblocks /dev/sda2?
>>>> Will the RAID hang if badblocks does something?
>>>>
>>>> Thanks for your answer.
>>>> Christian
>>>> be free, be linux
>>>>
>>>>
>>>> ----------------------------------------
>>>>> Date: Wed, 2 Apr 2014 22:36:49 +0100
>>>>> Subject: Re: Problem Syncing raid1
>>>>> From: mathias.buren@xxxxxxxxx
>>>>> To: schnet@xxxxxxxxxxx
>>>>> CC: linux-raid@xxxxxxxxxxxxxxx
>>>>>
>>>>> On 2 April 2014 21:56, Christian Schmitz <schnet@xxxxxxxxxxx> wrote:
>>>>>> Hi everyone,
>>>>>> I have a Linux system with the following configuration:
>>>>>> /dev/md0 (degraded mode) (/dev/sda2)
>>>>>>
>>>>>> /dev/md1 (full mode) (/dev/sda3 /dev/sdb3)
>>>>>>
>>>>>> The problem is that if I add /dev/sdb2 to /dev/md0, the sync starts, but when it reaches 7% the system crashes.
>>>>>> I am sure that the main problem is one (or more) bad blocks on /dev/sda2.
>>>>>
>>>>> Crash? What does that mean? Kernel panic? Freeze? Reboot?
>>>>> How do you know sda2 (which is sda) has bad blocks? Does it tell you?
>>>>> Did you check the SMART data and run badblocks (non-destructive if you
>>>>> want) on it?
>>>>>
>>>>>>
>>>>>> How can I get the disk to sync?
>>>>>> Obviously the first step is to replace the disk.
>>>>>
>>>>> If it's broken, sure.
>>>>>
>>>>>>
>>>>>> Best Regards
>>>>>> Christian
>>>>>>
>>>>>
>>>>>
>>>>> Regards,
>>>>> Mathias
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
>>>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html

(please don't top post)

It appears you are jumping all over the place. Why don't you:

a) run smartctl -t long on all drives involved
b) post the smartctl -a output of all drives
c) run badblocks in destructive or non-destructive mode (dd is not the
same, but can give an indication) on the drives which aren't healthy
d) take it from there
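
The steps above could be sketched roughly as follows. This is only a dry run: the helper name plan_checks is made up for illustration, the device names are examples, and the script prints the commands instead of running them, so nothing touches the disks until you paste a line yourself. The non-destructive badblocks mode (-n) is assumed here; the destructive mode (-w) would overwrite everything on the drive.

```shell
#!/bin/sh
# Dry-run sketch: print the diagnostic commands for each drive rather than
# executing them. plan_checks is a hypothetical helper, not an existing tool.

plan_checks() {
    for dev in "$@"; do
        # a) start the long (extended) SMART self-test on the drive
        echo "smartctl -t long $dev"
        # b) after the test has finished, dump all SMART data to post to the list
        echo "smartctl -a $dev"
        # c) non-destructive read-write scan (-n) with progress (-s), verbose (-v)
        echo "badblocks -nsv $dev"
    done
}

# Example device names only; substitute the drives actually involved.
plan_checks /dev/sda /dev/sdb
```

Note that the long self-test runs in the background on the drive, so wait for it to finish before collecting the smartctl -a output, and run badblocks only on a device that is not mounted.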

Mathias