Re: weird issues with raid1

On Wed, Dec 17, 2008 at 10:50 PM, Jon Nelson
<jnelson-linux-raid@xxxxxxxxxxx> wrote:
> On Wed, Dec 17, 2008 at 10:42 PM, Neil Brown <neilb@xxxxxxx> wrote:
>> On Tuesday December 16, neilb@xxxxxxx wrote:
>>> On Monday December 15, jnelson-linux-raid@xxxxxxxxxxx wrote:
>>> > On Mon, Dec 15, 2008 at 3:33 PM, Neil Brown <neilb@xxxxxxx> wrote:
>>> > > On Monday December 15, jnelson-linux-raid@xxxxxxxxxxx wrote:
>>> > >>
>>> > >> Aha!  This explains a question I raised in another email. What
>>> > >> happened there is a previously fully active member of the raid got
>>> > >> added, somehow, as a spare, via --incremental. That's when the entire
>>> > >> raid thought it needed to be rebuilt. How did that (the device being
>>> > >> treated as a spare instead of as a previously fully active member)
>>> > >> happen?
>>> > >
>>> > > It is hard to guess without details, and they might be hard to collect
>>> > > after the fact.
>>> > > Maybe if you have the kernel logs of when the server rebooted and the
>>> > > recovery started, that might contain some hints.
>>> >
>>> > I hope this helps.
>>>
>>> Yes it does, though I generally prefer to get more complete logs.  If
>>> I get the surrounding log lines then I know what isn't there as well
>>> as what is - and it isn't always clear at first which bits will be
>>> important.
>>>
>>> The problem here is that --incremental doesn't provide the --re-add
>>> functionality that you are depending on.  That was an oversight on my
>>> part.  I'll see if I can get it fixed.
>>> In the meantime, you'll need to use --re-add (or --add, it does the
>>> same thing in your situation) to add nbd0 to the array.
>>
>> Actually, I'm wrong.
>> --incremental does do the right thing w.r.t. --re-add.
>> I couldn't reproduce your symptoms.
>
> OK.
>
>> It could be that you are hitting the bug fixed by
>>  commit a0da84f35b25875870270d16b6eccda4884d61a7
>
> That sure sounds like it. I'd have to log to see what happened,
> exactly, but I've added substantial logging around the device
> discovery and addition section which manages this particular raid.
>
>> You would need 2.6.26 or later to have that fixed.
>> Can you try with a newer kernel???
>
> I hope to be giving opensuse 11.1 a try soon, which uses 2.6.27.X
> afaik.  I suspect I can also backport that patch to 2.6.25 easily.

The kernel source for 2.6.25.18-0.2 (from SUSE) already includes this
patch, so I was already running with the fix when the problem occurred.
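
As an aside, this is why a bare version comparison can mislead: a
distribution kernel may carry a backported fix while reporting an older
version number. A minimal sketch of such a check in Python follows; the
helper names are made up for illustration, and the 2.6.26 threshold is
simply the version Neil mentioned above.

```python
import platform
import re


def kernel_version_tuple(release):
    """Return the leading dotted numeric part of a release string as a tuple.

    Example: '2.6.25.18-0.2' -> (2, 6, 25, 18). Hypothetical helper.
    """
    match = re.match(r"(\d+(?:\.\d+)*)", release)
    if match is None:
        raise ValueError("unrecognised kernel release: %r" % (release,))
    return tuple(int(part) for part in match.group(1).split("."))


def at_least(release, minimum=(2, 6, 26)):
    """True if the release is numerically >= minimum (default 2.6.26).

    This only compares version numbers. It cannot tell whether a vendor
    kernel has backported a particular commit.
    """
    version = kernel_version_tuple(release)
    return version[:len(minimum)] >= minimum


if __name__ == "__main__":
    running = platform.release()
    print(running, "is 2.6.26 or later:", at_least(running))
```

For 2.6.25.18-0.2 this check reports an older kernel even though the
SUSE source tree contains the patch, so the authoritative test is still
to look for the change in the vendor's kernel source.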

Perhaps this weekend or some night this week I'll find time to try to
break things again.

-- 
Jon
