Re: RAID5 resync question BUGREPORT!

Hi,

As soon as I saw this on one of my disk nodes, I sent that mail and went
straight to the hosting company to check whether there was any message on
the screen.
Unfortunately I found nothing: a plain freeze.
No message, no ping, no Num Lock!

The full log of the node's next reboot is here:
http://download.netcenter.hu/bughunt/20051209/boot.log

As the next step, I tried to restart the whole system. (The concentrator
hung too, because it lost the st-0001 node.)
Part of the concentrator's next boot log is here:
http://download.netcenter.hu/bughunt/20051209/dy-boot.log

Next, I stopped everything to avoid further data loss, and tried to remove
the possibly bad bitmap from md0 on node 1 (st-0001).

The messages are here:
http://download.netcenter.hu/bughunt/20051209/mdadm.log

At this point I cannot remove the broken bitmap; I can only deactivate its
use.
But on the next reboot the node will try to use it again. :(

I tried to switch the array to an external bitmap, but mdadm failed to
create that too.
The external bitmap file is here (6 MB!):
http://download.netcenter.hu/bughunt/20051209/md0.bitmap

The error message is the same as for the internal bitmap creation.

I don't know exactly what caused the filesystem damage, but here is my list
of possible causes (sorted):
1. mdadm (wrong bitmap size)
2. the kernel (wrong resync on startup)
3. half-written data, caused by the first crash
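
As a rough sanity check on item 1, the page counts in the kernel log quoted
below can be recomputed from the number of bits. This is only a sketch under
assumptions that are not stated in the logs: a 4096-byte page, one 16-bit
in-memory counter per bitmap chunk, a 256-byte on-disk bitmap superblock,
and that "set 381560 bits" equals the total chunk count (every chunk marked
dirty for the forced full recovery). The function names are made up for
illustration.

```python
PAGE_SIZE = 4096        # bytes; assumed (typical x86 page size)
COUNTER_BYTES = 2       # assumed: one 16-bit counter per chunk in memory
SUPERBLOCK_BYTES = 256  # assumed size of the on-disk bitmap superblock


def ceil_div(a, b):
    """Integer division rounded up."""
    return -(-a // b)


def in_memory_pages(chunks):
    """Pages needed for the in-memory counters, one per chunk."""
    return ceil_div(chunks * COUNTER_BYTES, PAGE_SIZE)


def on_disk_pages(chunks):
    """Pages needed for the superblock plus one bit per chunk."""
    return ceil_div(SUPERBLOCK_BYTES + ceil_div(chunks, 8), PAGE_SIZE)


chunks = 381560  # from "set 381560 bits" in the kernel log
print(in_memory_pages(chunks))  # 187, matches "created bitmap (187 pages)"
print(on_disk_pages(chunks))    # 12, matches "read 12/12 pages"
```

Under these assumptions the logged numbers are self-consistent. That does
not rule out a wrong bitmap size relative to the array, it only shows the
kernel's own figures agree with each other.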

One question:
Is creating a bitmap on a running array safe and race-free?
(I mean a race between bitmap creation and bitmap updates.)

In the end my data loss was really minimal. :-)

Cheers,
Janos


----- Original Message ----- 
From: "Neil Brown" <neilb@xxxxxxx>
To: "JaniD++" <djani22@xxxxxxxxxxxxx>
Cc: <linux-raid@xxxxxxxxxxxxxxx>
Sent: Friday, December 09, 2005 12:43 AM
Subject: Re: RAID5 resync question BUGREPORT!


> On Friday December 9, djani22@xxxxxxxxxxxxx wrote:
> > Hello, Neil,
> >
> > [root@st-0001 mdadm-2.2]# mdadm --grow /dev/md0 --bitmap=internal
> > mdadm: Warning - bitmaps created on this kernel are not portable
> >   between different architectured.  Consider upgrading the Linux kernel.
> >
> > Dec  8 23:59:45 st-0001 kernel: md0: bitmap file is out of date (0 <
> > 81015178) -- forcing full recovery
> > Dec  8 23:59:45 st-0001 kernel: md0: bitmap file is out of date, doing
> > full recovery
> > Dec  8 23:59:46 st-0001 kernel: md0: bitmap initialized from disk: read
> > 12/12 pages, set 381560 bits, status: 0
> > Dec  8 23:59:46 st-0001 kernel: created bitmap (187 pages) for device
> > md0
> >
> > And the system is crashed.
> > no ping reply, no netconsole error logging, no panic and reboot.
>
> Hmmm, that's unfortunate :-(
>
> Exactly what kernel were you running?
>
> NeilBrown
