Re: mdadm: recovering from an aborted reshape op - boot messages

Hi Neil,

Comments interspersed..

--- On Tue, 15/2/11, NeilBrown <neilb@xxxxxxx> wrote:

> From: NeilBrown <neilb@xxxxxxx>
> Subject: Re: mdadm: recovering from an aborted reshape op - boot messages
> To: "Gavin Flower" <gavinflower@xxxxxxxxx>
> Cc: linux-raid@xxxxxxxxxxxxxxx
> Date: Tuesday, 15 February, 2011, 12:55
> On Mon, 14 Feb 2011 14:47:48 -0800
> (PST) Gavin Flower <gavinflower@xxxxxxxxx>
> wrote:
> 
> > Hi Neil,
> > 
> > I did not notice this before (note: I have poor
> eyesight, so unless I explicitly look, I may not notice
> things!). but just before Fedora drops to the shell on a
> reboot I saw these messages (hand transcribed, so might have
> the odd transcription error):
> > 
> > /dev/md1: The filing system size (according to the
> superblock) is 76799952 blocks
> > The physical size of the device is 76799616
> > Either the superblock or the partition table is likely
> to be corrupt!
> > 
> > /dev/md1: UNEXPECTED INCONSISTENCY: RUN fsck manually
> > (i.e. without -a or -p options)
> > 
> > Note that original size according mdadm was not a
> multiple of 512KB, so I reshaped it to be the largest
> multiple or 512KB less than the original size.  So my
> second attempt to reshape, using the 512 chunk size, started
> okay.
> > 
> > Advice appreciated.
> 
> Hmmm....
> 
> Firstly, the -A and -E output you sent are inconsistent.

I can not explain the inconsistency.

However, they were both done on the same machine ('saturn').

No software updates were done on 'saturn' since before the reshaping.

The -A output was the process that took over an hour.

> The "-A" output reports:
> 
> mdadm:/dev/md1 has an active reshape - checking if critical
> section needs to be restored
> 
> For 0.90 metadata (which you are using), that can only be
> reported if the
> minor number is at least 91.  i.e. it has been
> temporarily set to 0.91.
> 
> However the "-E" output show that all devices are
> "0.90.00", not 0.91.

I grepped strings /sbin/mdadm for '.9', and found both '0.90' and '0.91' - for what it is worth.

ls on /sbin/mdadm gives a size of 362296 bytes and a date of 5 Aug 2010.
The version is v3.1.2 - 10th March 2010.

> 
> So those devices cannot possibly produce that -A output.

The output was sent directly to the USB stick, so there are no transcription errors.  So as far as I can tell, these devices did produce the output. They are the only devices I have accessed using RAID for many months.  There are only the 5 hard disks on 'saturn'.

Is there anything I can do to track down this anomaly?
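
One check I could run myself is to scan the saved -E output for the metadata version on each device. A minimal sketch follows, assuming the output contains lines of the form "Version : 0.90.00"; the function names are my own, and the sample text in the usage note is made up for illustration, not real output from 'saturn':

```python
import re

def metadata_versions(examine_text):
    """Return the set of 'Version' values found in saved `mdadm -E` text."""
    # ASSUMPTION: each device section has a line like "Version : 0.90.00".
    pattern = re.compile(r"^\s*Version\s*:\s*(\S+)", re.MULTILINE)
    return set(pattern.findall(examine_text))

def reshape_flagged(versions):
    """True if any version has a minor number of at least 91 (e.g. 0.91.00)."""
    for version in versions:
        parts = version.split(".")
        if len(parts) >= 2 and parts[1].isdigit() and int(parts[1]) >= 91:
            return True
    return False
```

For example, calling metadata_versions() on the text of a saved -E file and passing the result to reshape_flagged() would show whether any device still carries the temporary 0.91 marking.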

> 
> The devices appear to have all completely transitioned to
> 512K chunksize....
> 
> And the -D output seems to show that the array is fine and
> working properly.
> 
> Secondly, as you say you reshaped the array to make it
> slightly smaller so it
> would be a multiple of 512K.  This is obviously needed
> to change the chunk
> size.

I used the --size= option of mdadm.
> 
> But before you did that - did you resize the filesystem to
> be only that big?

No, and there is no mention in man mdadm of doing so, that I could see.

> I suspect not.  So the filesystem thinks that it is
> bigger than the device.
> I don't know how best to fix that.

I would have thought mdadm would have done that as part of the process, as surely the size of the filesystem could not be reduced in advance of the reshaping.

Perhaps I have overlooked the obvious?
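
To put a number on the mismatch, here is a rough sketch of the arithmetic using the two block counts from the boot message. The 4 KiB filesystem block size is an assumption on my part (I have not confirmed it), and the helper names are only for illustration:

```python
FS_BLOCKS = 76799952    # filesystem size per the superblock (boot message)
DEV_BLOCKS = 76799616   # physical size of /dev/md1 (boot message)
BLOCK_KIB = 4           # ASSUMPTION: 4 KiB filesystem blocks, not confirmed
CHUNK_KIB = 512         # the new chunk size

def shortfall_kib(fs_blocks, dev_blocks, block_kib):
    """How far the filesystem extends past the end of the device, in KiB."""
    return (fs_blocks - dev_blocks) * block_kib

def round_down_to_chunk(size_kib, chunk_kib):
    """Largest multiple of chunk_kib that does not exceed size_kib."""
    return (size_kib // chunk_kib) * chunk_kib

print(FS_BLOCKS - DEV_BLOCKS)                            # 336 blocks
print(shortfall_kib(FS_BLOCKS, DEV_BLOCKS, BLOCK_KIB))   # 1344 KiB
print(DEV_BLOCKS * BLOCK_KIB % CHUNK_KIB == 0)           # True
```

If the block size assumption holds, the filesystem is 1344 KiB larger than the device, and the device size is an exact multiple of 512 KiB, which matches what the reshape was meant to achieve.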

> 
> You could try running 'resize2fs" now (was it ext3? I don't
> remember).  Or
> maybe an 'fsck -f' might fix it.
> 
> It might be safest to ask on ext3-users@xxxxxxxxxxx 
> Report that you shrunk
> your array before shrinking the filesystem and ask what the
> best remedial
> strategy is.
> 
> NeilBrown
> 
> 

I will look into your other suggestions about recovery.

If there is anything further I can do to provide useful diagnostics, please let me know.


Thanks,
Gavin


      