Fwd: [Linux-cluster] inconsistent volume group after pvmove

Spotted this message on linux-cluster...

It seems to me that the LVM label on /dev/sdh still needs to be wiped (pvremove /dev/sdh)...  [Although, I'm not sure the PV has been removed from the VG yet... unless by chance it was failing when they did the 'vgreduce'...  and that wouldn't explain why there were problems /before/ the vgreduce.]  I'm not sure how it got to the point of having inconsistent metadata after running the pvmove.  Also note that this is done using CLVM.
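For reference, a minimal sketch of what wiping the label would look like, assuming the PV really carries no extents any more and has actually been dropped from the VG (device and VG names taken from the pvscan output quoted below):

  pvs -o pv_name,vg_name,pv_pe_alloc_count /dev/sdh   # should report 0 allocated extents
  vgreduce fit_vg /dev/sdh                            # drop the (empty) PV from the VG, if not already done
  pvremove /dev/sdh                                   # then wipe the LVM label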

Anyone have ideas?

 brassow

Begin forwarded message:

From: "Andreas Schneider" <andreas.schneider@f-it.biz>
Date: July 1, 2008 3:02:18 AM CDT
Subject: [Linux-cluster] inconsistent volume group after pvmove
Reply-To: linux clustering <linux-cluster@redhat.com>

Hello,
This is our setup: we have 3 Linux servers (CentOS 5, kernel 2.6.18), clustered, with clvmd running one "big" volume group (15 SCSI disks of 69.9 GB each).
After we got a hardware I/O error on one disk, our GFS filesystem began to loop.
So we stopped all services, identified the corrupted disk (/dev/sdh), and my intention was to do the following (the full sequence is sketched after this list):
- pvmove /dev/sdh
- vgreduce my_volumegroup /dev/sdh
- do an intensive hardware check on the volume
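The usual form of that sequence, with a verification step between each command, would look roughly like this (only a sketch; my_volumegroup is the placeholder name from the list above):

  pvmove -v /dev/sdh                             # migrate all extents off the failing disk
  pvs -o pv_name,pv_pe_alloc_count /dev/sdh      # confirm 0 extents are left on it
  vgreduce my_volumegroup /dev/sdh               # remove the now-empty PV from the VG
  pvremove /dev/sdh                              # clear the LVM label before the hardware check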
 
 
But: that's what happened during pvmove -v /dev/sdh:
  [...]
  /dev/sdh: Moved: 78,6%
  /dev/sdh: Moved: 79,1%
  /dev/sdh: Moved: 79,7%
  /dev/sdh: Moved: 80,0%
    Updating volume group metadata
    Creating volume group backup "/etc/lvm/backup/myvol_vg" (seqno 46).
  Error locking on node server1: device-mapper: reload ioctl failed: Das Argument ist ungültig [Invalid argument]
  Unable to reactivate logical volume "pvmove0"
  ABORTING: Segment progression failed.
    Removing temporary pvmove LV
    Writing out final volume group after pvmove
    Creating volume group backup "/etc/lvm/backup/myvol_vg" (seqno 48).
[root@hpserver1 ~]# pvscan
  PV /dev/cciss/c0d0p2   VG VolGroup00   lvm2 [33,81 GB / 0    free]
  PV /dev/sda            VG fit_vg       lvm2 [68,36 GB / 0    free]
  PV /dev/sdb            VG fit_vg       lvm2 [68,36 GB / 0    free]
  PV /dev/sdc            VG fit_vg       lvm2 [68,36 GB / 0    free]
  PV /dev/sdd            VG fit_vg       lvm2 [68,36 GB / 0    free]
  PV /dev/sde            VG fit_vg       lvm2 [66,75 GB / 46,75 GB free]
  PV /dev/sdf            VG fit_vg       lvm2 [68,36 GB / 0    free]
  PV /dev/sdg            VG fit_vg       lvm2 [68,36 GB / 0    free]
  PV /dev/sdh            VG fit_vg       lvm2 [68,36 GB / 58,36 GB free]
  PV /dev/sdj            VG fit_vg       lvm2 [68,36 GB / 54,99 GB free]
  PV /dev/sdi            VG fit_vg       lvm2 [68,36 GB / 15,09 GB free]
  PV /dev/sdk1           VG fit_vg       lvm2 [68,36 GB / 55,09 GB free]
  Total: 12 [784,20 GB] / in use: 12 [784,20 GB] / in no VG: 0 [0   ]
 
That sounded bad and I didn't have any idea what to do, but I read that pvmove can resume from the point where it stopped, so I started pvmove again and this time it moved all the data.
pvscan and vgscan -vvv showed me that all data had been moved from /dev/sdh to the other volumes.
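(For the record, a sketch of the two ways to deal with an interrupted move, assuming the VG is otherwise healthy: pvmove with no arguments resumes any unfinished moves, while --abort rolls a move back.)

  pvmove                                         # resume all interrupted pvmove operations
  pvmove --abort                                 # ...or instead abort and drop the temporary pvmove LV
  pvs -o pv_name,pv_pe_alloc_count /dev/sdh      # afterwards, verify nothing is left on the disk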
 
To be sure, I restarted my cluster nodes, but they ran into problems mounting the GFS filesystems.
I got this error:
 
[root@server1 ~]# /etc/init.d/clvmd stop
Deactivating VG myvol_vg:   Volume group "myvol_vg" inconsistent
  WARNING: Inconsistent metadata found for VG myvol_vg - updating to use version 148
  0 logical volume(s) in volume group "myvol_vg" now active
                                                           [  OK  ]
Stopping clvm:                                             [  OK  ]
[root@server1 ~]# /etc/init.d/clvmd start
Starting clvmd:                                            [  OK  ]
Activating VGs:   2 logical volume(s) in volume group "VolGroup00" now active
  Volume group "myvol_vg" inconsistent
  WARNING: Inconsistent metadata found for VG myvol_vg - updating to use version 151
  Error locking on node server1: Volume group for uuid not found: tGRfaK5aW00pFRXcLtrdHAw5a4GNDVBtuFZZe8QKoX8sVA0XRTNoDQVWVftk8cSa
  Error locking on node server1: Volume group for uuid not found: tGRfaK5aW00pFRXcLtrdHAw5a4GNDVBtqDfFtrJTFTGuju8nNjwtCdPGnzP3hh8k
  Error locking on node server1: Volume group for uuid not found: tGRfaK5aW00pFRXcLtrdHAw5a4GNDVBtc22hBY40phdVvVdFBFX28PvfF7JrlIYz
  Error locking on node server1: Volume group for uuid not found: tGRfaK5aW00pFRXcLtrdHAw5a4GNDVBtWfJ1EqXJ309gO3Gx0ZvpNekrmHFo9u2V
  Error locking on node server1: Volume group for uuid not found: tGRfaK5aW00pFRXcLtrdHAw5a4GNDVBtCP6czghnQFEjNdv9DF6bsUmnK3eJ5vKp
  Error locking on node server1: Volume group for uuid not found: tGRfaK5aW00pFRXcLtrdHAw5a4GNDVBt0KNlnblpwOfcnqIjk4GJ662dxOsL70GF
  0 logical volume(s) in volume group "myvol_vg" now active
                                                           [  OK  ]
 
Looking at them, these 6 UUIDs correspond exactly to the LVs that should be found and where all our data is stored.
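(A hedged sketch of one thing worth checking when activation fails with "Volume group for uuid not found" after a broken pvmove: stale device-mapper tables left behind on a node. The map name below is purely illustrative, following the usual vgname-lvname convention and the temporary pvmove0 LV mentioned in the log above.)

  dmsetup ls                        # list the device-mapper maps present on this node
  dmsetup info -c                   # compare their UUIDs and open counts against lvs -a output
  dmsetup remove myvol_vg-pvmove0   # illustrative name: remove a leftover map for the temporary pvmove LV
  vgchange -a y myvol_vg            # then retry activating the volume group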
 
What followed was, at the beginning, a step-by-step approach and, in the end, sheer trial and error.
This was one of the first actions:
 
[root@hpserver1 ~]# vgreduce --removemissing myvol_vg
    Logging initialised at Tue Jul  1 10:00:52 2008
    Set umask to 0077
    Finding volume group "myvol_vg"
    Wiping cache of LVM-capable devices
  WARNING: Inconsistent metadata found for VG myvol_vg - updating to use version 229
  Volume group "myvol_vg" is already consistent
 
We tried to deactivate the volume group via vgchange -a n myvol_vg, we tried --removemissing, and sadly, after a few combined attempts (dmsetup info -c, dmsetup mknodes and vgchange -a y myvol_vg), we can access our LVs again, but we still get this message and we don't know why:
 
  Volume group "myvol_vg" inconsistent
  WARNING: Inconsistent metadata found for VG myvol_vg - updating to use version 228
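(One recovery path that is often suggested when the metadata keeps flipping between versions, given only as a sketch and only to be attempted with all LVs in the VG deactivated on every node, is restoring the last known-good metadata from the backup file that pvmove itself wrote:)

  vgchange -a n myvol_vg                               # deactivate the VG on every node first
  vgcfgrestore -f /etc/lvm/backup/myvol_vg myvol_vg    # restore the backed-up metadata (e.g. the seqno 48 copy)
  vgchange -a y myvol_vg                               # reactivate, then re-check with vgs and lvs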
 
I’m a little bit worried about our data,
 
Regards
Andreas
 
--
Linux-cluster mailing list
Linux-cluster@redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster

