On Wed, Sep 1, 2021 at 1:32 PM David Lehman <dlehman@xxxxxxxxxx> wrote:
On Tue, Aug 31, 2021 at 4:37 PM Markus Falb <wnefal@xxxxxxxxx> wrote:
> On 30.08.2021, at 23:26, Brian C. Lane <bcl@xxxxxxxxxx> wrote:
>
> On Thu, Aug 26, 2021 at 06:11:49PM +0200, Markus Falb wrote:
>
>> The solution is to activate the LVs in %pre
>> It turns out that /dev/sda is present, but the device files for /dev/sdaX are not.
>>
>> …snip
>> %pre
>> mknod /dev/sda2 b 8 2
>> pvscan
>> vgchange -ay
>> %end
>> snap…
>>
>> Alternatively, this one-liner works too, interestingly:
>>
>> …snip
>> %pre
>> parted /dev/sda unit MiB print
>> %end
>> snip…
>>
>> Note that with the parted command it is not necessary to run vgchange afterwards.
>>
>> Is there a builtin kickstart command that accomplishes the same instead of some %pre?
>> If not, why is %pre necessary? %pre was not necessary with RHEL7. Is this by design or is it a bug?
>
> This is a bug of some sort, as David said. The fact that parted print
> fixed it makes me think that your storage is slow, since all parted does
> is open/close the device and tell the kernel to rescan the partitions --
> which should have already happened at boot time when the device
> appeared.
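For illustration, a minimal sketch of the rescan described above, done by hand; this assumes the disk is /dev/sda and that blockdev from util-linux is available in the installer environment:

…snip
%pre
# Re-read /dev/sda's partition table (the same side effect parted's
# open/close has), then wait for udev to finish creating the nodes.
blockdev --rereadpt /dev/sda
udevadm settle
%end
snip…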
I am testing with a KVM VM created with Virtual Machine Manager on CentOS 7.
The VM has a SCSI disk (changing to IDE or SATA does not change the behaviour).
I remember trying “udevadm settle” in %pre; it returned quickly, which is why
I thought it was not waiting for some slow udev event.
I had another look.
I added a sleep 600 and removed the parted call from %pre (600s should be
plenty of time for detection).
Here is my interpretation:
The kernel *did* detect the partitions in early initramfs
…
Aug 31 14:19:50 localhost kernel: sda: sda1 sda2
Aug 31 14:19:50 localhost kernel: sd 0:0:0:0: [sda] Attached SCSI disk
…
If I add the rd.break kernel parameter I can see that the devices are there.
But after switching root (pivoting) they are gone. I do not know if this
is expected or not.

The LVs being gone is expected, but the partitions being gone is not.

So while %pre is running, sda1 and sda2 are not present, given that I did
not trigger udev with parted or similar.
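For reference, a minimal sketch of triggering udev from %pre without parted, assuming the disk layout above and that the usual udev and LVM tools are present in the installer environment:

…snip
%pre
# Replay the kernel's "add" uevents for block devices so udev
# (re)creates /dev/sda1 and /dev/sda2, then wait until it is done.
udevadm trigger --subsystem-match=block --action=add
udevadm settle
# Activate any VGs found on the now-visible partitions.
pvscan
vgchange -ay
%end
snip…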
After %pre is finished it detects sda1 and sda2 again, and it finds the VG
and the LVs, but then it stops the VG (which is what I find strange) and
throws the error.

Stopping the VG is a normal part of the process. The installer makes a
model of the storage and then deactivates it until it is needed. The
partitions should still be visible, however.
…
initramfs
…
Aug 31 14:19:50 localhost kernel: sda: sda1 sda2
Aug 31 14:19:50 localhost kernel: sd 0:0:0:0: [sda] Attached SCSI disk
…
pivot root filesystem
…
running pre (10 minutes sleep)
…
Aug 31 14:30:14 test kernel: sda: sda1 sda2
…
Aug 31 14:30:15 test org.fedoraproject.Anaconda.Modules.Storage[1903]: DEBUG:blivet: DeviceTree.handle_device: name: lvm.vg1-root ; info: {'DEVLINKS': '/dev/disk/by-uuid/9c60e33e-03e0-42c4-a583-868f4fd1b2b4 '
…
Aug 31 14:30:15 test org.fedoraproject.Anaconda.Modules.Storage[1903]: INFO:program:Running [36] lvm vgchange -an lvm.vg1 --config= devices { preferred_names=["^/dev/mapper/", "^/dev/md/", "^/dev/sd"] } log {level=7 file=/tmp/lvm.log syslog=0} …
…
Aug 31 14:30:21 test org.fedoraproject.Anaconda.Modules.Storage[1903]: Logical volume "root" given in logvol command does not exist.
…
If someone is interested, I created a gist with the kickstarts and logs at
https://gist.github.com/610acf7379f48d0e5c38f4edb9cda176
(you can clone it with git)
I found no obvious error, but there is a lot of stuff and I could easily
have missed something.

I see in storage.log that it successfully looks up (get_device_by_name)
the lvm.vg1-root LV in its model shortly before the error occurs, which is
strange. I also do not see any failing calls to get_device_by_name for
the root LV once the initial scan has completed.
Given that anaconda sees the LVs, do you still think that it is a kernel
problem or that the storage is too slow?
Best Regards, Markus
and thanks to all who took the time to answer.
I know this is very old, but I am just getting back to reading list emails, and compared to other lists, this one does not have much traffic - which is good, because kickstart is doing what it is supposed to.
To the OP: I had this same issue when using %pre not only to create arrays on storage controllers, but then to try to read them. The only solution I found that always worked was "partprobe" sprinkled through my scripts. The good thing about partprobe is that it blocks and does not return until the kernel says it is OK. In my case, creating the arrays manually through the BMC, etc. would also fix it, but who wants to click all of those buttons in the slow BMC GUIs.
Anyhow, putting a partprobe before every new disk access fixed it for me, and all device nodes were properly populated by udev, etc. afterward.
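A minimal sketch of that pattern, assuming the disk is /dev/sda and that partprobe and udevadm are available in the installer environment:

…snip
%pre
# ... create arrays / partitions here ...
# Force the kernel to re-read the partition table; partprobe blocks
# until the kernel has accepted the change.
partprobe /dev/sda
# Wait for udev to finish creating the /dev/sdaX device nodes.
udevadm settle
# ... now it is safe to read the new devices ...
%end
snip…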
-Greg