On 3/4/23 19:37, Genes Lists wrote:
On 3/4/23 13:21, Uwe Sauter wrote:
The usual Linux MD-RAID can place its metadata at different
positions in the partition (see mdadm(8), option "-e, --metadata").
Knowing this, it is no problem to create a partition of type EF00 on
each disk, create a RAID1 with metadata version 1.0 (at the end of the
partition) from those partitions and format that MD device with
VFAT. The mountpoint should be /boot/efi.
Thus EFI will see two VFAT partitions with the correct type, while
Linux will keep their content synchronized.
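
A rough sketch of the steps described above, not a tested recipe. The
device names /dev/sda1 and /dev/sdb1 and the array name /dev/md/esp are
made up for illustration, and both partitions are assumed to exist
already with type EF00. The script only prints the commands unless
DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: prints the commands unless DRY_RUN=0 is set.
# /dev/sda1, /dev/sdb1 and /dev/md/esp are hypothetical names.
run() { if [ "${DRY_RUN:-1}" = 1 ]; then printf '%s\n' "$*"; else "$@"; fi; }

esp_mirror() {
    # $1 and $2 are the two EF00 partitions, one per disk.
    # Metadata 1.0 sits at the end of each member, so the firmware
    # still finds a plain VFAT filesystem at the start of the partition.
    run mdadm --create /dev/md/esp --level=1 --raid-devices=2 \
        --metadata=1.0 "$1" "$2"
    run mkfs.vfat -F 32 /dev/md/esp
    run mount /dev/md/esp /boot/efi
}

esp_mirror /dev/sda1 /dev/sdb1
```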
There is at least one more thing to configure: /etc/mdadm.conf should
include a line for this MD device. Best would be to reference the MD
device by UUID.
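
To illustrate what such a line might look like: the helper below is
hypothetical and only formats an ARRAY line, and the UUID shown is a
placeholder, not a real value. On a live system the line would normally
be taken from the output of "mdadm --detail --scan" instead of being
typed by hand.

```shell
#!/bin/sh
# Hypothetical helper: format an mdadm.conf ARRAY line.
# $1 = MD device name, $2 = array UUID as reported by mdadm.
array_line() {
    printf 'ARRAY %s metadata=1.0 UUID=%s\n' "$1" "$2"
}

# Placeholder UUID; the real one comes from "mdadm --detail --scan".
array_line /dev/md/esp 00000000:00000000:00000000:00000000
```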
It might also be required to add options to the kernel cmdline to
assemble the device. But I might be confusing this with RHEL
(Dracut) based distributions. I think Arch's mkinitcpio will use
/etc/mdadm.conf when properly configured…
This could be a nicer way to go on new installs, but might be a bit
tedious to do on existing systems.
I have this setup on all servers that do not have battery-backed
hardware RAID cards and use mdadm there. I use systemd-boot as the
bootloader. It works well and can be done on an existing system with
just a single reboot. It is not easy: you have to create a degraded
RAID1 on the new drive, rsync all data, boot from USB, rsync the changes
that were made during the initial sync, boot from the degraded RAID and
convert the original drive into the second RAID1 member. You have to use
efibootmgr to manually set up both boot entries, "bootctl update" will
not work.
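
The mdadm and efibootmgr parts of this could look roughly like the
sketch below. All device names, partition numbers, the array name
/dev/md/root and the labels are invented for illustration, and the
loader path assumes systemd-boot on x86_64. The rsync and USB boot
steps are left out. It only prints the commands unless DRY_RUN=0 is
set.

```shell
#!/bin/sh
# Sketch only: prints the commands unless DRY_RUN=0 is set.
# Device names, partition numbers and /dev/md/root are hypothetical.
run() { if [ "${DRY_RUN:-1}" = 1 ]; then printf '%s\n' "$*"; else "$@"; fi; }

# First step: degraded RAID1 on the new drive only.
# The keyword "missing" reserves the slot for the second member.
create_degraded() {
    run mdadm --create "$1" --level=1 --raid-devices=2 missing "$2"
}

# Last step: the original drive's partition becomes the second member.
add_member() {
    run mdadm --manage "$1" --add "$2"
}

# One firmware boot entry per disk.
# $1 = disk, $2 = ESP partition number, $3 = label
add_boot_entry() {
    run efibootmgr --create --disk "$1" --part "$2" --label "$3" \
        --loader '\EFI\systemd\systemd-bootx64.efi'
}

create_degraded /dev/md/root /dev/sdb2
add_member /dev/md/root /dev/sda2
add_boot_entry /dev/sda 1 "Linux Boot Manager (sda)"
add_boot_entry /dev/sdb 1 "Linux Boot Manager (sdb)"
```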
In this setup there is a risk that the UEFI firmware will write
something to one of the partitions and the RAID1 will degrade, but on
all four of my machines I have never experienced anything like this.
Even if it happens, the system should boot without problems.
As for dual-root, I do not think it is safe to rsync a running system.
For example, postfix uses inode numbers for queue file names [1], so you
need to use postsuper [2] to fix the queue after the copy. All databases
will sooner or later break: they are well protected against sudden power
loss, but not against files being copied while the database process is
writing to them. Other software that uses multi-file databases (like
samba) will probably break too. It is just a matter of luck.
Your copy script is also missing flags for hard links, ACLs and
extended attributes; you should use -axHAX --delete to create a proper
mirror.
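
For illustration, a minimal wrapper around that rsync call. The
function name and the target /mnt/root2 are made up. It only prints the
command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: prints the command unless DRY_RUN=0 is set.
run() { if [ "${DRY_RUN:-1}" = 1 ]; then printf '%s\n' "$*"; else "$@"; fi; }

# $1 = source tree, $2 = destination tree (hypothetical paths)
mirror_tree() {
    # -a archive, -x stay on one filesystem, -H hard links,
    # -A ACLs, -X extended attributes.
    # The trailing slash on the source copies its contents,
    # not the directory itself.
    run rsync -axHAX --delete "${1%/}/" "${2%/}/"
}

mirror_tree / /mnt/root2
```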
It is much better to use LVM: create snapshots of all mounted
filesystems, mount them, copy from them and delete them right
afterwards. After booting from the second root you will then be in more
or less the same situation as after an unexpected power loss. More or
less, because it is still impossible to create all snapshots at exactly
the same point in time.
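
A sketch of that snapshot cycle for a single logical volume. The names
vg0, root and /mnt/root2 as well as the 5G snapshot size are
assumptions, not recommendations, and the plain read-only mount assumes
something like ext4 (XFS would additionally need the nouuid mount
option). It only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: prints the commands unless DRY_RUN=0 is set.
# Volume group, LV name, snapshot size and paths are hypothetical.
run() { if [ "${DRY_RUN:-1}" = 1 ]; then printf '%s\n' "$*"; else "$@"; fi; }

# $1 = volume group, $2 = logical volume, $3 = destination directory
snapshot_copy() {
    vg=$1
    lv=$2
    dest=$3
    snap="${lv}_snap"
    mnt="/mnt/${snap}"

    # Freeze the state of the LV, then copy from the frozen view.
    run lvcreate --snapshot --name "$snap" --size 5G "${vg}/${lv}"
    run mkdir -p "$mnt"
    run mount -o ro "/dev/${vg}/${snap}" "$mnt"
    run rsync -axHAX --delete "${mnt}/" "${dest%/}/"

    # Remove the snapshot right after the copy.
    run umount "$mnt"
    run lvremove --yes "${vg}/${snap}"
}

snapshot_copy vg0 root /mnt/root2
```

With several filesystems the lvcreate calls would be issued back to
back before any copying starts, to keep the snapshots as close together
in time as possible.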
Regards,
Łukasz
[1] https://marc.info/?l=postfix-users&m=105009113626092&w=2
[2] https://man.archlinux.org/man/postsuper.1.en