Open Shared Root GlusterFS Patches and HowTo

Here's a preliminary version based on the OSR RHEL5 GFS mini-howto from here: http://www.open-sharedroot.org/documentation/rhel5-gfs-shared-root-mini-howto

Modified/added files are attached.

Prerequisites
-------------
Freshly installed RHEL5.

Install RHCS components:
# yum install cman

Install Com.oonics Packages
---------------------------

# yum install comoonics-bootimage \
		comoonics-cdsl-py \
		comoonics-bootimage-extras-glusterfs

[Note: I'm assuming the package for this will be called extras-glusterfs since that was what the DRBD one I submitted ended up being called.]

Install GlusterFS patched fuse packages:
# wget http://ftp.gluster.com/pub/gluster/glusterfs/fuse/fuse-2.7.3-2.src.rpm

# rpmbuild --rebuild fuse-2.7.3-2.src.rpm
# rpm -Uvh /usr/src/redhat/RPMS/x86_64/fuse-2.7.3-2.x86_64.rpm \
	/usr/src/redhat/RPMS/x86_64/fuse-libs-2.7.3-2.x86_64.rpm \
	/usr/src/redhat/RPMS/x86_64/fuse-kernel-module-2.6.18-92.1.22.el5-2.7.3-2.x86_64.rpm

# wget http://ftp.gluster.com/pub/gluster/glusterfs/2.0/2.0.0/glusterfs-2.0.0rc1.tar.gz

# rpmbuild -tb glusterfs-2.0.0rc1.tar.gz
# rpm -Uvh /usr/src/redhat/RPMS/x86_64/glusterfs-2.0.0rc1-1.x86_64.rpm

[Note: Versions may change; those listed were current at the time of writing. The paths above also assume the x86-64 architecture.]

Create a cluster configuration file /etc/cluster/cluster.conf with the com_info tags. This time fencing isn't mandatory if you aren't using resource failover, since unlike GFS, GlusterFS won't block if a peer goes away. For GlusterFS split-brain caveats and handling, see http://www.gluster.org/

[cluster.conf attached]

A quick note about the layout of the underlying disk. This howto assumes the following:
/dev/sda1 /boot
/dev/sda2 /
/dev/sda3 swap
/dev/sda4 comoonics-chroot (so we can deallocate the initrd)

On /, we are assumed to have a directory /gluster/root which contains the gluster rootfs.
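Preparing that directory can be sketched as below; `make_backing_dir` is a hypothetical helper name, not something from the howto, and on the real system you would call it against /:

```shell
# Sketch (helper name is ours): create the directory that the
# storage/posix volume in root.vol will export ("option directory").
make_backing_dir() {
	mkdir -p "$1/gluster/root"
}

# On the real system:
# make_backing_dir /
```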

Create the GlusterFS root volume specification in /etc/glusterfs/root.vol. Here is an example of a simple volume spec file for a 2-server AFR (mirroring) configuration where each server has a local copy of the data. Note that you could also do this diskless, with the rootfs being on remote server(s) much like NFS, and even distributed or striped across multiple servers. See GlusterFS documentation for details.

[root.vol attached]

Mount the glusterfs file system:
# mkdir /mnt/newroot
# mount -t glusterfs /etc/glusterfs/root.vol /mnt/newroot
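A quick sanity check that the mount worked can be sketched like this; `mount_is_glusterfs` is a hypothetical helper that reads /proc/mounts by default:

```shell
# Return success if mountpoint $1 is mounted with fstype glusterfs.
# $2 optionally names an mtab-format file (default /proc/mounts).
mount_is_glusterfs() {
	awk -v mp="$1" '$2 == mp && $3 == "glusterfs" { found = 1 }
	                END { exit !found }' "${2:-/proc/mounts}"
}

# mount_is_glusterfs /mnt/newroot && echo "glusterfs root mounted"
```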

Copy all data from the local installed RHEL5 root filesystem to the shared root filesystem:

[ ... the rest of the section is identical to the GFS howto ... ]

Make sure you apply the patches from /opt/atix/comoonics-bootimage/patches to the init scripts in the new root, especially the network and halt init scripts!

Note that the RPM library in its default configuration WILL NOT work under GlusterFS. GlusterFS is FUSE-based and thus doesn't support writable mmap(), which BerkeleyDB (the default RPM database format) requires to function. To work around this, we can convert the RPM database to use SQLite. The functionality is already built into the RHEL5 RPM packages; we just need to do the following:


# rpm -v --rebuilddbapi 4 --dbpath /var/lib/rpm --rebuilddb

and then change the following lines in /usr/lib/rpm/macros:
%_dbapi 3
%_dbapi_rebuild 3

to

%_dbapi 4
%_dbapi_rebuild 4

[Note: This should _probably_ be another patch in /opt/atix/comoonics-bootimage/patches, trivial as it may be.]
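The macro change itself can be scripted; `switch_dbapi` is a hypothetical helper that edits a given macros file in place (on a real system the target is /usr/lib/rpm/macros):

```shell
# Flip the RPM db backend macros from 3 (BerkeleyDB) to 4 (SQLite)
# in the given macros file. Helper name is ours, not from the howto.
switch_dbapi() {
	sed -i -e 's/^%_dbapi\([[:space:]]\{1,\}\)3[[:space:]]*$/%_dbapi\14/' \
	       -e 's/^%_dbapi_rebuild\([[:space:]]\{1,\}\)3[[:space:]]*$/%_dbapi_rebuild\14/' "$1"
}

# switch_dbapi /usr/lib/rpm/macros
```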

[Note: Updated network.patch attached, the current one in the repo didn't seem to apply cleanly, and I added the exclusion of network disconnection when GlusterFS is used.]

[Note: You cannot use a GlusterFS based shared boot per se, but you COULD use GlusterFS to keep /boot in sync and boot off its backing storage device. No new devices need be created, only an additional volume spec using the /boot volume as the backing store for GlusterFS. All operations on top of GlusterFS would then cause the /boot device to get mirrored across the machines. This is only meaningful with AFR/mirroring. Also note that grub is virtually guaranteed to get horribly confused when asked to make a GlusterFS based file system bootable. In conclusion - don't do this unless you understand what I'm talking about here and know what you're doing.]
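To illustrate the idea, a server-side spec for such a /boot volume might look like the fragment below. This is purely a sketch following the same storage/posix pattern as the attached root.vol; the volume names are invented here, and it omits the client/AFR side entirely:

```
volume boot-store
	type storage/posix
	option directory /boot
end-volume

volume boot
	type features/posix-locks
	subvolumes boot-store
end-volume
```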

Create the shared root initrd as per usual:

# /opt/atix/comoonics-bootimage/mkinitrd -f /mnt/newroot/boot/initrd_sr-$(uname -r).img $(uname -r)


Final note: You can side-step copying the root FS by operating directly on the master copy. This means you won't then have to manually delete the initial installation (except for the /gluster directory), but it also means that any mistake along the way will render the system unusable and you'll have to re-install from scratch and try again. You would then, of course, need to change the path in root.vol from /mnt/tmproot/gluster/root to /mnt/tmproot.
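That path change can be done with a one-liner; `repoint_backing` is a hypothetical helper for rewriting the storage/posix "option directory" line in a volume spec:

```shell
# Repoint the "option directory" line in volume spec file $1 to the
# new backing path $2. Helper name is ours, not from the howto.
repoint_backing() {
	sed -i "s|^\([[:space:]]*option directory[[:space:]]\{1,\}\).*|\1$2|" "$1"
}

# repoint_backing /etc/glusterfs/root.vol /mnt/tmproot
```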

Awaiting peer review. :)

Gordan
--- network.orig	2009-01-15 22:29:54.000000000 +0000
+++ network		2009-01-15 22:33:46.000000000 +0000
@@ -10,6 +10,8 @@
 # Provides: $network
 ### END INIT INFO
 
+# Patched for comoonics patch 1.3
+
 # Source function library.
 . /etc/init.d/functions
 
@@ -171,10 +173,10 @@
   stop)
   	# Don't shut the network down if root is on NFS or a network
 	# block device.
-        rootfs=$(awk '{ if ($1 !~ /^[ \t]*#/ && $2 == "/") { print $3; }}' /etc/mtab)
-        rootopts=$(awk '{ if ($1 !~ /^[ \t]*#/ && $2 == "/") { print $4; }}' /etc/mtab)
-	
-	if [[ "$rootfs" =~ "^nfs" ]] || [[ "$rootopts" =~ "_netdev|_rnetdev" ]] ; then
+	  rootfs=$(awk '{ if ($1 !~ /^rootfs/ && $1 !~ /^[ \t]*#/ && $2 == "/") { print $3; }}' /etc/mtab)
+	  rootopts=$(awk '{ if ($1 !~ /^rootfs/ && $1 !~ /^[ \t]*#/ && $2 == "/") { print $4; }}' /etc/mtab)
+
+	if [[ "$rootfs" =~ "^nfs|^gfs|^gluster" ]] || [[ "$rootopts" =~ "_netdev" ]] ; then
 		exit 1
 	fi
   
<?xml version="1.0"?>
<cluster config_version="2" name="groot">
	<cman two_node="1" expected_votes="1"/>
	<fence_daemon post_fail_delay="0" post_join_delay="3"/>
	<clusternodes>
		<clusternode name="groot1" nodeid="1" votes="1">
			<com_info>
				<rootsource name="/dev/sda2"/>
				<chrootenv	mountpoint	= "/var/comoonics/chroot"
						fstype		= "ext3"
						device		= "/dev/sda4"
						chrootdir	= "/var/comoonics/chroot"
				/>
				<syslog name="skynet"/>
				<rootvolume	name		= "/etc/glusterfs/root.vol"
						mountopts	= "defaults,noatime,nodiratime"
						fstype		= "glusterfs"
				/>
				<eth	name	= "eth0"
					ip	= "192.168.10.1"
					mac	= "00:0C:29:A9:7C:9E"
					mask	= "255.255.0.0"
					gateway	= "192.168.255.254"
				/>
			</com_info>
		</clusternode>
		<clusternode name="groot2" nodeid="2" votes="1">
			<com_info>
				<rootsource name="/dev/sda2"/>
				<chrootenv	mountpoint	= "/var/comoonics/chroot"
						fstype		= "ext3"
						device		= "/dev/sda4"
						chrootdir	= "/var/comoonics/chroot"
				/>
				<syslog name="skynet"/>
				<rootvolume	name		= "/etc/glusterfs/root.vol"
						mountopts	= "defaults,noatime,nodiratime"
						fstype		= "glusterfs"
				/>
				<eth	name	= "eth0"
					ip	= "192.168.10.2"
					mac	= "00:0C:29:A9:7C:9F"
					mask	= "255.255.0.0"
					gateway	= "192.168.255.254"
				/>
			</com_info>
		</clusternode>
	</clusternodes>
	<cman/>
	<rm/>
</cluster>
volume root2
	type protocol/client
	option transport-type socket
	option address-family inet
	option remote-host 192.168.10.2
	option remote-subvolume root2
end-volume

volume root-store
	type storage/posix
	option directory /mnt/tmproot/gluster/root
end-volume

volume root1
	type features/posix-locks
	subvolumes root-store
end-volume

volume server
	type protocol/server
	option transport-type socket
	option address-family inet
	subvolumes root1
	option auth.addr.root1.allow 127.0.0.1,192.168.*
end-volume

volume root
	type cluster/afr
	subvolumes root1 root2
	option read-subvolume root1
end-volume

Attachment: glusterfs-lib.sh
Description: Bourne shell script

/etc/glusterfs/root.vol
/tmp
fuse
glusterfs
libibverbs
vim-common
vim-minimal
