Revised: [PATCH] turboLiveInst - improves livecd/usb installer speed by 15-20+%


 



Attached is a revised version of my turboLiveInst patch to livecd-tools and anaconda.

This version is more polished: bugs have been fixed and complexity has been removed, so it should be easier to review.

I ran some informal performance tests on a Sony Vaio VGN-N250E. All tests used a 30G destination volume, and the usbstick media used in the usb tests reported 8.5MB/s from hdparm -t. SELinux was disabled, and I did not use the prelink option. I'd love to hear performance numbers from other test rigs.

The performance results I got were-

install from cdrom without turboLiveInst:

copy: 250s
postinst: 86s

install from cdrom with turboLiveInst:

copy: 208s
postinst: 84s

install from usb without turboLiveInst:

copy: 226s
postinst: 72s

install from usb with turboLiveInst:

copy: 175s
postinst: 72s

Conclusions:

On this testrig, installing from cdrom, turboLiveInst yielded a 20% speedup in copy, which resulted in an end to end install speedup of 15%. Installing from usb, turboLiveInst yielded a 29% speedup in copy, which resulted in an end to end install speedup of 20%.
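The usb percentages above are consistent with computing speedup as old time divided by new time, minus one. A minimal sketch of that arithmetic, using only the usb timings listed above (the function name speedup_percent is made up for illustration and is not part of the patch):

```python
def speedup_percent(before_s, after_s):
    """Speedup as a percentage: how much faster 'after' is than 'before'."""
    return (float(before_s) / after_s - 1.0) * 100.0

# usb timings from the results above, in seconds
usb_copy_before, usb_copy_after = 226, 175
usb_postinst = 72  # the same with and without turboLiveInst

copy_gain = speedup_percent(usb_copy_before, usb_copy_after)
total_gain = speedup_percent(usb_copy_before + usb_postinst,
                             usb_copy_after + usb_postinst)

print("usb copy speedup:       %.1f%%" % copy_gain)
print("usb end-to-end speedup: %.1f%%" % total_gain)
```

Because postinst time is unchanged, the end to end gain is always smaller than the copy gain.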

I did test copy-to-ram mode, and the resulting benefits were laughably huge. But that is only because this laptop has 1G of ram, and very strange behaviour occurs this close to the threshold of having too little ram to use the feature. It is still an argument in favor of turboLiveInst, in that it somehow found itself on the better side of the threshold. I would expect that with 2G of ram the benefit would be on the order of a 35-50% speedup, since the main thing masking the benefit in the cdrom case is the slow access to the install media, which hides the gain from cutting the needed disk writes nearly in half.

The secondary benefit of turboLiveInst is that it removes the artificial limitation that the target rootfs must be greater than 4.0G. The target only needs to hold the 2.1G actual uncompressed size of the contents of the LiveCD.

Jeremy has pushed back against this patch because of complexity. Hopefully this round of polishing will make the patch much easier to read and understand. In addition, since the cleanupDeleted patch which this one depends on has already been merged, that should also make this a bit more palatable.

Jeremy also brought up the idea of doing a file level copy installation, rather than the current block level mechanism. This _is_ a good idea, in my opinion, in that it will more intelligently support situations such as a separate /usr filesystem. In addition there is no way that turboLiveInst or the existing block level mechanism can be made to support xfs or other non-ext3 destination filesystems chosen by the user (should those options return).

But- I would argue that file level installations may suffer badly due to cdrom seeking. I would also argue that there is no reason why turboLiveInst could not be the first choice of installation technique, with fallbacks to file level copies for the separate /usr and xfs type scenarios.

Ultimately I just hope that turboLiveInst gets serious consideration for F8, via performance comparison with whatever other options may exist.

Finally, here are some notes on the architecture, which may help you to understand the code-

As I mentioned in the original patch, the basic idea is to simply, at livecd-creator build time, use a device mapper snapshot to generate a delta file between the 4.0G filesystem housing 2.1G of data, and a 2.1G filesystem holding the same data. Then at anaconda/liveinst time, that delta file is used to recreate a virtual image of the 2.1G filesystem, so that it can be copied to the installation target, rather than the 4.0G filesystem which includes 1.9G of zeros that needn't be written to disk.
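To make the snapshot step concrete, here is a small sketch of how the device mapper table line used in the patch is put together. The helper name snapshot_table and the /dev/loop0 and /dev/loop1 device names are placeholders for illustration; the "p 8" suffix (persistent snapshot, 8 sector chunks) is taken from the patch itself.

```python
SECTOR_BYTES = 512

def snapshot_table(image_size_mb, origin_dev, cow_dev, chunk_sectors=8):
    """Build a dm table line: start, length in sectors, target, origin, cow, flags."""
    sectors = image_size_mb * 1024 * 1024 // SECTOR_BYTES
    return "0 %d snapshot %s %s p %d" % (sectors, origin_dev, cow_dev,
                                         chunk_sectors)

# 4096 MB is the default uncompressed size in livecd-creator;
# the loop device names here are hypothetical.
table = snapshot_table(4096, "/dev/loop0", "/dev/loop1")
print(table)
```

The string is what gets handed to dmsetup create. Writes made through the resulting device (here, the resize2fs back to minimal size) land in the cow device, which is why that file ends up holding the delta.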

I ended up using /dev/loop118 in the initramfs init (mayflower generated) to expose the delta file. /dev/loop118 was mknod'd already by mayflower, but not actually used for anything. I used it to expose the delta file (osmin.tgz), because /dev/loop121 was being used to expose the 4.0G os.img, and it seemed simplest to use an identical mechanism to expose that data. Because by the time anaconda runs, the original cdrom and squashfs filesystems have been lazy unmounted, a simple cp at that time was not an option(?).
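One wrinkle with exposing a file through a loop device is noted in a comment in the mayflower part of the patch: the loop device rounds the file size down to a whole number of 512 byte sectors, so the patch appends one extra sector of zeros. A rough sketch of that arithmetic, assuming the rounding behaves as that comment describes (the 25000 byte size is a made up example, not a measurement):

```python
SECTOR_BYTES = 512

def loop_visible_bytes(file_size):
    """Bytes readable through a loop device that rounds down to whole sectors."""
    return (file_size // SECTOR_BYTES) * SECTOR_BYTES

def padded_size(file_size):
    """File size after appending one full sector of zeros, as the dd call does."""
    return file_size + SECTOR_BYTES

original = 25000  # hypothetical osmin.tgz size in bytes
lost = original - loop_visible_bytes(original)
still_visible = loop_visible_bytes(padded_size(original)) >= original

print("bytes cut off without padding: %d" % lost)
print("all data visible after padding: %s" % still_visible)
```

Appending a full sector rather than padding to the next boundary wastes at most 512 bytes and avoids having to compute the remainder in the initramfs shell.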

I chose to extract the delta file (osmin, a 16MB sparse file containing 1.2M of data, compressed to 25kb on cdrom) into /dev/shm (i.e. ram) so that reads from it would not try to go to the cdrom. This may not have been necessary.
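For readers unfamiliar with sparse files, the sketch below creates one the same way turboLiveInst does (seek past the end, write one byte) and compares its apparent size with the space actually allocated. The file name osmin-demo and the temporary directory are assumptions for the demo, and the allocated figure depends on the filesystem it runs on.

```python
import os
import tempfile

def make_sparse_file(path, size_bytes):
    """Seek to size_bytes and write a single zero byte, leaving a hole before it."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        os.lseek(fd, size_bytes, os.SEEK_SET)
        os.write(fd, b"\x00")
    finally:
        os.close(fd)

tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "osmin-demo")
make_sparse_file(path, 16 * 1024 * 1024)

st = os.stat(path)
apparent = st.st_size
allocated = st.st_blocks * 512  # st_blocks is counted in 512-byte units

print("apparent size: %d bytes, allocated: %d bytes" % (apparent, allocated))

os.unlink(path)
os.rmdir(tmpdir)
```

This is also why tar is run with --sparse in the patch: without it the holes would be stored and later extracted as real zero blocks.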

I chose to calculate the size of the filesystem at livecd-creator time, and include it with osmin as osmin.size (together forming osmin.tgz). This is less complex than what I did in the first pass, which was to use dumpe2fs to calculate it at anaconda time.
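As a sketch of the osmin.size round trip: the patch writes the minimized size as a count of 512 byte sectors, which is the same unit /sys/block/<dev>/size reports, so the installer can treat both sources alike. The helper names, the temporary path, and the 2100 MB figure below are illustrative assumptions, and the conversion to MB is my reading of what getLiveSizeMB goes on to do.

```python
import os
import tempfile

SECTOR_BYTES = 512

def write_min_size(path, min_size_kb):
    """Store the minimized fs size as a count of 512-byte sectors."""
    with open(path, "w") as f:
        f.write("%d\n" % (min_size_kb * 1024 // SECTOR_BYTES))

def read_min_size_mb(path):
    """Return the stored size in MB, or 0 if the file is missing or malformed."""
    try:
        with open(path) as f:
            sectors = int(f.read())
    except (IOError, OSError, ValueError):
        return 0
    return sectors * SECTOR_BYTES // (1024 * 1024)

tmpdir = tempfile.mkdtemp()
size_file = os.path.join(tmpdir, "osmin.size")

write_min_size(size_file, 2100 * 1024)  # hypothetical: 2100 MB expressed in KB
size_mb = read_min_size_mb(size_file)
print("minimized filesystem size: %d MB" % size_mb)

os.unlink(size_file)
os.rmdir(tmpdir)
```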

As always, questions, comments, criticisms, and especially testers are more than welcome.

peace...

-dmc
diff -Naur anaconda.cvs.20070723/livecd.py anaconda/livecd.py
--- anaconda.cvs.20070723/livecd.py	2007-07-16 19:45:33.000000000 +0000
+++ anaconda/livecd.py	2007-07-27 07:20:21.000000000 +0000
@@ -128,16 +128,21 @@
         return self.osimg
 
     def getLiveSizeMB(self):
-        lnk = os.readlink(self.osimg)
-        if lnk[0] != "/":
-            lnk = os.path.join(os.path.dirname(self.osimg), lnk)
-        blk = os.path.basename(lnk)
-
-        if not os.path.exists("/sys/block/%s/size" %(blk,)):
-            log.debug("Unable to determine the actual size of the live image")
-            return 0
-
-        size = open("/sys/block/%s/size" %(blk,), "r").read()
+        if os.path.exists("/dev/shm/osmin.size"):
+            # turbo-liveinst might expose the real minimal fs size here
+            size = open("/dev/shm/osmin.size", "r").read()
+        else:
+            lnk = os.readlink(self.osimg)
+            if lnk[0] != "/":
+                lnk = os.path.join(os.path.dirname(self.osimg), lnk)
+            blk = os.path.basename(lnk)
+
+            if not os.path.exists("/sys/block/%s/size" %(blk,)):
+                log.debug("Unable to determine the actual size of the live image")
+                return 0
+
+            size = open("/sys/block/%s/size" %(blk,), "r").read()
+
         try:
             size = int(size)
         except ValueError:
diff -Naur anaconda.cvs.20070723/liveinst/liveinst.sh anaconda/liveinst/liveinst.sh
--- anaconda.cvs.20070723/liveinst/liveinst.sh	2007-04-04 18:05:42.000000000 +0000
+++ anaconda/liveinst/liveinst.sh	2007-07-27 07:18:43.000000000 +0000
@@ -4,7 +4,19 @@
 #
 
 if [ -z "$LIVE_BLOCK" ]; then
-    LIVE_BLOCK="/dev/live-osimg"
+
+    # turbo-liveinst: if minimized dm-snapshot-delta data exists, use it
+    #                 to construct a better live-osimg to use.
+    #
+    # did mayflower find and expose the delta data for us via loop118?
+    if ( losetup /dev/loop118 > /dev/null 2>&1 ); then
+	tar --sparse --directory /dev/shm -xf /dev/loop118
+	losetup /dev/loop117 /dev/shm/osmin
+	echo "0 $( blockdev --getsize /dev/loop121 ) snapshot /dev/loop121 /dev/loop117 p 8" | dmsetup create live-osimg-min
+	LIVE_BLOCK="/dev/mapper/live-osimg-min"
+    else
+	LIVE_BLOCK="/dev/live-osimg"
+    fi
 fi
 
 if [ ! -b $LIVE_BLOCK ]; then
@@ -42,3 +54,10 @@
 if [ -n $current ]; then
     /usr/sbin/setenforce $current
 fi
+
+# cleanup turbo-liveinst if needed
+if ( losetup /dev/loop118 > /dev/null 2>&1 ); then
+    dmsetup remove live-osimg-min
+    losetup -d /dev/loop117
+    rm -f /dev/shm/osmin /dev/shm/osmin.size
+fi
diff -Naur livecd.git.20070724/creator/isotostick.sh livecd/creator/isotostick.sh
--- livecd.git.20070724/creator/isotostick.sh	2007-07-25 01:23:24.000000000 +0000
+++ livecd/creator/isotostick.sh	2007-07-25 08:38:06.000000000 +0000
@@ -179,6 +179,9 @@
 elif [ -f $CDMNT/ext3fs.img ]; then
     cp $CDMNT/ext3fs.img $USBMNT/LiveOS/ext3fs.img || exitclean 
 fi
+if [ -f $CDMNT/osmin.tgz ]; then
+    cp $CDMNT/osmin.tgz $USBMNT/LiveOS/osmin.tgz || exitclean 
+fi
 cp $CDMNT/isolinux/* $USBMNT/$SYSLINUXPATH
 
 echo "Updating boot config file"
diff -Naur livecd.git.20070724/creator/livecd-creator livecd/creator/livecd-creator
--- livecd.git.20070724/creator/livecd-creator	2007-07-25 01:23:24.000000000 +0000
+++ livecd/creator/livecd-creator	2007-07-26 04:40:02.000000000 +0000
@@ -287,7 +287,7 @@
         return self.runTransaction(cb)
 
 class InstallationTarget:
-    def __init__(self, repos, packages, epackages, groups, fs_label, skip_compression, skip_prelink,tmpdir):
+    def __init__(self, repos, packages, epackages, groups, fs_label, skip_compression, turbo_liveinst, skip_prelink,tmpdir):
         self.ayum = None
         self.repos = repos
         self.packages = packages
@@ -295,6 +295,7 @@
         self.groups = groups
         self.fs_label = fs_label
         self.skip_compression = skip_compression
+        self.turbo_liveinst = turbo_liveinst
         self.skip_prelink = skip_prelink
         self.tmpdir = tmpdir
 
@@ -302,6 +303,7 @@
         self.instloop = None
         self.bindmounts = []
         self.ksparser = None
+        self.minsizekb = 0
         
     def parse(self, kscfg):
         ksversion = pykickstart.version.makeVersion()
@@ -932,7 +934,7 @@
         else:
             shutil.move("%s/data/os.img" %(self.build_dir,),
                         "%s/out/ext3fs.img" %(self.build_dir,))
-
+            
     #
     # cleanupDeleted removes unused data from the sparse ext3 os image file.
     # The process involves: resize2fs-to-minimal, truncation,
@@ -974,6 +976,9 @@
 
         print >> sys.stderr, "Installation target minimized to %dK" % (size_top * 4)
 
+        # save minimized size for reuse by turboLiveInst
+        self.minsizekb = size_top * 4L
+  
         # truncate the unused excess portion of the sparse file
         fd = os.open("%s/data/os.img" %(self.build_dir,), os.O_WRONLY )
         os.ftruncate(fd, size_top * 4096L)
@@ -985,6 +990,71 @@
                                            cwd="%s/data" %(self.build_dir,),
                                            env={"PWD": "%s/data" %(self.build_dir,)})
 
+
+    #
+    # turboLiveInst: generates an osmin overlay file to sit alongside
+    #                os.img.  liveinst may then detect the existence of
+    #                osmin, and use it to create a minimized os.img
+    #                which can be installed more quickly, and to smaller
+    #                destination volumes.
+    #
+    def turboLiveInst(self, image_size):
+        # create the sparse file for the minimized overlay
+        fd = os.open("%s/out/osmin" %(self.build_dir,),
+                     os.O_WRONLY | os.O_CREAT)
+        off = long(16L * 1024L * 1024L)
+        os.lseek(fd, off, 0)
+        os.write(fd, '\x00')
+        os.close(fd)
+        
+        # associate os image with loop device
+        osloop = LoopbackMount("%s/data/os.img" %(self.build_dir,),
+                               "not_going_to_actually_get_mounted")
+        osloop.loopsetup()
+        
+        # associate overlay with loop device
+        minloop = LoopbackMount("%s/out/osmin" %(self.build_dir,),
+                                "not_going_to_actually_get_mounted")
+        minloop.loopsetup()
+        
+        # create a snapshot device
+        rc = subprocess.call(["/sbin/dmsetup",
+                              "--table",
+                              "0 %d snapshot %s %s p 8"
+                              %(image_size * 1024L * 2L,
+                                osloop.loopdev, minloop.loopdev),
+                              "create",
+                              "livecd-creator-%d" %(os.getpid(),) ])
+        if rc != 0:
+            raise InstallationError("Could not create turboLiveInst snapshot device")
+        # resize snapshot device back to minimal (self.minsizekb)
+        rc = subprocess.call(["/sbin/resize2fs",
+                              "/dev/mapper/livecd-creator-%d" %(os.getpid(),),
+                              "%dK" %(self.minsizekb,)])
+
+        # tear down snapshot and loop devices
+        rc = subprocess.call(["/sbin/dmsetup", "remove",
+                              "livecd-creator-%d" %(os.getpid(),) ])
+        if rc != 0:
+            raise InstallationError("Could not remove turboLiveInst snapshot device")
+        osloop.lounsetup()
+        minloop.lounsetup()
+
+        # save minsize (in 512 byte sectors) to a textfile so it needn't
+        # be computed at liveinst time.
+        minsizefile = open(self.build_dir + "/out/osmin.size", "w")
+        minsizefile.write("%d\n" % ( self.minsizekb * 2))
+        minsizefile.close()
+                    
+        # package osmin and osmin.size together and compressed
+        rc = subprocess.call(["/bin/tar", "--sparse", "-cvzf", "osmin.tgz",
+                              "osmin", "osmin.size"],
+                             cwd="%s/out" %(self.build_dir,),
+                             env={"PWD": "%s/out" %(self.build_dir,)})
+
+        os.unlink(self.build_dir + "/out/osmin")
+        os.unlink(self.build_dir + "/out/osmin.size")
+
     def package(self):
         self.createSquashFS()
         self.createIso()
@@ -1001,6 +1071,7 @@
                       [--skip-compression]
                       [--uncompressed-size=<size-in-MB>]
                       [--ignore-deleted]
+                      [--turbo-liveinst]
                       [--shell]
                       [--tmpdir=<tmpdir>]
 
@@ -1015,6 +1086,8 @@
  --prelink           : Prelink the image
  --uncompressed-size : Size of uncompressed fs in MB (default: 4096)
  --ignore-deleted    : Don't run resize2fs to clean up wasted blocks
+ --turbo-liveinst    : Create a small minimized fs image overlay file to
+                           be used by liveinst to improve fs copy speed
  --shell             : Start a shell in the chroot for post-configuration
  --tmpdir            : Temporary directory to use (default: /var/tmp)
 
@@ -1043,6 +1116,7 @@
         self.base_on = None
         self.kscfg = None
         self.skip_compression = False
+        self.turbo_liveinst = False
         self.skip_prelink = True
         self.uncompressed_size = 4096
         self.ignore_deleted = False
@@ -1055,8 +1129,9 @@
                                    ["help", "repo=", "base-on=", "package=",
                                     "exclude-package=", "fslabel=", "config=",
                                     "skip-compression", "uncompressed-size=",
-                                    "ignore-deleted", "shell", "no-prelink",
-                                    "prelink", "tmpdir="])
+                                    "ignore-deleted", "turbo-liveinst",
+                                    "shell", "no-prelink", "prelink",
+                                    "tmpdir="])
 
     except getopt.GetoptError, msg:
         raise Usage(msg)
@@ -1084,6 +1159,9 @@
         if o in ("--ignore-deleted",):
             options.ignore_deleted = True
             continue
+        if o in ("--turbo-liveinst",):
+            options.turbo_liveinst = True
+            continue
         if o in ("-c", "--config"):
             options.kscfg = a
             if not os.path.isfile(options.kscfg):
@@ -1126,6 +1204,9 @@
     if not options.kscfg and not options.repos:
         raise Usage("No repositories specified")
 
+    if options.turbo_liveinst and options.ignore_deleted:
+        raise Usage("turbo-liveinst can not be used with ignore-deleted")
+
     return options
 
 def main():
@@ -1153,6 +1234,7 @@
                                 options.groups,
                                 options.fs_label,
                                 options.skip_compression,
+                                options.turbo_liveinst,
                                 options.skip_prelink,
                                 options.tmpdir)
 
@@ -1173,6 +1255,9 @@
         if not options.ignore_deleted:
             target.cleanupDeleted()
 
+        if options.turbo_liveinst:
+            target.turboLiveInst(options.uncompressed_size)
+
         target.package()
     except InstallationError, e:
         print >> sys.stderr, "Error creating Live CD : %s" % e
diff -Naur livecd.git.20070724/creator/mayflower livecd/creator/mayflower
--- livecd.git.20070724/creator/mayflower	2007-07-25 01:23:24.000000000 +0000
+++ livecd/creator/mayflower	2007-07-26 04:09:55.000000000 +0000
@@ -605,6 +605,26 @@
     mount -n -o ro,remount /sysroot
 }
 
+# we might have a turboLiveInst delta file for anaconda/liveinst to take advantage of
+#
+
+modprobe loop max_loop=128
+
+if [ -e /sysroot/LiveOS/osmin.tgz ]; then
+  mknod /dev/loop118 b 7 118
+  # osmin.tgz should only be about 25kb.  mainly this is in case of live_ram
+  dd if=/sysroot/LiveOS/osmin.tgz of=/osmin.tgz bs=512 2> /dev/null
+  # loop devices round down to the nearest 512 byte sector size, how nice...
+  dd if=/dev/zero of=/osmin.tgz bs=512 count=1 oflag=append conv=notrunc 2> /dev/null
+  losetup /dev/loop118 /osmin.tgz
+elif [ -e /sysroot/osmin.tgz ] ; then
+  mknod /dev/loop118 b 7 118
+  dd if=/sysroot/osmin.tgz of=/osmin.tgz bs=512 2> /dev/null
+  # loop devices round down to the nearest 512 byte sector size, how nice...
+  dd if=/dev/zero of=/osmin.tgz bs=512 count=1 oflag=append conv=notrunc 2> /dev/null
+  losetup /dev/loop118 /osmin.tgz
+fi
+
 # we might have an uncompressed embedded ext3  to use as rootfs (uncompressed live)
 #
 if [ -e /sysroot/LiveOS/ext3fs.img ]; then
@@ -618,13 +638,11 @@
         echo "setting up embedded ext3 fs "
     fi
 
-    mknod /dev/loop118 b 7 118
     mknod /dev/loop119 b 7 119
     mknod /dev/loop120 b 7 120
     mknod /dev/loop121 b 7 121
     mkdir -p /dev/mapper
     mknod /dev/mapper/control c 10 63
-    modprobe loop max_loop=128
     modprobe dm_snapshot
 
     losetup /dev/loop121 \$EXT3FS
@@ -647,13 +665,11 @@
         echo "setting up embedded squash -> ext3 fs "
     fi
 
-    mknod /dev/loop118 b 7 118
     mknod /dev/loop119 b 7 119
     mknod /dev/loop120 b 7 120
     mknod /dev/loop121 b 7 121
     mkdir -p /dev/mapper
     mknod /dev/mapper/control c 10 63
-    modprobe loop max_loop=128
     modprobe dm_snapshot
 
     if [ "\$live_ram" == "1" ] ; then
