Re: Changes to copyfile functionality.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



        Attached are my original patch and a git-patch which contains the same 
code.  These implement  changes I called #1 in my previous e-mail.

        #2 doesn't seem too hard.  CopyFile.open will have to open a tempfile 
rather than the target file.  It already returns the filepath on success.  
Append already uses the return value of CopyFile.open as the file to append 
to.  After appending has finished there needs to be a CopyFile.move function 
called by the overlord which takes both paths and moves the tempfile over the 
source file.  Making sure the ownership/permissions/SELinux attributes of the 
target file do not change could be is important so had better be careful about 
that.

        Once this is in place #3 is almost trivial.  The overlord needs to 
calculate the MD5 of the source file.  The code for doing this is also needed 
by #1 and is included in my patches.  That MD5 should be passed to the 
CopyFile.move function.  It should calculate the MD5 of the tempfile and if 
they don't match just delete the tempfile and return an error instead of 
moving it.  I figure that this MD5 check should be optional so I will also 
create a flag to enable it.

        Can anyone see any issues with this general design?  If not I'll get 
working on it when I have the time.

--
Marcus



On Friday, August 24, 2012 07:32:14 PM you wrote:
> Copyfile could most certainly use some TLC, so let me know if I can help,
> that would certainly be of help if you'd like to work on that.  The list
> has indeed changed to lists.fedorahosted.com, so definitely use that one.
> 
> Would you mind shooting me your original patch as well?  Thanks very much!
> 
> On Fri, Aug 24, 2012 at 7:22 PM, Marcus Lauer <marcuslauer@xxxxxxxxxx>wrote:
> >          It seems to me that there are at least three
> >          improvements which
> > 
> > should be made to copyfile:
> > 
> > 1. If the file on the overlord and minion are the same, that file should
> > not
> > be copied.  This saves bandwidth/processing time.
> > 
> > 2. Overwriting the file on the minion should be atomic, e.g. the file on
> > the
> > overlord should be copied to a tempfile on the minion then moved to
> > overwrite
> > the target file as a final step.  Right now if funcd dies or something
> > goes wrong mid-copy you end up with half a file on the minion.
> > 
> > 3. There should be an option (off by default) such that if the file
> > which
> > has
> > been copied over to the minion is not identical to the file on the
> > overlord it
> > is re-copied, or at least does not overwrite the file on the minion. 
> > This won't work for some files (busy logfiles) but for most files it
> > seems like an
> > extra bit of reliability which most people would want in most
> > situations.
> > 
> >         I submitted a patch (though not a git-patch... my bad!)
> >         back in
> > 
> > January 2011 which implements #1, but imperfectly.  The overlord still
> > has to
> > send a (small) packet to the minion for each 60KB chunk of target file.
> > 
> >        Would anyone be bothered terribly if I worked on #2 and
> >        #3?  While
> > 
> > I am
> > at it I will try to find a way to fix up my patch for #1 as well.  There
> > is some overlap between these patches.  Both will involve calculating
> > an MD5 hash
> > of the file on the overlord, for example.
> > 
> >         P.S. Is the new mailing list on fedorahosted.org live
> >         yet?  I am
> > 
> > sending this message to both just in case.
> > 
> > --
> > Marcus
> > _______________________________________________
> > func mailing list
> > func@xxxxxxxxxxxxxxxxxxxxxx
> > https://lists.fedorahosted.org/mailman/listinfo/func
> 
> --
> Steve Salevan
> steve@xxxxxxxxxx
diff -rupN func/minion/modules/copyfile.py func.new/minion/modules/copyfile.py
--- func/minion/modules/copyfile.py	2011-01-01 11:45:52.000000000 -0500
+++ func.new/minion/modules/copyfile.py	2011-01-01 11:48:11.000000000 -0500
@@ -26,6 +26,7 @@ import time
 import shutil
 
 import func_module
+from func import utils as func_utils
 
 
 class CopyFile(func_module.FuncModule):
@@ -40,28 +41,39 @@ class CopyFile(func_module.FuncModule):
         return thissum.hexdigest()
 
     def checksum(self, thing):
-
         CHUNK=2**16
         thissum = hashlib.new('sha1')
         if os.path.exists(thing):
-            fo = open(thing, 'r', CHUNK)
-            chunk = fo.read
-            while chunk:
+            fo = open(thing, 'r')
+            while True:
                 chunk = fo.read(CHUNK)
+                if not chunk:
+                    break
                 thissum.update(chunk)
             fo.close()
             del fo
         else:
-            # assuming it's a string of some kind
+	    # assuming it's a string of some kind
             thissum.update(thing)
 
-        return thissum.hexdigest()
+        hexdig = thissum.hexdigest()
+        return hexdig
 
-    def open(self, filepath, mode=None, uid=-1, gid=-1):
+    def open(self, filepath, remote_sum, mode=None, uid=-1, gid=-1, backup=False, force=False):
         dirpath = os.path.dirname(filepath)
         if not os.path.exists(dirpath):
             os.makedirs(dirpath)
 
+        if os.path.exists(filepath):
+	    if not force:
+                local_sum = self.checksum(filepath)
+                if remote_sum == local_sum:
+                    return 0
+
+	    if backup:
+                if not self._backuplocal(filepath):
+                    return -1
+
         # Create empty file
         try:
             fo = open(filepath, 'w')
@@ -82,7 +94,14 @@ class CopyFile(func_module.FuncModule):
 
         return filepath
 
-    def append(self, filepath, filebuf):
+    def append(self, open_result, filebuf):
+        hostname = func_utils.get_hostname_by_route()
+        filepath = open_result[hostname]
+
+        # If the result of the open function was 0 (or -1), do not append, return the same value.
+	if type(filepath) == type(0):
+	    return filepath
+
         if not os.path.exists(filepath):
             # file disaperead
             return -1
diff -rupN func/overlord/cmd_modules/copyfile.py func.new/overlord/cmd_modules/copyfile.py
--- func/overlord/cmd_modules/copyfile.py	2011-01-01 11:45:52.000000000 -0500
+++ func.new/overlord/cmd_modules/copyfile.py	2011-01-01 11:48:11.000000000 -0500
@@ -34,6 +34,8 @@ class CopyFile(base_command.BaseCommand)
                                action="store")
         self.parser.add_option("", "--remotepath", dest="remotepath",
                                 action="store")
+        self.parser.add_option("", "--backup", dest="backup",
+                               action="store_true")
         self.parser.add_option("", "--force", dest="force",
                                action="store_true")
         self.parser.add_option("-v", "--verbose", dest="verbose",
@@ -50,4 +52,4 @@ class CopyFile(base_command.BaseCommand)
         self.server_spec = self.parentCommand.server_spec
         self.getOverlord()
 
-        return self.overlord_obj.local.copyfile.send(self.options.filename, self.options.remotepath)
+        return self.overlord_obj.local.copyfile.send(self.options.filename, self.options.remotepath, self.options.backup, self.options.force)
diff -rupN func/overlord/modules/copyfile.py func.new/overlord/modules/copyfile.py
--- func/overlord/modules/copyfile.py	2011-01-01 12:08:48.000000000 -0500
+++ func.new/overlord/modules/copyfile.py	2011-01-01 12:10:49.000000000 -0500
@@ -7,7 +7,7 @@ import xmlrpclib
 from func.overlord import overlord_module
 
 class copyfile(overlord_module.BaseModule):
-    def send(self, localpath, remotepath, bufsize=60000):
+    def send(self, localpath, remotepath, backup=None, force=None, bufsize=60000):
         try:
             f = open(localpath, "r")
         except IOError, e:
@@ -19,12 +19,19 @@ class copyfile(overlord_module.BaseModul
         uid = st.st_uid
         gid = st.st_gid
 
-        self.parent.run("copyfile", "open", [remotepath, mode, uid, gid])
+	if force:
+            local_sum = -1
+        else:
+            import func.minion.modules.copyfile as CopyFile
+            cf = CopyFile.CopyFile()
+            local_sum = cf.checksum(localpath)
+
+        open_result = self.parent.run("copyfile", "open", [remotepath, local_sum, mode, uid, gid, backup, force])
 
         while True:
             data=f.read(bufsize)
             if data:
-                self.parent.run("copyfile", "append", [remotepath, xmlrpclib.Binary(data)])
+                self.parent.run("copyfile", "append", [open_result, xmlrpclib.Binary(data)])
             else:
                 break
 
>From b05f6e4f93769b5082bba25d3bcba6c6f6abcd43 Mon Sep 17 00:00:00 2001
From: Marcus Lauer <marcuslauer@xxxxxxxxxx>
Date: Sat, 25 Aug 2012 16:27:13 -0400
Subject: [PATCH] Copyfile should check MD5 of target and source and not copy
 if identical

---
 func/minion/modules/copyfile.py       |   35 +++++++++++++++++++++++++-------
 func/overlord/cmd_modules/copyfile.py |    4 ++-
 func/overlord/modules/copyfile.py     |   13 +++++++++--
 3 files changed, 40 insertions(+), 12 deletions(-)

diff --git a/func/minion/modules/copyfile.py b/func/minion/modules/copyfile.py
index 61c6a4f..e5bf5da 100644
--- a/func/minion/modules/copyfile.py
+++ b/func/minion/modules/copyfile.py
@@ -26,6 +26,7 @@ import time
 import shutil
 
 import func_module
+from func import utils as func_utils
 
 
 class CopyFile(func_module.FuncModule):
@@ -40,28 +41,39 @@ class CopyFile(func_module.FuncModule):
         return thissum.hexdigest()
 
     def checksum(self, thing):
-
         CHUNK=2**16
         thissum = hashlib.new('sha1')
         if os.path.exists(thing):
-            fo = open(thing, 'r', CHUNK)
-            chunk = fo.read
-            while chunk:
+            fo = open(thing, 'r')
+            while True:
                 chunk = fo.read(CHUNK)
+                if not chunk:
+                    break
                 thissum.update(chunk)
             fo.close()
             del fo
         else:
-            # assuming it's a string of some kind
+	    # assuming it's a string of some kind
             thissum.update(thing)
 
-        return thissum.hexdigest()
+        hexdig = thissum.hexdigest()
+        return hexdig
 
-    def open(self, filepath, mode=None, uid=-1, gid=-1):
+    def open(self, filepath, remote_sum, mode=None, uid=-1, gid=-1, backup=False, force=False):
         dirpath = os.path.dirname(filepath)
         if not os.path.exists(dirpath):
             os.makedirs(dirpath)
 
+        if os.path.exists(filepath):
+	    if not force:
+                local_sum = self.checksum(filepath)
+                if remote_sum == local_sum:
+                    return 0
+
+	    if backup:
+                if not self._backuplocal(filepath):
+                    return -1
+
         # Create empty file
         try:
             fo = open(filepath, 'w')
@@ -82,7 +94,14 @@ class CopyFile(func_module.FuncModule):
 
         return filepath
 
-    def append(self, filepath, filebuf):
+    def append(self, open_result, filebuf):
+        hostname = func_utils.get_hostname_by_route()
+        filepath = open_result[hostname]
+
+        # If the result of the open function was 0 (or -1), do not append, return the same value.
+	if type(filepath) == type(0):
+	    return filepath
+
         if not os.path.exists(filepath):
             # file disaperead
             return -1
diff --git a/func/overlord/cmd_modules/copyfile.py b/func/overlord/cmd_modules/copyfile.py
index 3f2263d..a5dda6c 100644
--- a/func/overlord/cmd_modules/copyfile.py
+++ b/func/overlord/cmd_modules/copyfile.py
@@ -34,6 +34,8 @@ class CopyFile(base_command.BaseCommand):
                                action="store")
         self.parser.add_option("", "--remotepath", dest="remotepath",
                                 action="store")
+        self.parser.add_option("", "--backup", dest="backup",
+                               action="store_true")
         self.parser.add_option("", "--force", dest="force",
                                action="store_true")
         self.parser.add_option("-v", "--verbose", dest="verbose",
@@ -50,4 +52,4 @@ class CopyFile(base_command.BaseCommand):
         self.server_spec = self.parentCommand.server_spec
         self.getOverlord()
 
-        return self.overlord_obj.local.copyfile.send(self.options.filename, self.options.remotepath)
+        return self.overlord_obj.local.copyfile.send(self.options.filename, self.options.remotepath, self.options.backup, self.options.force)
diff --git a/func/overlord/modules/copyfile.py b/func/overlord/modules/copyfile.py
index 9f385a9..f13fc9b 100644
--- a/func/overlord/modules/copyfile.py
+++ b/func/overlord/modules/copyfile.py
@@ -7,7 +7,7 @@ import xmlrpclib
 from func.overlord import overlord_module
 
 class copyfile(overlord_module.BaseModule):
-    def send(self, localpath, remotepath, bufsize=60000):
+    def send(self, localpath, remotepath, backup=None, force=None, bufsize=60000):
         try:
             f = open(localpath, "r")
         except IOError, e:
@@ -19,12 +19,19 @@ class copyfile(overlord_module.BaseModule):
         uid = st.st_uid
         gid = st.st_gid
 
-        self.parent.run("copyfile", "open", [remotepath, mode, uid, gid])
+	if force:
+            local_sum = -1
+        else:
+            import func.minion.modules.copyfile as CopyFile
+            cf = CopyFile.CopyFile()
+            local_sum = cf.checksum(localpath)
+
+        open_result = self.parent.run("copyfile", "open", [remotepath, local_sum, mode, uid, gid, backup, force])
 
         while True:
             data=f.read(bufsize)
             if data:
-                self.parent.run("copyfile", "append", [remotepath, xmlrpclib.Binary(data)])
+                self.parent.run("copyfile", "append", [open_result, xmlrpclib.Binary(data)])
             else:
                 break
 
-- 
1.7.7
_______________________________________________
func mailing list
func@xxxxxxxxxxxxxxxxxxxxxx
https://lists.fedorahosted.org/mailman/listinfo/func

[Index of Archives]     [Fedora Users]     [Linux Networking]     [Fedora Legacy List]     [Fedora Desktop]     [Fedora SELinux]     [Big List of Linux Books]     [Yosemite News]     [KDE Users]

  Powered by Linux