Our Perforce server experienced some kind of database corruption a few years ago.
While the file data and revision history are mostly intact,
some metadata for several changesets got lost.
For example, inspecting certain changelists produces errors.
"""
$ p4 describe -s 12345
Date 2019/02/26 16:46:17:
Operation: user-describe
Operation 'user-describe' failed.
Change 12345 description missing!
"""

While some metadata (like changeset descriptions) is obviously lost,
most of it can be reconstructed via other commands:
 * `p4 changes -l -t //...@12345,12345` --
   to obtain date+time, author, beginning of changeset description;
 * `p4 files -a //...@12345,12345` --
   to obtain file revisions, file types, file actions;
 * `p4 diff2 -u //...@12344 //...@12345` --
   to get a unified diff of text files in a changeset;
 * `p4 print -o binary.blob@12345 //depot/binary.blob@12345` --
   to get a revision of a binary file.

It might be possible to teach git-p4 to fall back to other methods if `p4 describe` fails,
but that is probably too special-cased (it really depends on the kind and scale of the DB corruption),
so some manual intervention is perhaps acceptable.

So, with some manual work, it's possible to reconstruct `p4 -G describe ...` output.
In our case, once git-p4 gets past the `p4 describe` stage, it can proceed further just fine.
Thus, it's tempting to feed the resurrected metadata to git-p4 when a normal `p4 describe` would fail.

This functionality may also be useful to cache changelist information,
or to make some changes to changelist info before feeding it to git-p4.

A new config parameter is introduced to tell git-p4
to load certain changelist descriptions from files instead of from the server.
For simplicity, it's one pickled file per changelist.
```
git config --add git-p4.damagedChangelists 12345.pickled
git config --add git-p4.damagedChangelists 12346.pickled
```

The following trivial script may be used to produce pickled `p4 -G describe`-compatible output.
""" #!/usr/bin/python2 import pickle import time # recovered commits of interest changes = [ { 'change': '12345', 'status': 'submitted', 'code': 'stat', 'user': 'username1', 'time': str(int(time.mktime(time.strptime('2019/02/28 16:00:30', '%Y/%m/%d %H:%M:%S')))), 'client': 'username1_hostname1', 'desc': 'A bug is fixed.\nDetails are below:<lost>\n', 'depotFile0': '//depot/branch1/foo.sh', 'action0': 'edit', 'rev0': '28', 'type0': 'xtext', 'depotFile1': '//depot/branch1/bar.py', 'action1': 'edit', 'rev1': '43', 'type1': 'text', 'depotFile2': '//depot/branch1/baz.doc', 'action2': 'edit', 'rev2': '8', 'type2': 'binary', 'depotFile3': '//depot/branch1/qqq.c', 'action3': 'edit', 'rev3': '6', 'type3': 'ktext', }, ] for change in changes: pickle.dump(change, open('{0}.pickled'.format(change['change']), 'wb')) """ Signed-off-by: Andrey Mazo <amazo@xxxxxxxxxxxxxx> --- Notes: Documentation changes and tests are obviously missing, but I hoped to get some feedback on the idea overall before working on those. 

 git-p4.py | 25 ++++++++++++++++++++++++-
 1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/git-p4.py b/git-p4.py
index 40bc84573b..3133419280 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -24,10 +24,11 @@
 import stat
 import zipfile
 import zlib
 import ctypes
 import errno
+import pickle

 # support basestring in python3
 try:
     unicode = unicode
 except NameError:
@@ -2615,10 +2616,12 @@ def __init__(self):
         self.knownAlienLabelBranches = {}

         self.tz = "%+03d%02d" % (- time.timezone / 3600, ((- time.timezone % 3600) / 60))
         self.labels = {}

+        self.damagedChangelists = {}
+
     # Force a checkpoint in fast-import and wait for it to finish
     def checkpoint(self):
         self.gitStream.write("checkpoint\n\n")
         self.gitStream.write("progress checkpoint\n\n")
         out = self.gitOutput.readline()
@@ -3312,10 +3315,25 @@ def getAlienLabelBranchMapping(self):
         for mapping in alienLabelBranches:
             if mapping:
                 (alien, ours) = mapping.split(":")
                 self.knownAlienLabelBranches[alien] = ours

+    def loadDamagedChangelists(self):
+        damagedChangelists = gitConfigList("git-p4.damagedChangelists")
+        for clPickled in damagedChangelists:
+            if not clPickled:
+                continue
+
+            try:
+                clDesc = pickle.load(open(clPickled, 'rb'))
+                if not ("status" in clDesc and "user" in clDesc and "time" in clDesc and "change" in clDesc):
+                    die("Changelist description read from {0} doesn't have required fields".format(clPickled))
+            except (IOError, TypeError) as e:
+                die("Can't read changelist description dict from {0}: {1}".format(clPickled, str(e)))
+
+            self.damagedChangelists[int(clDesc["change"])] = clDesc
+
     def updateOptionDict(self, d):
         option_keys = {}
         if self.keepRepoPath:
             option_keys['keepRepoPath'] = 1

@@ -3413,11 +3431,14 @@ def searchParent(self, parent, branch, target):
             return None

     def importChanges(self, changes, origin_revision=0):
         cnt = 1
         for change in changes:
-            description = p4_describe(change)
+            if change in self.damagedChangelists:
+                description = self.damagedChangelists[change]
+            else:
+                description = p4_describe(change)
             self.updateOptionDict(description)

             if not self.silent:
                 sys.stdout.write("\rImporting revision %s (%s%%)" % (change, cnt * 100 / len(changes)))
                 sys.stdout.flush()
@@ -3704,10 +3725,12 @@ def run(self, args):
                     bad_changesfile = True
                     break
         if bad_changesfile:
             die("Option --changesfile is incompatible with revision specifiers")

+        self.loadDamagedChangelists()
+
         newPaths = []
         for p in self.depotPaths:
             if p.find("@") != -1:
                 atIdx = p.index("@")
                 self.changeRange = p[atIdx:]
--
2.19.2