Implements the import of objects from a local or remote CVS repository. This helper program uses the "git_remote_cvs" Python package introduced earlier, and provides a working draft implementation of the remote helper API, as described in Documentation/git-remote-helpers.txt. Further details about this specific helper are described in the new Documentation/git-remote-cvs.txt. This patch has been improved by the following contributions: - Daniel Barkalow: Updates reflecting changes in remote helper API Signed-off-by: Johan Herland <johan@xxxxxxxxxxx> --- Documentation/git-remote-cvs.txt | 85 +++++ Makefile | 24 ++ git-remote-cvs.py | 697 ++++++++++++++++++++++++++++++++++++++ 3 files changed, 806 insertions(+), 0 deletions(-) create mode 100644 Documentation/git-remote-cvs.txt create mode 100755 git-remote-cvs.py diff --git a/Documentation/git-remote-cvs.txt b/Documentation/git-remote-cvs.txt new file mode 100644 index 0000000..783d542 --- /dev/null +++ b/Documentation/git-remote-cvs.txt @@ -0,0 +1,85 @@ +git-remote-cvs(1) +============== + +NAME +---- +git-remote-cvs - Helper program for interoperation with CVS repositories + +SYNOPSIS +-------- +'git remote-cvs' <remote> + +DESCRIPTION +----------- + +Please see the linkgit:git-remote-helpers[1] documentation for general +information about remote helper programs. + +CONFIGURATION +------------- + +remote.*.cvsRoot:: + The URL of the CVS repository (as found in a `CVSROOT` variable, or + in a `CVS/Root` file). + Example: "`:pserver:user@server/var/cvs/cvsroot`". + +remote.*.cvsModule:: + The path of the CVS module (as found in a `CVS/Repository` file) + within the CVS repository specified in `remote.*.cvsRoot`. + Example: "`foo/bar`" + +remote.*.cachedSymbolsOnly:: + When 'true', a cache of CVS symbols is used instead of querying the + CVS server for all existing symbols (potentially expensive). 
In this + mode, git-remote-cvs will not discover new CVS symbols unless you add + them explicitly with the "`addsymbol <symbol>`" command (on + git-remote-cvs's stdin), or request an explicit symbol cache update + from the CVS server with the "`syncsymbols`" command (on + git-remote-cvs's stdin). When 'false' (the default), the CVS server + will be queried whenever a list of CVS symbols is required. + +remote.*.usernameMap:: + The path (absolute, or relative to the repository (NOT the worktree)) + to the file that contains the mapping from CVS usernames to the + corresponding full names and email addresses, as used by Git in the + Author and Committer fields of commit objects. When this config + variable is set, CVS usernames will be resolved against this file. + If no match is found in the file, or if this config variable is unset, + or if the variable points to a non-existing file, the original CVS + username will be used as the Author/Committer name, and the + corresponding email address will be set to "`<username>@example.com`". ++ +The format of the usernameMap file is one entry per line, where each line is +of the form "`username: Full Name <email@address>`". +Example: `johndoe: John Doe <johndoe@xxxxxxxxxxx>` +Blank lines and lines starting with '#' are ignored. + +COMMANDS +-------- + +In addition to the commands that constitute the git-remote-helpers API, the +following extra commands are supported for managing the local symbol cache when +the `remote.*.cachedSymbolsOnly` config variable is true. The following +commands can be given on the standard input of git-remote-cvs: + +'addsymbol':: + Takes one CVS symbol name as argument. The given CVS symbol is + fetched from the CVS server and stored into the local CVS symbol + cache. If `remote.*.cachedSymbolsOnly` is enabled, this can be used + to introduce a new CVS symbol to the CVS helper application. 
+ +'syncsymbols':: + All CVS symbols that are available from the given remote are + fetched from the CVS server and stored into the local CVS symbol + cache. This is equivalent to disabling `remote.{asterisk}.cachedSymbolsOnly`, + running the "list" command, and then finally re-enabling the + `remote.*.cachedSymbolsOnly` config variable. I.e. this command can + be used to manually synchronize the CVS symbols available to the + CVS helper application. + +'verify':: + Takes one CVS symbol name as argument. Verifies that the CVS symbol + has been successfully imported by checking out the CVS symbol from + the CVS server, and comparing the CVS working tree against the Git + tree object identified by `refs/cvs/<remote>/<symbol>`. This can be + used to verify the correctness of a preceding 'import' command. diff --git a/Makefile b/Makefile index bb5cea2..b2af678 100644 --- a/Makefile +++ b/Makefile @@ -350,6 +350,8 @@ SCRIPT_PERL += git-relink.perl SCRIPT_PERL += git-send-email.perl SCRIPT_PERL += git-svn.perl +SCRIPT_PYTHON += git-remote-cvs.py + SCRIPTS = $(patsubst %.sh,%,$(SCRIPT_SH)) \ $(patsubst %.perl,%,$(SCRIPT_PERL)) \ $(patsubst %.py,%,$(SCRIPT_PYTHON)) \ @@ -1474,6 +1476,28 @@ $(patsubst %.perl,%,$(SCRIPT_PERL)) git-instaweb: % : unimplemented.sh mv $@+ $@ endif # NO_PERL +ifndef NO_PYTHON +$(patsubst %.py,%,$(SCRIPT_PYTHON)): % : %.py + $(QUIET_GEN)$(RM) $@ $@+ && \ + INSTLIBDIR=`MAKEFLAGS= $(MAKE) -C git_remote_cvs -s --no-print-directory instlibdir` && \ + sed -e '1{' \ + -e ' s|#!.*python|#!$(PYTHON_PATH_SQ)|' \ + -e '}' \ + -e 's|^import sys.*|&; sys.path.insert(0, "@@INSTLIBDIR@@")|' \ + -e 's|@@INSTLIBDIR@@|'"$$INSTLIBDIR"'|g' \ + $@.py >$@+ && \ + chmod +x $@+ && \ + mv $@+ $@ +else # NO_PYTHON +$(patsubst %.py,%,$(SCRIPT_PYTHON)): % : unimplemented.sh + $(QUIET_GEN)$(RM) $@ $@+ && \ + sed -e '1s|#!.*/sh|#!$(SHELL_PATH_SQ)|' \ + -e 's|@@REASON@@|NO_PYTHON=$(NO_PYTHON)|g' \ + unimplemented.sh >$@+ && \ + chmod +x $@+ && \ + mv $@+ $@ +endif # NO_PYTHON 
+ configure: configure.ac $(QUIET_GEN)$(RM) $@ $<+ && \ sed -e 's/@@GIT_VERSION@@/$(GIT_VERSION)/g' \ diff --git a/git-remote-cvs.py b/git-remote-cvs.py new file mode 100755 index 0000000..1720d4c --- /dev/null +++ b/git-remote-cvs.py @@ -0,0 +1,697 @@ +#!/usr/bin/env python + +"""Usage: git-remote-cvs <remote> [<url>] + +Git remote helper for interacting with CVS repositories + +See git-remote-helpers documentation for details on external interface, usage, +etc. See git-remote-cvs documentation for specific configuration details of +this remote helper. +""" + +# PRINCIPLES: +# ----------- +# - Importing same symbol twice (with no CVS changes in between) should yield +# the exact same Git state (and the second import should have no commits). +# - Importing several CVS symbols pointing to the same state should yield +# corresponding refs pointing to the _same_ commit in Git. +# - Importing a CVS symbol which has received only "regular commits" since +# last import should yield a fast-forward straight line of commits. + +# TODO / KNOWN PROBLEMS: +# ---------------------- +# - Remove cachedSymbolsOnly config variable for now? +# - Author map handling; mapping CVS usernames to Git full name + email address +# - Handle files that have been created AND deleted since the last import +# - How to handle CVS tags vs. CVS branches. Turn CVS tags into Git tags? +# - Better CVS branch handling: When a branch has a super/subset of files/revs +# compared to another branch, find a way to base one branch on the other +# instead of creating parallel lines of development with roughly the same +# commits. +# - Profiling, optimizations... 
+ +import sys, string, os + +from git_remote_cvs.util import * +from git_remote_cvs.cvs import * +from git_remote_cvs.git import * +from git_remote_cvs.cvs_symbol_cache import CvsSymbolCache +from git_remote_cvs.commit_states import CommitStates +from git_remote_cvs.cvs_revision_map import CvsRevisionMap, CvsStateMap +from git_remote_cvs.changeset import build_changesets_from_revs + +class Config (object): + # Author name/email tuple for commits created by this tool + Author = ("git remote-cvs", "invalid@xxxxxxxxxxx") + + # Git remote name + Remote = None + + # CVS symbols are imported into this refs namespace/directory + RefSpace = None + + # Git notes ref, the refname pointing to our git notes + NotesRef = None + + # CVS repository identifier, a 2-tuple (cvs_root, cvs_module), where + # cvs_root is the CVS server/repository URL (as found in $CVSROOT, or + # in a CVS/Root file), and cvs_module is the path to a CVS module + # relative to the CVS repository (as found in a CVS/Repository file) + CvsRepo = (None, None) + + # Path to the git-remote-cvs cache/work directory + # (normally "info/cvs/$remote" within $GIT_DIR) + WorkDir = None + + # If False, the list of CVS symbols will always be retrieved from the + # CVS server using 'cvs rlog'. If True, only the cached symbols within + # the "symbols" subdirectory of WorkDir are consulted. 
+ CachedSymbolsOnly = False + + @classmethod + def init (cls, remote): + """Fetch configuration parameters for the given remote""" + git_config = parse_git_config() + assert git_config["remote.%s.vcs" % (remote)] == "cvs" + + cls.Author = ( + git_config["user.name"], git_config["user.email"]) + cls.Remote = remote + cls.RefSpace = "refs/cvs/%s/" % (remote) + cls.NotesRef = "refs/notes/cvs/%s" % (remote) + cls.CvsRepo = ( + git_config["remote.%s.cvsroot" % (remote)], + git_config["remote.%s.cvsmodule" % (remote)]) + cls.WorkDir = os.path.join(get_git_dir(), "info/cvs", remote) + cls.CachedSymbolsOnly = git_config_bool(git_config.get( + "remote.%s.cachedsymbolsonly" % (remote), "false")) + +def work_path (*args): + """Return the given path appended to git-remote-cvs's cache/work dir""" + return os.path.join(Config.WorkDir, *args) + +def cvs_to_refname (cvsname): + """Return the git ref name for the given CVS symbolic name""" + if cvsname.startswith(Config.RefSpace): # Already converted + return cvsname + return Config.RefSpace + cvsname + +def ref_to_cvsname (refname): + """Return the CVS symbolic name for the given git ref name""" + if refname.startswith(Config.RefSpace): + return refname[len(Config.RefSpace):] + return refname + +def valid_cvs_symbol (symbol): + """Return True iff the given CVS symbol can be imported into Git""" + return valid_git_ref(cvs_to_refname(symbol)) + +def die_usage (msg, *args): + # Use this file's docstring as a usage string + print >>sys.stderr, __doc__ + die(msg, *args) + +def import_cvs_revs (symbol, prev_state, cur_state, progress): + """Import the CVS revisions involved in importing the given CVS symbol + + This method will determine the CVS revisions involved in moving from + the given prev_state to the given cur_state. This includes looking at + revision metadata in CVS, and importing needed blobs from CVS. + + The revision metadata is returned as a 2-level dict of CvsRev objects: + mapping path -> revnum -> CvsRev object. 
+ """ + + # Calculate the revisions involved in moving from prev_state to + # cur_state, and fetch CvsRev objects for these revisions. + progress.pushprefix("Importing CVS revisions: ") + paths = set(prev_state.paths()).union(cur_state.paths()) + num_fetched_revs = 0 # Number of CvsRev objects involved + num_imported_blobs = 0 # Number of blobs actually imported + cvs_revs = {} # path -> revnum -> CvsRev + for i, path in enumerate(sorted(paths)): + progress.pushprefix("(%i/%i) %s: " % (i + 1, len(paths), path)) + progress("") + prev_rev = prev_state.get(path) + cur_rev = cur_state.get(path) + if prev_rev and cur_rev and prev_rev == cur_rev: + # No changes since last import + progress.popprefix() + continue + + # Fetch CvsRev objects for range [path:prev_rev, path:symbol] + path_revs = fetch_revs(path, prev_rev, cur_rev, symbol, + Config.CvsRepo) + if not path_revs: + # Failed to find revs between prev_rev and symbol + if cur_rev: + assert not cur_rev.follows(prev_rev) + # The CVS symbol has been moved/reset since the + # last import in such a way that we cannot + # deduce the history between the last import + # and the current import. + # FIXME: Can we work around this? + die("CVS symbol %s has been moved/reset from" \ + " %s:%s to %s:%s since the last" \ + " import. This is not supported", + symbol, path, prev_rev, path, cur_rev) + else: + # CVS symbol has been removed from this path. + # We cannot conclusively determine the history + # of this path following prev_rev. + # FIXME: Can we work around this? + die("CVS symbol %s has been removed from %s" \ + " since the last import. This is not" \ + " supported", symbol, path) + + # OK. 
We've got the revs in range [prev_rev, symbol] + + # Verify/determine cur_rev + real_cur_rev = max(path_revs.keys()) + if cur_rev: assert cur_rev == real_cur_rev + else: cur_rev = real_cur_rev + + # No need to re-import prev_rev if already imported + if prev_rev: + assert cur_rev.follows(prev_rev) + assert prev_rev in path_revs + del path_revs[prev_rev] + + assert path_revs # There should be more revs than just prev_rev + + # Sanity checks: + # All revs from prev_rev to cur_rev are about to be imported + check_rev = cur_rev + while check_rev and check_rev != prev_rev: + assert check_rev in path_revs + check_rev = check_rev.parent() + # All previous revs have already been imported + check_rev = prev_rev + while check_rev: + assert Globals.CvsRevisionMap.has_rev(path, check_rev) + check_rev = check_rev.parent() + + # Import CVS revisions as Git blobs + j = 0 + for num, rev in sorted(path_revs.iteritems(), reverse = True): + j += 1 + progress("(%i/%i) %s" % (j, len(path_revs), num)) + assert num == rev.num + + # Skip if already imported + if Globals.CvsRevisionMap.has_rev(rev.path, rev.num): + continue + # ...or if rev is a deletion + elif rev.deleted: + continue + + # Import blob for reals + data = Globals.CvsWorkDir.get_revision_data( + rev.path, rev.num) + Globals.GitFastImport.comment( + "Importing CVS revision %s:%s" % ( + rev.path, rev.num)) + mark = Globals.GitFastImport.blob(data) + Globals.CvsRevisionMap.add_blob( + rev.path, rev.num, mark) + num_imported_blobs += 1 + + # Add path_revs to the overall structure of revs to be imported + assert path not in cvs_revs + cvs_revs[path] = path_revs + num_fetched_revs += len(path_revs) + + progress.popprefix() + + progress.popprefix() + progress("Imported %i blobs (reused %i existing blobs)" % ( + num_imported_blobs, num_fetched_revs - num_imported_blobs), + True) + + return cvs_revs + +def advance_state (state, changeset): + """Advance the given state by applying the given changeset""" + # Verify that the given 
changeset "fits" on top of the given state + for rev in changeset: + prev_num = rev.num.parent() + state_num = state.get(rev.path) + if prev_num is None and state_num is None: + # 'rev' is the first revision of this path being added + state.add(rev.path, rev.num) + elif prev_num and state_num and prev_num == state_num: + if rev.deleted: # rev deletes path from state + state.remove(rev.path, prev_num) + else: # rev follows state's revision of this path + state.replace(rev.path, rev.num) + else: + error("Cannot apply changeset with %s:%s on top of " \ + "CVS state with %s:%s.", + rev.path, changeset[rev.path].num, + rev.path, state.get(rev.path)) + error(" changeset: %s", changeset) + error(" CVS state: \n---\n%s---", state) + die("Failed to apply changeset. Aborting.") + +def revert_state (state, changeset): + """Revert the given state to _before_ the given changeset is applied + + This is the reverse of the above advance_state() function. + """ + for rev in changeset: + prev_num = rev.num.parent() + state_num = state.get(rev.path) + if state_num is None: # revert deletion of file + assert rev.deleted + state.add(rev.path, prev_num) + else: + assert state_num == rev.num + if prev_num is None: # revert addition of file + state.remove(rev.path, rev.num) + else: # regular revert to previous version + state.replace(rev.path, prev_num) + +def import_changesets (ref, changesets, from_state, to_state, progress): + """Apply the given list of Changeset objects to the given ref + + Also verify that the changesets bring us from the given from_state to + the given to_state. 
+ """ + state = from_state + for i, c in enumerate(changesets): + advance_state(state, c) + progress("(%i/%i) Committing %s" % (i + 1, len(changesets), c)) + # Make a git commit from changeset c + commitdata = GitFICommit( + c.author, # TODO: author_map handling + c.author + "@example.com", # TODO: author_map handling + c.date.ts, + c.date.tz_str(), + "".join(["%s\n" % (line) for line in c.message]), + ) + + for rev in c: + p, n = rev.path, rev.num + if rev.deleted: + commitdata.delete(p) + continue + blobname = Globals.CvsRevisionMap.get_blob(p, n) + mode = Globals.CvsRevisionMap.get_mode(p) + if mode is None: # Must retrieve mode from CVS checkout + debug("Retrieving mode info for '%s'" % (p)) + Globals.CvsWorkDir.update(n, [p]) + mode = Globals.CvsWorkDir.get_modeinfo([p])[p] + Globals.CvsRevisionMap.add_path(p, mode) + commitdata.modify(mode, blobname, p) + + commitname = Globals.GitFastImport.commit(ref, commitdata) + Globals.CommitStates.add( + commitname, state, Globals.GitFastImport) + for path, revnum in state: + Globals.CvsRevisionMap.add_commit( + path, revnum, commitname) + assert commitname in Globals.CvsStateMap.get_commits(state) + + assert state == to_state + return len(changesets) + +def import_cvs_symbol (cvs_symbol, progress): + """Import the given CVS symbol from CVS to Git + + Return False if nothing was imported, True otherwise. + """ + progress.pushprefix("%s: " % (cvs_symbol)) + + git_ref = cvs_to_refname(cvs_symbol) + + # Verify that we are asked to import valid git ref names + if not valid_git_ref(git_ref): + progress("Invalid git ref '%s'. Skipping." % (git_ref), True) + progress.popprefix() + return False + + # Retrieve previously imported CVS state + progress("Loading previously imported state of %s..." 
% (git_ref)) + prev_commit = Globals.GitRefMap.get(git_ref) + prev_state = Globals.CommitStates.get(prev_commit, CvsState()) + + # Retrieve current CVS state of symbol + # Also: At some point we will need mode information for all CVS paths + # (stored in CvsRevisionMap). This information can be added for each + # path on demand (using CvsWorkDir.get_modeinfo()), but doing so may + # be an expensive process. It is much cheaper to load mode information + # for as many paths as possible in a _single_ operation. We do this + # below, by calling CvsRevisionMap.sync_modeinfo_from_cvs() in + # appropriate places + if Config.CachedSymbolsOnly: + progress("Synchronizing local CVS symbol cache for symbol...") + # The symbol cache is likely not up-to-date. Synchronize the + # given CVS symbol explicitly, to make sure we get the version + # current with the CVS server. + Globals.CvsSymbolCache.sync_symbol( + cvs_symbol, Globals.CvsWorkDir, progress) + + # The above method updates the CVS workdir to the current CVS + # version. Hence, now is a convenient time to preload mode info + # from the currently checked-out CVS files. There may be more + # files for which we'll need mode information, but we'll deal + # with those when needed. + progress("Updating path mode info from current CVS checkout.") + Globals.CvsRevisionMap.sync_modeinfo_from_cvs( + Globals.CvsWorkDir) + elif not Globals.CvsRevisionMap: # There is no info for any paths, yet + # Pure optimization: We didn't get to preload all the mode info + # above. Normally, the only alternative is to load mode info for + # each path on-demand. However, if our CvsRevisionMap is + # currently empty, that's probably going to be very expensive. + # Therefore, in this case, do an explicit CVS update here, and + # preload mode info for all paths. 
+ progress("Updating CVS checkout to sync path mode info.") + Globals.CvsWorkDir.update(cvs_symbol) + Globals.CvsRevisionMap.sync_modeinfo_from_cvs( + Globals.CvsWorkDir) + + progress("Loading current CVS state...") + try: cur_state = Globals.CvsSymbolCache[cvs_symbol] + except KeyError: + progress("Couldn't find symbol '%s'. Skipping." % (cvs_symbol), + True) + progress.popprefix() + return False + + # Optimization: Check if the previous import of this symbol is still + # up-to-date. If so, there's nothing more to be done. + progress("Checking if we're already up-to-date...") + if cur_state == prev_state: + progress("Already up-to-date. Skipping.", True) + progress.popprefix() + return False + + progress("Fetching CVS revisions...") + cvs_revs = import_cvs_revs(cvs_symbol, prev_state, cur_state, progress) + + # Organize CvsRevs into a chronological list of changesets + progress("Organizing revisions into changesets...") + changesets = build_changesets_from_revs(cvs_revs) + + # When importing a new branch, try to optimize branch start point, + # instead of importing entire branch from scratch + if prev_commit is None: + progress("Finding startpoint for new symbol...") + i = len(changesets) + state = cur_state.copy() + for c in reversed(changesets): + commit = Globals.CvsStateMap.get_exact_commit( + state, Globals.CommitStates) + if commit is not None: + # We have found a commit that exactly matches the state after commit #i (changesets[i - 1]) + Globals.GitFastImport.reset(git_ref, commit) + changesets = changesets[i:] + break + revert_state(state, c) + i -= 1 + + num_changesets = len(changesets) + num_applied = 0 + + + # Apply changesets, bringing git_ref from prev_state to cur_state + if num_changesets: + progress("Importing changesets...") + num_applied = import_changesets(git_ref, changesets, prev_state, + cur_state, progress) + + progress("Imported %i changesets (reused %i existing changesets)" % ( + num_applied, num_changesets - num_applied), True) + 
progress.popprefix() + return True + +def do_import (*args): + """Do the 'import' command; import refs from a remote""" + if not args: die_usage("'import' takes at least one parameter: ref...") + + progress = ProgressIndicator(" ", sys.stderr) + + cvs_symbols = map(ref_to_cvsname, args) + empty_import = True + + for symbol in cvs_symbols: + if import_cvs_symbol(symbol, progress): + empty_import = False + + if empty_import: + progress.finish("Everything up-to-date", True) + return 0 + + progress.finish("Finished importing %i CVS symbols to Git" % ( + len(cvs_symbols)), True) + return 0 + +def do_list (*args): + """Do the 'list' command; list refs available from a CVS remote""" + if args: die_usage("'list' takes no parameters") + + progress = ProgressIndicator(" ", sys.stderr) + + if Config.CachedSymbolsOnly: + progress("Listing symbols in local symbol cache...", True) + for symbol in sorted(Globals.CvsSymbolCache): + print cvs_to_refname(symbol) + progress.finish() + print # terminate output with blank line + return 0 + + # Synchronize local symbol cache with CVS server + progress("Synchronizing local symbol cache with CVS server...") + Globals.CvsSymbolCache.sync_all_symbols(Config.CvsRepo, progress, + valid_cvs_symbol) + + # Load current states of Git refs + progress("Loading current state of Git refs...") + changed, unchanged = 0, 0 + for cvs_symbol, cvs_state in sorted(Globals.CvsSymbolCache.items()): + git_ref = cvs_to_refname(cvs_symbol) + progress("\tChecking if Git ref is up-to-date: %s" % (git_ref)) + git_commit = Globals.GitRefMap.get(git_ref) + git_state = Globals.CommitStates.get(git_commit) + attrs = "" + if git_state and git_state == cvs_state: + attrs = " unchanged" + unchanged += 1 + else: + git_commit = "?" 
+ changed += 1 + print "%s %s%s" % (git_commit, git_ref, attrs) + + progress.finish("Found %i CVS symbols (%i changed, %i unchanged)" % ( + changed + unchanged, changed, unchanged)) + print # terminate with blank line + return 0 + +def do_capabilities (*args): + """Do the 'capabilities' command; report supported features""" + if args: die_usage("'capabilities' takes no parameters") + print "import" + print "marks %s" % (work_path("marks")) +# print "export" +# print "export-branch" +# print "export-merges" + print # terminate with blank line + return 0 + +def do_addsymbol (*args): + """Do the 'addsymbol' command; add given CVS symbol to local cache""" + if len(args) != 1: die_usage("'addsymbol' takes one parameter: symbol") + symbol = args[0] + + progress = ProgressIndicator(" ", sys.stderr) + if valid_cvs_symbol(symbol): + Globals.CvsSymbolCache.sync_symbol( + symbol, Globals.CvsWorkDir, progress) + progress.finish("Added '%s' to CVS symbol cache" % (symbol), + True) + else: + error("Skipping CVS symbol '%s'; it is not a valid git ref", + symbol) + + print # terminate with blank line + return 0 + +def do_syncsymbols (*args): + """Do the 'syncsymbols' command; sync all symbols with CVS server""" + if args: die_usage("'syncsymbols' takes no parameters") + progress = ProgressIndicator(" ", sys.stderr) + Globals.CvsSymbolCache.sync_all_symbols(Config.CvsRepo, progress, + valid_cvs_symbol) + progress.finish() + print # terminate with blank line + return 0 + +def do_verify (*args): + """Do the 'verify' command; Compare CVS checkout and Git tree""" + if len(args) != 1: die_usage("'verify' takes one parameter: symbol") + symbol = args[0] + gitref = cvs_to_refname(symbol) + + progress = ProgressIndicator(" ", sys.stderr) + assert valid_cvs_symbol(symbol) + + progress("Checking out '%s' from CVS..." 
% (symbol)) + Globals.CvsWorkDir.update(symbol) + + add_env = {"GIT_INDEX_FILE": os.path.abspath(work_path("temp_index"))} + progress("Creating Git index from tree object @ '%s'..." % (gitref)) + cmd = ("git", "read-tree", gitref) + assert run_command(cmd, add_env = add_env)[0] == 0 + + progress("Comparing CVS checkout to Git index...", True) + cmd = ("git", "--work-tree=%s" % (os.path.abspath(work_path("cvs"))), + "ls-files", + "--exclude=CVS", "--deleted", "--modified", "--others", "-t") + exit_code, output, errors = run_command(cmd, add_env = add_env) + assert exit_code == 0 and not errors + + if output: + progress.finish("Failed verification of '%s'" % (symbol), True) + error("The '%s' command returned:\n---\n%s---", " ".join(cmd), + output) + else: + progress.finish("Successfully verified '%s'" % (symbol), True) + + print # terminate with blank line + return exit_code + +def not_implemented (*args): + die_usage("Command not implemented") + +Commands = { + "capabilities": do_capabilities, + "list": do_list, + # Special handling of 'import' in main() + # "import": do_import, + "export": not_implemented, + # Custom commands + "addsymbol": do_addsymbol, + "syncsymbols": do_syncsymbols, + "verify": do_verify, +} + +class Globals (object): + """Global variables are placed here at the start of main()""" + pass + +def main (*args): + debug("Invoked '%s'", " ".join(args)) + + ### Initialization of subsystems + + # Read config for the given remote + assert len(args) >= 2 + Config.init(args[1]) + + # Local CVS symbol cache (CVS symbol -> CVS state mapping) + Globals.CvsSymbolCache = CvsSymbolCache(work_path("symbols")) + + # Local CVS checkout + Globals.CvsWorkDir = CvsWorkDir(work_path("cvs"), Config.CvsRepo) + + # Interface to 'git cat-file --batch' + Globals.GitObjectFetcher = GitObjectFetcher() + + # Interface to Git object notes + Globals.GitNotes = GitNotes(Config.NotesRef, Globals.GitObjectFetcher) + + # Mapping from Git commit objects to CVS states + 
Globals.CommitStates = CommitStates(Globals.GitNotes) + + # Mapping from Git ref names to Git object names + Globals.GitRefMap = GitRefMap(Globals.GitObjectFetcher) + + # Mapping from CVS revision to Git blob and commit objects + Globals.CvsRevisionMap = CvsRevisionMap( + cvs_to_refname("_metadata"), Globals.GitObjectFetcher) + last_mark = 0 + if Globals.CvsRevisionMap.has_unresolved_marks(): + # Update with marks from last import + last_mark = Globals.CvsRevisionMap.load_marks_file( + work_path("marks")) + else: + # Truncate marks file. We cannot automatically do this after + # .load_marks_file() above, since we cannot yet guarantee that + # we will be able to save the revision map persistently. (That + # can only happen if we are given one or more import commands + # below.) We can only truncate this file when we know there are + # no unresolved marks in the revision map. + open(work_path("marks"), "w").close() + + # Mapping from CVS states to commit objects that contain said state + Globals.CvsStateMap = CvsStateMap(Globals.CvsRevisionMap) + + ### Main program loop + + import_refs = [] # accumulate import commands here + # cannot use "for line in sys.stdin" for buffering (?) 
reasons + line = sys.stdin.readline() + while (line): + cmdline = line.strip().split() + if not cmdline: break # blank line means we're about to quit + + debug("Got command '%s'", " ".join(cmdline)) + cmd = cmdline.pop(0) + + if cmd == "import": + import_refs.extend(cmdline) + else: + if cmd not in Commands: + die_usage("Unknown command '%s'", cmd) + if Commands[cmd](*cmdline): + die("Command '%s' failed", line.strip()) + sys.stdout.flush() + line = sys.stdin.readline() + + ret = 0 + if import_refs: # trigger import processing after last import command + # Init producer of output in the git-fast-import format + Globals.GitFastImport = GitFastImport( + sys.stdout, Globals.GitObjectFetcher, last_mark) + + # Perform import of given refs + ret = do_import(*import_refs) + + ### Notes on persistent storage of subsystems' data structures: + # + # Because the "import" command has been called, we here _know_ + # that there is a fast-import process running in parallel. + # (This is NOT the case when there are no "import" commands). + # We can therefore now (and only now) safely commit the extra + # information that we store in the Git repo. + # In other words, the data structures that we commit to + # persistent storage with the following calls will NOT be + # committed if there are no "import" commands. The data + # structures must handle this in one of two ways: + # - In the no-"import" scenario, there is simply nothing to + # commit, so it can safely be skipped. + # - Any information that should have been committed in the + # no-"import" scenario can be reconstructed repeatedly in + # subsequent executions of this program, until the next + # invocation of an "import" command provides an opportunity + # to commit the data structure to persistent storage. 
+ + # Write out commit notes (mapping git commits to CvsStates) + # The following call would be a no-op in the no-"import" case + Globals.GitNotes.commit_notes( + Globals.GitFastImport, Config.Author, + 'Annotate commits imported by "git remote-cvs"\n') + + # Save CVS revision metadata + # This data structure can handle the no-"import" case as long + # as the marks file from the last fast-import run is still + # present upon the next execution of this program. + Globals.CvsRevisionMap.commit_map( + Globals.GitFastImport, Config.Author, + 'Updated metadata used by "git remote-cvs"\n') + + return ret + +if __name__ == '__main__': + sys.exit(main(*sys.argv)) -- 1.6.4.rc3.138.ga6b98.dirty
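
The `remote.*.usernameMap` format documented in Documentation/git-remote-cvs.txt ("`username: Full Name <email@address>`", blank lines and '#' lines ignored, fallback to "`<username>@example.com`") is not yet wired into the helper; the patch lists author map handling as a TODO. A minimal sketch of how such a file could be parsed is given here. The names `parse_username_map` and `resolve_author` are hypothetical and do not exist in the patch. The sketch targets Python 3, whereas the helper itself is Python 2. Skipping malformed lines and ignoring leading whitespace before '#' are assumptions, not documented behaviour.

```python
import re

# Hypothetical helpers, not part of the patch: the patch leaves
# author map handling as a TODO in import_changesets().

# One entry per line: "username: Full Name <email@address>"
_ENTRY = re.compile(
    r"^(?P<user>[^:]+):\s*(?P<name>.*?)\s*<(?P<email>[^<>]*)>\s*$")


def parse_username_map(lines):
    """Return a dict mapping CVS username -> (full name, email)."""
    mapping = {}
    for line in lines:
        line = line.strip()
        # Blank lines and lines starting with '#' are ignored.
        if not line or line.startswith("#"):
            continue
        match = _ENTRY.match(line)
        if match is None:
            # Assumption: malformed entries are skipped silently.
            continue
        user = match.group("user").strip()
        mapping[user] = (match.group("name"), match.group("email"))
    return mapping


def resolve_author(username, mapping):
    """Return (name, email) for a CVS username.

    Falls back to the documented default when there is no match:
    the CVS username as the name, and "<username>@example.com".
    """
    return mapping.get(username, (username, "%s@example.com" % username))
```

With a map file containing the single entry `johndoe: John Doe <johndoe@example.org>`, `resolve_author("johndoe", mapping)` would return the mapped name and address, while an unmapped username such as `jane` would fall back to `("jane", "jane@example.com")`.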