Re: [HELP] Adding git awareness to the darning patch management system.

Peter Williams <peter_ono@xxxxxxxxxxxxxxxxxxxxx> · Fri, 02 Dec 2011 09:32:50 +1000

On 01/12/11 16:27, Jeff King wrote:
On Thu, Dec 01, 2011 at 10:56:59AM +1000, Peter Williams wrote:

I'm not exactly sure what this means.

If you look at the screenshots at sourceforge (which were produced on
top of a Mercurial repo) you'll notice that file names in the left
most tree have letters in front of them and appear in different
foreground colours.  These letters are the same as those returned by
Mercurial's status command and, hence, give a Mercurial user an easy
to understand snapshot of the status of the files in the playground.
The colour coding is (relatively) arbitrary (and chosen by me) and is
intended to make it easier to detect the different file statuses.

My main problem is that I can't find a git file status command (and
there are a lot of them to choose from) that gives a snapshot of the
statuses of all files in a directory (including those not tracked or
ignored).

Thanks, that helps. You probably just want to use "git status
--porcelain", which will show you the state of file modification with
respect to the index and the prior commit, as well as any untracked
files. See the "porcelain format" section in "git help status".

Note that "git status" will not print files which are not modified. You
may want to also run "git ls-files" to get the full listing of files,
including unmodified ones.

A secondary problem is that, if I could cobble together statuses from
various commands, mapping git statuses to the Mercurial ones for
display would not be a good solution as they would not necessarily
make sense to a git user.  (It's fairly clear to me from my inability
to make sense of git's CLI that git users think differently to me, a
Mercurial user, and it's unlikely that I can, without help, make a
file tree display that makes sense to a git user.)

I'm hoping that "git status --porcelain" will give you a fairly close
mapping of the basic "what happened to this file" concept, based on what
I see in the second screenshot you mentioned.

The trickiest thing is the index, which represents an in-between state
that is not usually exposed by other version control systems. If your
tool does not make use of the index, then it probably makes sense to
just consider a path as modified if it has modifications staged in the
index or in the working tree, which maps to other VCS's idea of
"modified" (because for them, marking something as to-be-committed and
commiting it are part of the same step).

Yes, I think your right.  For most of my purposes, I think that it's 
irrelevant whether a change is staged or not and the choices that I 
offer allow the user to do what he thinks is right for a file with 
changes that are staged but uncommitted.  For me to automatically do 
something based on whether the file was staged for a commit would be a 
mistake as I would be reducing the user's options.

However, the distinction might be worth making in the file tree display 
to remind the user what's staged and what's not?

For this, you probably want "git diff-files --name-only", which will
show files with differences in the working tree. Keep in mind that git
has an "index" or "staging area", which means that you have three states
of content for a given path:

   1. the state of the prior commit (i.e., HEAD)

   2. the state that is marked to be committed when "git commit" is run
      (i.e., the index)

   3. the state in the working tree

This is a prime example of the different mindset of the git user to
the hg user.

You don't have to use those features, of course. It's just that
something like "git status" is going to report on the differences
between those states, so as a tool writer you need to know they are
there (and as I said above, you are free to simplify if it fits into the
mental model of your tool).

You can compare the first two with "git diff-index", and the latter two
with "git diff-files". You can also use "git status --porcelain" to get
a machine-readable output that shows how the three states match up, with
one line per file.

This is an example of why I'm confused.  There are too many ways to
do (similar) things and it's hard to know which to use.

Git is made of little building blocks. The original way to see the
differences between the index and the working tree was via diff-files.
But then people build bigger building blocks out of the smaller ones.
"git status" is really just a shorthand for:

   git diff-index HEAD&&
   git diff-files&&
   git ls-files -o

and is in fact implemented using those building blocks (originally as a
shell script, though these days it is written in C). So you can choose
either and get the same information. Choosing a higher-level building
block may save you some work, if the abstraction matches what you want.
Otherwise, you can compose what you want from the lower levels.

I know it sometimes leads to an overwhelming number of commands, and I'm
not trying to excuse git's tendency to confuse people. I'm just hoping
to unconfuse you in this particular situation.

As an aside, I found it easier to delve into git's innards to find out 
how to implement git binary patches than I did finding out how to do 
things from the CLI :-).

In your case, I think "status" is the most convenient level of
abstraction for you, because you are interesting in looking at
differences to both the index and HEAD (i.e., the prior commit). But if
you find as you implement that want more flexibility, you can switch to
using the lower-level commands yourself.

I'll investigate this approach.  How easy is it to distinguish low level 
commands from high level commands?

Maybe an example of why I think the feature is useful might help.
Say that you start editing a file and then decide that you want to
put this change into a patch rather than committing it.  If you were
using quilt you would have to do this manually by any of a number or
ways such as:

$<git diff command>  file>  temp.patch
$<git revert command>  file
$ quilt new one.patch
$ quilt add file
$ patch -p1 file<  temp.patch
$ rm temp.patch

In darning, you just do:

$ darn new one.patch
$ darn add --absorb file

Sure. We have stgit and topgit, which do similar patch management things
on top of git. I don't personally user either, though, so I don't have
much to say on how they compare to darning, or whether it is worth
looking at their implementations.

And there's MQ on top of hg.  I find the idea of doing "temporary" 
commits (which is what these tools are essentially doing) a little risky 
(e.g. what happens if you do a push with temporary commits in place). 
With MQ, I use its hook system to prevent this happening and I imagine 
git provides something similar.

Of course, these tools have the advantage that it's easier to promote a 
patch to a full blown commit than it is for quilt or darning in its 
current form (I'm thinking about how to do this).  At this stage, 
darning is targeted at the user who has to maintain a set of patches on 
top of a third party source tree without the need to eventually commit 
the changes themselves (i.e. distribution managers).

I've never used stgit or topgit but I have used both quilt and MQ a lot. 
 I find them both quite usable but each with their own set of 
advantages and disadvantages hence my attempt to make a tool as much 
like them as possible but with a smaller set of disadvantages.

The interface to the SCM to support this is two functions:

1: get_files_with_uncommitted_changes() which called with no
arguments returns a list of the paths of all files with uncommitted
changes or when given a list of file paths (the more common case)
returns the subset of that list which have uncommitted changes; and

"status" will do this for you, modulo the simplification of the concept
of the index, as we discussed above.

2. copy_clean_version_to(filepath, target_path) which makes a copy of
the file as recorded in the prior commit and places it at the
target_path (usually where darning stores the "original" for
reference when creating diffs).

You probably want:

   git cat-file blob HEAD:filepath>target_path

I think I might do this in two stages.  First, just do the bit about 
adding files and pushing (as that is the most useful) and leave the file 
tree as a vanilla tree for the time being (as it looks like it may be 
more complicated).

Thanks for your help,
Peter
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html