Re: Fast access git-rev-list output: some OS knowledge required

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

On Thu, 7 Dec 2006, Andreas Ericsson wrote:

> Johannes Schindelin wrote:
> > Hi,
> > 
> > On Thu, 7 Dec 2006, Andreas Ericsson wrote:
> > 
> > > Shawn Pearce wrote:
> > > > Perhaps there is some fast IPC API supported by Qt that you could use to
> > > > run the revision listing outside of the main UI process, to eliminate
> > > > the bottlenecks you are seeing and remove the problems noted above?  One
> > > > that doesn't involve reading from a pipe I mean...
> > > > 
> > > Why not just fork() + exec() and read from the filedescriptor? You can up
> > > the output buffer of the forked program to something suitable, which means
> > > the OS will cache it for you until you copy it to a buffer in qgit (i.e.,
> > > read from the descriptor).
> > 
> > Could somebody remind me why different processes are needed? I thought that
> > the revision machinery should be used directly, by linking to libgit.a...
> > 
> 
> You wrote:
> --%<--%<--%<--
> Because, depending on what you do, the revision machinery is not
> reentrable. For example, if you filter by filename, the history is
> rewritten in-memory to simulate a history where just that filename was
> tracked, and nothing else. These changes are not cleaned up after calling the
> internal revision machinery.
> --%<--%<--%<--
> 
> When I wrote the above suggestion, I hadn't read the posts following the 
> email where I cut this text from (where Linus said "we can add a 'reset' 
> thingie to the revision walking machinery" and Marco replied with some 
> more questions).

Yes. The reset thingie is already in place: clear_commit_marks(). It would 
have to be enhanced a little, though:

1) the function rewrite_parents(), should add another flag, HALFORPHANED, 
   and
2) clear_commit_marks() should unset the "parsed" flag of the commits for 
   which HALFORPHANED is reset.

-- snip --
diff --git a/commit.c b/commit.c
index d5103cd..fd225c8 100644
--- a/commit.c
+++ b/commit.c
@@ -431,6 +431,10 @@ void clear_commit_marks(struct commit *commit, unsigned int mark)
 {
 	struct commit_list *parents;
 
+	/* were parents rewritten? */
+	if ((mark & commit->object.flags) & HALFORPHANED)
+		commit->object.parsed = 0;
+
 	commit->object.flags &= ~mark;
 	parents = commit->parents;
 	while (parents) {
diff --git a/revision.c b/revision.c
index 993bb66..461ee06 100644
--- a/revision.c
+++ b/revision.c
@@ -1097,6 +1097,7 @@ static void rewrite_parents(struct rev_info *revs, struct commit *commit)
 		struct commit_list *parent = *pp;
 		if (rewrite_one(revs, &parent->item) < 0) {
 			*pp = parent->next;
+			commit->object.flags |= HALFORPHANED;
 			continue;
 		}
 		pp = &parent->next;
diff --git a/revision.h b/revision.h
index 3adab95..544238c 100644
--- a/revision.h
+++ b/revision.h
@@ -9,6 +9,7 @@
 #define BOUNDARY	(1u<<5)
 #define BOUNDARY_SHOW	(1u<<6)
 #define ADDED		(1u<<7)	/* Parents already parsed and added? */
+#define HALFORPHANED	(1u<<8) /* parents were rewritten */
 
 struct rev_info;
 struct log_info;
-- snap --

Note that this is just the idea. This particular implementation opens a 
gaping memory leak, since the buffer of the commit is not free()d, and a 
reparse would probably not pick up on the fact that the parent commits are 
already in memory.

Ciao,
Dscho

-
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]