Re: [PATCH 1/3] Add --blob-filter option to filter-branch.

Hi,

On Wed, 23 Apr 2008, Avery Pennarun wrote:

> From: Jeff King <peff@xxxxxxxx>
> 
> On Tue, Apr 22, 2008 at 12:51:14PM -0400, Avery Pennarun wrote:
> 
> > Do you think git would benefit from having a generalized version of
> > this script?  Basically, the user provides a "munge" script on the
> > command line, and there's a git-filter-branch mode for auto-munging
> > (with a cache) every file in every checkin.  Even if it's *only* ever
> > used for CRLF, I can imagine this being useful to a lot of people.
> 
> It was easy enough to work up the patch below, which allows
> 
>   git filter-branch --blob-filter 'tr a-z A-Z'
> 
> However, it's _still_ horribly slow. Shell script is nice and flexible,
> but running a tight loop like this is just painful. I suspect
> filter-branch in something like perl would be a lot faster and just as
> flexible (you could even do it in C, but you'd probably have to invent a
> little domain-specific scripting language).
> 
> It is still much better performance than a tree filter, though:
> 
>   $ cd git && time git filter-branch --tree-filter '
>       find . -type f | while read f; do
>         tr a-z A-Z <"$f" >tmp
>         mv tmp "$f"
>       done
>     ' HEAD~10..HEAD
> 
>   real    4m38.626s
>   user    1m32.726s
>   sys     2m51.163s
> 
>   $ cd git && time git filter-branch --blob-filter 'tr a-z A-Z' HEAD~10..HEAD
>   real    1m40.809s
>   user    0m36.822s
>   sys     1m14.273s
> 
> Lots of system time in both. I'm sure we spend a fair bit of time
> hitting our very large map and blob-cache directories, which would be
> much more nicely implemented as associative arrays in memory (if we were
> using a more featureful language).
> 
> Anyway, here is the patch. I don't know if it is even worth applying,
> since it is still painfully slow.

Not all of this belongs in the commit message.

> Acked-by: Johannes Schindelin <johannes.schindelin@xxxxxx>

This does.

A good general rule is: if you think it would be funny/strange to read 
this message in the output of "git log", it should be changed.

Ciao,
Dscho