Re: Print last time and committer a file was touched by for a whole repo

On Fri, 2 July 2010, Tim Visher wrote:
> Thanks everyone who responded.  I ended up doing
> 
>     find . -path "./.git*" -prune -o -print -exec git log -n 1 -- '{}' \; > assets.txt
> 
> Little roundabout but seems effective.

Sidenote: you might want to use '--follow' in place of '--', in the rare
case that you hit a file rename (or copy).

See also: https://git.wiki.kernel.org/index.php/ExampleScripts#Finding_which_commits_last_touched_the_files
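
For illustration, the per-file approach can be sketched roughly as below.
This is only a sketch under a few assumptions: the function name
last_touched is made up for this example, it expects a POSIX shell started
at the top of the working tree, and it does not handle paths containing
unusual characters (which 'git ls-tree' prints in quoted form).

```shell
# Print one tab-separated line per file tracked in HEAD:
#   path, committer date, committer name, abbreviated commit hash
# of the last commit that touched the file (following renames).
# NOTE: "last_touched" is a hypothetical helper name, not a git command.
last_touched() {
    git ls-tree -r --name-only HEAD |
    while IFS= read -r path; do
        # %ci = committer date, %cn = committer name, %h = short hash,
        # %x09 = literal tab
        info=$(git log -1 --follow --format='%ci%x09%cn%x09%h' -- "$path")
        printf '%s\t%s\n' "$path" "$info"
    done
}
```

Running 'last_touched > assets.txt' inside a repository writes the listing
to a file. Starting from 'git ls-tree' rather than 'find' means untracked
files and the .git directory never enter the list.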

> On Thu, Jul 1, 2010 at 4:12 PM, Jakub Narebski <jnareb@xxxxxxxxx> wrote:
>> Tim Visher <tim.visher@xxxxxxxxx> writes:
>>
>>> I need to get a listing of the entire contents of my current repo (as
>>> in, I don't need deleted files or anything like that, just the current
>>> snapshot) with the time the file was committed and who committed it.
>>>
>>> Thoughts on how to do that?
>>
>> There does not exist a single git command that would do what you want.
>> You would need to use 'git log -1 --follow' for each file in current
>> snapshot ('git ls-tree -r HEAD').  IIRC there is some example how to
>> do that in GitFaq or GitTips on git wiki (http://git.wiki.kernel.org).
>>
>> Perhaps in the future 'git blame <directory>' would provide such
>> output, or its equivalent (tree blame).
> 
> That'd be cool.

I am currently working on a prototype in Perl, using 'git cat-file --batch'
and 'git diff-tree --stdin', as I don't know the git C code/API well enough
to write it in C; the plan is to convert it to C once the proof of concept
works.
 
>> By the way, what do you need this for?  Git versions whole project at
>> once, not individual files.  Is it some legacy from CVS?
> 
> Ummm...  Little embarrassing but this is apparently a requirement for
> my company.  Every few years they ask for a 'listing of all software
> assets, when they were last touched, who last touched them, and what
> version of software they were touched for.'  The generous assumption is
> that they're probing us for how effectively we can lay our hands on
> this information.  Cynics would say that someone somewhere decided one
> day that it would be a good idea to have an __Excel Spreadsheet__
> (yep, that's what it goes into) listing every file that every software
> project everywhere in the company has, and that now people do it
> because it's on a check list.
> 
> Anywho... Hooray for `find -exec`.

Why _file_ granularity, rather than _project_ (repository)
granularity?  Unless you have a post-CVS / post-Subversion mega-repo
containing all projects squashed together (yuck!).

IMHVO a better solution would be to list, for each repository/(sub)project,
the date of the last commit on the master branch (when it was last touched),
the date of the last signed tag / tagged release (when it was last
released), and shortlog-, blame-, or diffstat-based statistics of code
authorship (a replacement for 'who last touched them').  Note that any
code metric / software kwalitee metric is subject to abuse (numerous
examples can be found at TheDailyWTF, and IIRC Joel Spolsky and Jeff
Atwood both described such dangers on their blogs).
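
A per-repository summary along those lines might look something like the
sketch below. The function name repo_summary and the output wording are
invented for this example. It uses plain 'git describe', which only
considers annotated (including signed) tags; add '--tags' if lightweight
tags mark your releases. The date shown for a tag is that of the tagged
commit, not of the tag object itself.

```shell
# Summarize one repository: last commit date on a branch, most recent
# annotated tag reachable from it, and commit counts per author.
# NOTE: "repo_summary" is a hypothetical helper name, not a git command.
repo_summary() {
    branch=${1:-master}

    printf 'last commit on %s: %s\n' "$branch" \
        "$(git log -1 --format=%ci "$branch")"

    # git describe fails when no annotated tag is reachable from the branch
    tag=$(git describe --abbrev=0 "$branch" 2>/dev/null) || tag=
    if [ -n "$tag" ]; then
        printf 'last tagged release: %s (%s)\n' "$tag" \
            "$(git log -1 --format=%ci "$tag")"
    else
        printf 'last tagged release: none\n'
    fi

    printf 'commits per author:\n'
    git shortlog -s -n "$branch"
}
```

Call it as 'repo_summary' (for master) or 'repo_summary some-branch' from
inside each repository. The shortlog counts are only one of the authorship
statistics mentioned above, and carry the same caveats about abuse.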

-- 
Jakub Narebski
Poland