Glad to see you tackling this. This is definitely a step in the right
direction.
I realize that it will take a lot of work and that intermediate steps
may just be pushing the global state one level higher, but eventually
it would be great to see an entire code path free of global state!
I'm personally interested because reducing the reliance on global state
also helps our performance work: it makes it more feasible to use
threading to scale up performance.
Ben
On 5/18/2017 7:21 PM, Brandon Williams wrote:
When I first started working on the git project I found it very difficult to
understand parts of the code base because of the inherently global nature of
our code. It also made working on submodules very difficult. Since we can
only open up a single repository per process, you need to launch a child
process in order to process a submodule. But you also need to be able to
communicate other stateful information to the child processes so that the
submodules know how best to format their output or match against a
pathspec... It ends up feeling like layering on hack after hack. What I would
really like is the ability to have a repository object so that I can open a
submodule in-process.
Before this becomes a reality for all commands, much of the library code would
need to be refactored in order to work purely on handles instead of global
state. As it turned out, ls-files is a pretty simple command and doesn't have
*too* many dependencies. The biggest thing that needed to be changed was
piping through an index into a couple library routines so that they don't
inherently rely on 'the_index'. A few of these changes I've sent out and can
be found at 'origin/bw/pathspec-sans-the-index' and
'origin/bw/dir-c-stops-relying-on-the-index' which this series is based on.
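As a rough illustration of what "piping through an index" can look like, here
is a minimal sketch. It is not code from the series: 'struct toy_index',
'toy_the_index' and both functions are hypothetical, heavily simplified
stand-ins, assuming an index is just an array of paths.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily simplified stand-in for an index structure. */
struct toy_index {
	const char **paths;	/* entry paths */
	size_t nr;		/* number of entries */
};

/* Hypothetical process-wide global, playing the role of 'the_index'. */
struct toy_index toy_the_index;

/* Before: the routine silently depends on the global. */
size_t count_under_prefix_global(const char *prefix)
{
	size_t i, n = 0, len = strlen(prefix);

	for (i = 0; i < toy_the_index.nr; i++)
		if (!strncmp(toy_the_index.paths[i], prefix, len))
			n++;
	return n;
}

/* After: the caller decides which index is inspected. */
size_t count_under_prefix(const struct toy_index *istate, const char *prefix)
{
	size_t i, n = 0, len = strlen(prefix);

	assert(istate != NULL);
	for (i = 0; i < istate->nr; i++)
		if (!strncmp(istate->paths[i], prefix, len))
			n++;
	return n;
}
```

The second form can be called on several indexes in one process, which the
first form cannot do without swapping the global around.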
Patches 1-16 are refactorings to prepare either library code or ls-files itself
to be ready to handle passing around an index struct. Patches 17-22 introduce
a repository struct and change a couple of things about how submodule caches
work (getting submodule information from .gitmodules). And Patch 23 converts
ls-files to use a repository struct.
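To sketch why a repository handle helps here, the toy example below lists a
superproject and a submodule from the same process. Everything in it is an
assumption made for illustration: 'struct toy_repo', 'struct toy_index' and
'toy_ls_files' are hypothetical names and do not reflect the contents of
repo.h in this series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified index: just a list of paths. */
struct toy_index {
	const char **paths;
	size_t nr;
};

/* Hypothetical repository handle: per-repository state hangs off it. */
struct toy_repo {
	const char *gitdir;	/* where this repository's metadata lives */
	struct toy_index index;	/* this repository's own index */
};

/*
 * Append "<prefix><path>\n" for each index entry of 'repo' to the
 * NUL-terminated string in 'out' (capacity 'outsz'). Output is truncated
 * if 'out' fills up. Returns the number of entries visited.
 */
size_t toy_ls_files(const struct toy_repo *repo, const char *prefix,
		    char *out, size_t outsz)
{
	size_t i, used;

	assert(repo != NULL && out != NULL && outsz > 0);
	used = strlen(out);
	for (i = 0; i < repo->index.nr && used < outsz - 1; i++) {
		int n = snprintf(out + used, outsz - used, "%s%s\n",
				 prefix, repo->index.paths[i]);
		if (n < 0)
			break;
		/* snprintf reports the untruncated length; clamp it. */
		if ((size_t)n < outsz - used)
			used += (size_t)n;
		else
			used = outsz - 1;
	}
	return i;
}
```

Calling toy_ls_files() once with the superproject handle and an empty prefix,
then again with the submodule handle and a prefix such as "sub/", produces one
combined listing without launching a child process.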
The most interesting part of the series is patches 17-23. Patches 1-16 could be
taken as is without the rest of the series.
This is still very much in a WIP state, though it does pass all tests. What
I'm hoping for here is to get a discussion started about the feasibility of a
change like this and hopefully to get the ball rolling. Is this a direction we
want to move in? Is it worth the pain?
Thanks for taking the time to look at this and entertain my insane ideas :)
Brandon Williams (23):
convert: convert get_cached_convert_stats_ascii to take an index
convert: convert crlf_to_git to take an index
convert: convert convert_to_git_filter_fd to take an index
convert: convert convert_to_git to take an index
convert: convert renormalize_buffer to take an index
tree: convert read_tree to take an index parameter
ls-files: convert overlay_tree_on_cache to take an index
ls-files: convert write_eolinfo to take an index
ls-files: convert show_killed_files to take an index
ls-files: convert show_other_files to take an index
ls-files: convert show_ru_info to take an index
ls-files: convert ce_excluded to take an index
ls-files: convert prune_cache to take an index
ls-files: convert show_files to take an index
ls-files: factor out debug info into a function
ls-files: factor out tag calculation
repo: introduce new repository object
repo: add index_state to struct repo
repo: add per repo config
submodule-config: refactor to allow for multiple submodule_cache's
repo: add repo_read_gitmodules
submodule: add is_submodule_active
ls-files: use repository object
 Makefile                               |   1 +
 apply.c                                |   2 +-
 builtin/blame.c                        |   2 +-
 builtin/commit.c                       |   3 +-
 builtin/ls-files.c                     | 348 ++++++++++++++++-----------------
 cache.h                                |   4 +-
 combine-diff.c                         |   2 +-
 config.c                               |   2 +-
 convert.c                              |  31 +--
 convert.h                              |  19 +-
 diff.c                                 |   6 +-
 dir.c                                  |   2 +-
 git.c                                  |   2 +-
 ll-merge.c                             |   2 +-
 merge-recursive.c                      |   4 +-
 repo.c                                 | 112 +++++++++++
 repo.h                                 |  22 +++
 sha1_file.c                            |   6 +-
 submodule-config.c                     |  40 +++-
 submodule-config.h                     |  10 +
 submodule.c                            |  51 +++++
 submodule.h                            |   2 +
 t/t3007-ls-files-recurse-submodules.sh |  39 ++++
 tree.c                                 |  28 ++-
 tree.h                                 |   3 +-
 25 files changed, 513 insertions(+), 230 deletions(-)
 create mode 100644 repo.c
 create mode 100644 repo.h