On 8/29/2018 1:14 PM, Junio C Hamano wrote:
Ben Peart <Ben.Peart@xxxxxxxxxxxxx> writes:
diff --git a/Documentation/config.txt b/Documentation/config.txt
index 1c42364988..79f8296d9c 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2391,6 +2391,12 @@ imap::
The configuration variables in the 'imap' section are described
in linkgit:git-imap-send[1].
Adding something like
You can disable multi-threaded code by setting this variable
to 'false' (or 1).
may reduce the risk of a similar "Huh?" reaction by other readers.
Will do
+struct load_cache_entries_thread_data
+{
+ pthread_t pthread;
+ struct index_state *istate;
+ struct mem_pool *ce_mem_pool;
+ int offset, nr;
+ void *mmap;
+ unsigned long start_offset;
+ struct strbuf previous_name_buf;
+ struct strbuf *previous_name;
+ unsigned long consumed; /* return # of bytes in index file processed */
+};
We saw that Duy's "let's not use strbuf to remember the previous
name but instead use the previous ce" approach gave us a nice
performance boost; I wonder if we can build on that idea here?
One possible approach might be to create one ce per "block" in the
pre-scanning thread and use that ce as the "previous one" in the
per-thread data before spawning a worker.
Yes, I believe this can be done. I was planning to wait until both
patches settled down a bit before adapting it to threads. It's a little
trickier because the previous ce doesn't yet exist but I believe one can
be fabricated enough to make the optimization work.
+static unsigned long load_cache_entries(struct index_state *istate,
+ void *mmap, size_t mmap_size, unsigned long src_offset)
+{
+ struct strbuf previous_name_buf = STRBUF_INIT, *previous_name;
+ struct load_cache_entries_thread_data *data;
+ int nr_threads, cpus, ce_per_thread;
+ unsigned long consumed;
+ int i, thread;
+
+ nr_threads = git_config_get_index_threads();
+ if (!nr_threads) {
+ cpus = online_cpus();
+ nr_threads = istate->cache_nr / THREAD_COST;
Here, nr_threads could become 0 with a small index, but any value
below 2 makes us call load_all_cache_entries() by the main thread
(and the value of nr_thread is not used anyore), it is fine. Of
course, forced test will set it to 2 so there is no problem, either.
OK.
+ /* a little sanity checking */
+ if (istate->name_hash_initialized)
+ die("the name hash isn't thread safe");
If it is a programming error to call into this codepath without
initializing the name_hash, which I think is the case, this is
better done with BUG("").
Will do
The remainder of the patch looked good. Thanks.