Re: [PATCH v3] Threaded grep

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Jan 19, 2010 at 01:12, Fredrik Kuivinen <frekui@xxxxxxxxx> wrote:
> On Mon, Jan 18, 2010 at 23:10, Junio C Hamano <gitster@xxxxxxxxx> wrote:
>>
>> I am wondering if this can somehow be made into a change with alot
>> smaller inpact, in spirit similar to "preloading".  The higher level
>> loop may walk the input (be it cache, tree, or directory), issuing one
>> grep_file() or grep_sha1() at a time just like the existing code, but
>> the thread-aware code adds "priming" phase that (1) populates a "work
>> queue" with a very small memory footprint, and that (2) starts more
>> than one underlying grep on different files and blobs (up to the
>> number of threads you are allowed, like your "#define THREADS 8"), and
>> that (3) upon completion of one "work queue" item, the work queue item
>> is marked with its result.
>>
>> Then grep_file() and grep_sha1() _may_ notice that the work queue item
>> hasn't completed, and would wait in such a case, or just emits the
>> output produced earlier (could be much earlier) by the worker bee.
>>
>> The low-memory-footprint work queue could be implemented as a lazy
>> one, and may be very different depending on how the input is created.
>> If we are grepping in the index, it could be a small array of <array
>> index in active_cache[], done-bit, result-bit, result-strbuf> with a
>> single integer that is a pointer into the index to indicate up to
>> which index entry the work queue has been populated.
>
> I will have to think about this a bit more. It's getting late here.

I will be sending v4 of the patch in a couple of minutes, but I want
to comment on this first.

Sure, it is probably possible to structure the code as you suggest.
However, I am not so sure that it will be a significantly smaller
change. I find the approach taken in the patch to be quite nice as the
threaded and non-threaded cases are fairly similar. There is a block
of code which deals with the threading, but that part is mostly
self-contained. In v4 I have not changed the overall approach to the
problem.

- Fredrik
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]