On Sun, Sep 5, 2021 at 5:59 AM ZheNing Hu <adlternative@xxxxxxxxx> wrote:
>
> Jeff King <peff@xxxxxxxx> wrote on Sat, Sep 4, 2021 at 8:50 PM:
> >
> > On Sat, Sep 04, 2021 at 03:40:41PM +0800, ZheNing Hu wrote:
> >
> > > This may be a place to promote my patches: See [1][2][3].
> > > It can provide some extra atoms for git cat-file --batch | --batch-check,
> > > like %(tree), %(author), %(tagger) etc. Although some performance
> > > optimizations have been made, It still has small performance gap.
> > >
> > > If the community still expects git cat-file --batch to reuse the logic
> > > of ref-filter,
> > > I expect it to get the attention of reviewers.
> > >
> > > The solutions I can think of to further optimize performance are:
> > > 1. Delay the evaluation of some ref-filter intermediate data.
> > > 2. Let ref-filter code reentrant and can be called in multi-threaded to take
> > > advantage of multi-core.
> >
> > I don't think trying to thread it will help much. For expensive formats,
> > where we have to actually open and parse objects, in theory we could do
> > that in parallel. But most of our time there is spent in zlib getting
> > the object data, and that all needs to be done under a big lock.
>
> This big lock is "obj_read_lock()", right?

The object reading code actually releases this lock before doing zlib
decompression (and reacquires it right after), to allow better
multi-threaded performance.

However, it is unfortunately not so simple to call object reading
routines in multi-threaded code, even with this lock. The lock mainly
protects `oid_object_info_extended()` and its wrappers. Some global
resources used by these functions are also accessed outside of them,
which could lead to race conditions in threaded code. That's why
`builtin/grep.c` and `grep.c` have some explicit calls to
`obj_read_lock()` outside `object-file.c` and `packfile.c`. (And it can
be quite tricky to identify these cases.)
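
To make the locking pattern described above a bit more concrete, here is
a minimal sketch in Python rather than git's C. It is only an analogy
under stated assumptions: `obj_read_mutex`, `object_store`,
`delta_cache`, `read_object()`, `cache_size()` and `read_all()` are all
hypothetical names invented for illustration, not git's actual code. It
shows a reader that drops the lock only around zlib decompression, and a
second function that touches the same global state from outside the
reader and therefore has to take the lock explicitly.

```python
import threading
import zlib

# Hypothetical stand-ins; none of these names come from git itself.
obj_read_mutex = threading.Lock()   # plays the role of obj_read_lock()
object_store = {                    # pretend "on-disk" compressed objects
    "a1": zlib.compress(b"blob one"),
    "b2": zlib.compress(b"blob two"),
}
delta_cache = {}                    # shared global state guarded by the lock


def read_object(oid):
    """Read one object, dropping the lock only around decompression."""
    obj_read_mutex.acquire()
    try:
        cached = delta_cache.get(oid)
        if cached is not None:
            return cached
        compressed = object_store[oid]
        # Release the lock for the CPU-heavy part so other threads can run.
        obj_read_mutex.release()
        try:
            data = zlib.decompress(compressed)
        finally:
            # Reacquire before touching shared state again.
            obj_read_mutex.acquire()
        # Another thread may have filled the cache meanwhile; keep its entry.
        return delta_cache.setdefault(oid, data)
    finally:
        obj_read_mutex.release()


def cache_size():
    """Code outside read_object() that touches the same global state
    must take the lock explicitly as well."""
    with obj_read_mutex:
        return len(delta_cache)


def read_all(oids, workers=4):
    """Read the given object ids from several threads."""
    results = {}
    results_lock = threading.Lock()

    def worker(chunk):
        for oid in chunk:
            data = read_object(oid)
            with results_lock:
                results[oid] = data

    threads = [threading.Thread(target=worker, args=(oids[i::workers],))
               for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

The tricky part mentioned above corresponds to `cache_size()`: nothing
in the reader forces such outside callers to take the lock, so each one
has to be found and protected by hand.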