On Tue, Aug 10, 2021 at 4:20 PM ZheNing Hu <adlternative@xxxxxxxxx> wrote:
>
> Christian Couder <christian.couder@xxxxxxxxx> wrote on Tue, Aug 10, 2021 at 4:04 PM:
> >
> > > parse_object_buffer(), let's take a look at the result of gprof again:
> > >
> > > We need to call grab_sub_body_contents() and grab_person() to rescan
> > > the buffer and extract the data.
> > > What if we could combine these multiple scans and parses into a
> > > single pass? At least intuitively, this has an opportunity to
> > > improve performance.
> >
> > Yeah, but is there a way to check that we indeed scan or parse the
> > same objects multiple times? This way we might get an idea about how
> > much scanning and parsing we could save.
>
> I think find_subpos(), called by grab_sub_body_contents(), and
> find_wholine(), called by grab_person(), are evidence that we are
> scanning repeatedly. But the proportion of time they occupy is too
> small: 0.0142% and 0.0109%.

Could adding traces help with measuring how many times we call
functions like get_object() or format_ref_array_item() for each
object? I think it would be interesting to have a better idea about
how many times functions that seem to take time, like the above, or
functions that read objects, like oid_object_info_extended(), are
called for each object.

> Sorry, but my attempts over the past two days have not gone well; the
> changes here would make the program very complicated, so the
> optimization is not worth doing.

Maybe, but it would be interesting to document what didn't work and why.

> > > So I checked the implementation details of `parse_commit_buffer()`
> > > and `parse_tag_buffer()`; maybe we can pass some "hook pointer" to
> > > these parsing functions, like `oid_object_info_extended()` does, to
> > > extract only the information we need?
> >
> > Would this also avoid scanning and parsing the same object many times?
>
> oid_object_info_extended()? I think it can set the pointer and extract
> the required value. Well, the problem it solves may be a little
> different from the one here.

Could you explain a bit more? I wonder if we sometimes call, for
example, oid_object_info_extended() once for an object and then
parse_commit_buffer() for the same object, when perhaps calling only
parse_commit_buffer() once would have given us all the information we
need.

> > > I am thinking about whether it is possible to design a `struct
> > > object_view` (temporarily called `struct commit_view`) to store the
> > > offsets of the parsed data in the object content.
> > > `parse_commit_buffer()` would check whether we need something for
> > > in-depth parsing. Like this:
> > >
> > > ```c
> > > struct commit_view {
> > >     int need_tree : 1;
> > >     int need_parents : 1;
> > >
> > >     int need_author : 1;
> > >     int need_author_name : 1;
> > >     int need_author_email : 1;
> > >     int need_author_date : 1;
> > >
> > >     int need_committer : 1;
> > >     int need_committer_name : 1;
> > >     int need_committer_email : 1;
> > >     int need_committer_date : 1;
> >
> > Is the above info specific for each commit? Or will the above be the
> > same for all the commits we are processing?
>
> According to my previous thoughts, I think it is the same for all
> commits.

Yeah, so I think it doesn't make sense to duplicate it for each
object. It could be a separate struct and we wouldn't need to have it
in a commit slab. And I think maybe the above could be extended with
fields like "need_full_parsing" or "need_object_info" that could be
computed from the above fields. This could then help avoid several
calls to different functions when one could be enough. Also, aren't
bitfields like the above "unsigned int" instead of "int" in our code
base?
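To make that concrete, here is an untested sketch of what I have in
mind; the compute_summary_bits() helper and the two summary fields are
made-up names, just for illustration:

```c
/*
 * Untested sketch, extending the struct quoted above: one shared
 * "view" computed once from the format, not duplicated per commit
 * (so no commit slab needed).
 */
struct commit_view {
	unsigned int need_tree : 1;
	unsigned int need_parents : 1;

	unsigned int need_author : 1;
	unsigned int need_author_name : 1;
	unsigned int need_author_email : 1;
	unsigned int need_author_date : 1;

	unsigned int need_committer : 1;
	unsigned int need_committer_name : 1;
	unsigned int need_committer_email : 1;
	unsigned int need_committer_date : 1;

	/* computed once from the fields above, before iterating */
	unsigned int need_full_parsing : 1;
	unsigned int need_object_info : 1;
};

static void compute_summary_bits(struct commit_view *view)
{
	view->need_full_parsing =
		view->need_tree || view->need_parents ||
		view->need_author || view->need_author_name ||
		view->need_author_email || view->need_author_date ||
		view->need_committer || view->need_committer_name ||
		view->need_committer_email || view->need_committer_date;
	/*
	 * need_object_info would similarly be set from atoms that
	 * oid_object_info_extended() can answer (size, type, ...).
	 */
}
```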
> > Ok, so the idea is to use the commit slab feature to store the struct
> > commit_view instances. That seems reasonable to me.
>
> It's a pity that it may not bring much optimization in real situations.

Maybe that's because copying data isn't actually a performance issue.

> > > It seems that GSoC has only the last few weeks left; I'm not sure
> > > how far this patch series is from being merged into the master
> > > branch. Performance optimization may have no end.
> >
> > Yeah, but the idea for now is just to make using the ref-filter code
> > as fast as the current code.
>
> It seems difficult to achieve consistent performance. As we discussed
> before, the previous `git cat-file --batch` could only provide a few
> kinds of metadata that do not need to be parsed, while using the
> ref-filter logic allows cat-file to use more structured information
> about git objects.

So the issue seems to be that we can't detect when only a few pieces
of metadata that don't need to be parsed are requested? Could fields
like "need_full_parsing" or "need_object_info" that I suggest above
help with that?

> But this means it takes a harder path: it requires many traversals and
> many copies.

Could you explain in more detail why it needs many traversals and
copies? And what could help avoid that?

> If we really use the logic in ref-filter, we can only do partial
> optimization; we can't expect it to be as fast as the old function.
> Unless we have the opportunity to not use the ref-filter logic itself,
> but only the new atoms from ref-filter, which may have a chance to
> avoid the copying in ref-filter. I don't know what your opinion is...

Are you sure we couldn't detect in the ref-filter code that we need
only a few pieces of metadata that don't need to be parsed? And then
skip fetching and parsing the data we don't need (without using
completely separate code, like a fast path)?
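For example, something like the following (untested) could decide it
once per format string. The set of atoms treated as "cheap" here is
only an illustration and would need to be checked against valid_atom[]
in ref-filter.c:

```c
/*
 * Untested sketch: after the format has been parsed into used_atom[],
 * decide once whether any atom actually requires the parsed object
 * buffer, or whether oid_object_info_extended() alone is enough.
 */
static int format_needs_parsing(void)
{
	int i;

	for (i = 0; i < used_atom_cnt; i++) {
		switch (used_atom[i].atom_type) {
		case ATOM_OBJECTNAME:
		case ATOM_OBJECTTYPE:
		case ATOM_OBJECTSIZE:
		case ATOM_DELTABASE:
			/* oid_object_info_extended() can answer these */
			continue;
		default:
			return 1; /* this atom needs the parsed buffer */
		}
	}
	return 0;
}
```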
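By the way, regarding my tracing question above, a minimal (untested)
way to count calls could be a trace2 counter; the helper names and
where they would be called from are made up for illustration:

```c
/*
 * Untested sketch: count get_object() calls and report the total
 * through trace2, so it shows up with GIT_TRACE2_PERF=1.
 */
static intmax_t get_object_calls;

static void count_get_object_call(void)
{
	get_object_calls++; /* would be called at the top of get_object() */
}

static void report_get_object_calls(struct repository *r)
{
	trace2_data_intmax("ref-filter", r,
			   "get_object/calls", get_object_calls);
}
```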