On Sun, Jul 04 2021, ZheNing Hu wrote: > Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> 于2021年7月3日周六 下午10:18写道: > >> Most of the problem, although this may not entirely fix the performance >> regression, is that you're either looking up everything twice now, or >> taking a much more expensive path. >> > > Yeah, In get_object(), oid_object_info_extended() is called twice when we > want print object's contents. The original intention is to reduce the call of > oid_object_info_extended() once when using --textconv. Should we make > --textconv and --filters have the same logic? In this way, without using > --textconv and --filters, we can call oid_object_info_extended() only once. > >> I think using gprof is probably much more handy here. See [1. I did a >> `git rev-list --all >rla` and ran that piped into 'git cat-file --batch' >> with/without your pathces. Results: >> >> $ gprof ./git-master ./gmon-master.out | head -n 10 >> Flat profile: >> >> Each sample counts as 0.01 seconds. >> % cumulative self self total >> time seconds seconds calls ms/call ms/call name >> 14.29 0.02 0.02 475186 0.00 0.00 nth_packed_object_offset >> 14.29 0.04 0.02 237835 0.00 0.00 hash_to_hex_algop_r >> 7.14 0.05 0.01 5220425 0.00 0.00 hashcmp_algop >> 7.14 0.06 0.01 4757120 0.00 0.00 hex2chr >> 7.14 0.07 0.01 1732023 0.00 0.00 find_entry_ptr >> >> And: >> >> $ gprof ./git-new ./gmon-new.out |head -n 10 >> Flat profile: >> >> Each sample counts as 0.01 seconds. >> % cumulative self self total >> time seconds seconds calls ms/call ms/call name >> 7.32 0.06 0.06 764570 0.00 0.00 lookup_object >> 7.32 0.12 0.06 237835 0.00 0.00 parse_commit_date >> 4.88 0.16 0.04 712779 0.00 0.00 nth_packed_object_offset >> 3.66 0.19 0.03 964574 0.00 0.00 bsearch_hash >> 3.66 0.22 0.03 237835 0.00 0.00 grab_sub_body_contents >> > > It seems that lookup_object() took a lot of time with the patch of my version . > >> If you e.g. make lookup_object() simply die when it's called you'll see >> that before we don't call it at all, after your patch it's our #1 >> function. >> >> Before when we have the simplest case of writing out an object this is >> our callstack: >> >> (gdb) bt >> #0 batch_write (opt=0x7fffffffde50, data=0x555555ab9470, len=52) at builtin/cat-file.c:298 >> #1 0x000055555558b160 in batch_object_write (obj_name=0x55555597cef0 "504fe6b39f7747be6427f28d9ca97decf5e6cecf", scratch=0x7fffffffd8c0, >> opt=0x7fffffffde50, data=0x7fffffffd7f0) at builtin/cat-file.c:375 >> #2 0x000055555558b36e in batch_one_object (obj_name=0x55555597cef0 "504fe6b39f7747be6427f28d9ca97decf5e6cecf", scratch=0x7fffffffd8c0, opt=0x7fffffffde50, >> data=0x7fffffffd7f0) at builtin/cat-file.c:431 >> #3 0x000055555558b8ed in batch_objects (opt=0x7fffffffde50) at builtin/cat-file.c:588 >> #4 0x000055555558c0d3 in cmd_cat_file (argc=0, argv=0x7fffffffe1e0, prefix=0x0) at builtin/cat-file.c:716 >> #5 0x0000555555573adb in run_builtin (p=0x555555941870 <commands+240>, argc=2, argv=0x7fffffffe1e0) at git.c:461 >> #6 0x0000555555573f00 in handle_builtin (argc=2, argv=0x7fffffffe1e0) at git.c:714 >> #7 0x0000555555574182 in run_argv (argcp=0x7fffffffe08c, argv=0x7fffffffe080) at git.c:781 >> #8 0x000055555557460f in cmd_main (argc=2, argv=0x7fffffffe1e0) at git.c:912 >> #9 0x000055555565b508 in main (argc=3, argv=0x7fffffffe1d8) at common-main.c:52 >> >> After (well, here we're not even to writing it, just looking it up), the >> BUG() is my addition: >> >> (gdb) bt >> #0 BUG_fl (file=0x5555558ade71 "object.c", line=91, fmt=0x5555558ade6e "yo") at usage.c:290 >> #1 0x00005555557441ca in lookup_object (r=0x5555559755c0 <the_repo>, oid=0x555555975160 <oi>) at object.c:91 >> #2 0x000055555569dfc8 in lookup_commit (r=0x5555559755c0 <the_repo>, oid=0x555555975160 <oi>) at commit.c:62 >> #3 0x00005555557445f5 in parse_object_buffer (r=0x5555559755c0 <the_repo>, oid=0x555555975160 <oi>, type=OBJ_COMMIT, size=342, buffer=0x555555ab48e0, >> eaten_p=0x7fffffffd36c) at object.c:215 >> #4 0x0000555555785094 in get_object (ref=0x7fffffffd6b0, deref=0, obj=0x7fffffffd520, oi=0x555555975160 <oi>, err=0x7fffffffd860) at ref-filter.c:1803 >> #5 0x0000555555785c99 in populate_value (ref=0x7fffffffd6b0, err=0x7fffffffd860) at ref-filter.c:2030 >> #6 0x0000555555785d7b in get_ref_atom_value (ref=0x7fffffffd6b0, atom=0, v=0x7fffffffd628, err=0x7fffffffd860) at ref-filter.c:2064 >> #7 0x000055555578742f in format_ref_array_item (info=0x7fffffffd6b0, format=0x7fffffffde30, final_buf=0x7fffffffd880, error_buf=0x7fffffffd860) >> at ref-filter.c:2659 >> #8 0x000055555558ab1c in batch_object_write (obj_name=0x55555597e3f0 "504fe6b39f7747be6427f28d9ca97decf5e6cecf", scratch=0x7fffffffd880, >> err=0x7fffffffd860, opt=0x7fffffffde10, data=0x7fffffffd800) at builtin/cat-file.c:225 >> #9 0x000055555558ade5 in batch_one_object (obj_name=0x55555597e3f0 "504fe6b39f7747be6427f28d9ca97decf5e6cecf", scratch=0x7fffffffd880, err=0x7fffffffd860, >> opt=0x7fffffffde10, data=0x7fffffffd800) at builtin/cat-file.c:298 >> #10 0x000055555558b394 in batch_objects (batch=0x7fffffffde10, options=0x7fffffffd900) at builtin/cat-file.c:458 >> #11 0x000055555558bbd5 in cmd_cat_file (argc=0, argv=0x7fffffffe1d0, prefix=0x0) at builtin/cat-file.c:585 >> #12 0x0000555555573adb in run_builtin (p=0x555555942850 <commands+240>, argc=2, argv=0x7fffffffe1d0) at git.c:461 >> #13 0x0000555555573f00 in handle_builtin (argc=2, argv=0x7fffffffe1d0) at git.c:714 >> #14 0x0000555555574182 in run_argv (argcp=0x7fffffffe07c, argv=0x7fffffffe070) at git.c:781 >> #15 0x000055555557460f in cmd_main (argc=2, argv=0x7fffffffe1d0) at git.c:912 >> #16 0x000055555565afc1 in main (argc=3, argv=0x7fffffffe1c8) at common-main.c:52 >> > > It seems that the call stack is very deep after using my version. > >> I.e. before in batch_object_write() we could use a cheap path of doing >> oid_object_info_extended() and directly emitting the content. With your >> version we're all the way down to parse_object_buffer(). Meaning that >> we're going to be creating a "struct commit" or whatever if we're >> looking at a commit, just to print out the raw contents. >> > > I agree that the logic in ref-filter is too complicated in order to be > able to print > the structured object data. Well, maybe it isn't. I think the main problem with your performance is just looking things up again from the object store needlessly, and possibly redundantly copying things around. E.g. if you have a buffer with a commit object and you need to xstrdup() that, and xstrdup() the answer etc, it adds up. Maybe if all that's gone the formatting will still slow things down, but I bet it's going to be to a much smaller extent. >> I think the best next step here is to add a t/perf/t1006-cat-file.sh >> test to stress these various cases, i.e. a plain --batch without a >> format, with format, with --batch-all-objects etc. Try to then run that >> on each of your commits against the preceding one and see if/when you >> have regressions. >> > > Make sence. > >> Aside from any double-lookups etc, the problem is also that you're >> trying to handle a really general case (e.g. with textconv) in a > > Well, "--textconv" is a common situation? Here I may not be sure which > scenarios where the upper application calls "git cat-file --batch" are the > most common. I think --textconv is much rarer. I think it's used for API purposes e.g. for people who are doing \n translation, or maybe for the likes of git-lfs (or was that --filters)? I'm not very familiar with it. For what it's worth 'git cat-file --batch' is very performance sensitive in backend use-cases like e.g. gitlab.com's gitaly (which executes all git commands for you on that site). Anything that looks up .. any object pretty much .. is prining that object line to a persistent "cat-file --batch" process. There was a case where some very small (unrelated to cat-file per-se) slowdown in that codepath ended up slowing down overall API requests for some calls by >100%, because it's executed in a tight loop. E.g. if you're showing N commits per page that might be N cat-file --batch calls.