Andreas Ericsson <ae@xxxxxx> wrote:
> Shawn O. Pearce wrote:
>>
>> Eh, I disagree here.  In git.git today "struct commit" exposes its
>> buffer with the canonical commit encoding.  Having that visible
>> wrecks what Nico and I were thinking about doing with pack v4 and
>> encoding commits in a non-canonical format when stored in packs.
>> Ditto with trees.
>
> Err... isn't that backwards?

No.

> Surely you want to store stuff in the
> canonical format so you're forced to do as few translations as
> possible?

No.  We suspect that canonical format is harder to decompress and
parse during revision traversal.  Other encodings in the pack file
may produce much faster runtime performance, and reduce page faults
(due to smaller pack sizes).

We hardly ever use the canonical format for actual output; most
output rips the canonical format apart and then formats the data
the way it was requested.  If we have the data *already* parsed in
the pack it's much faster to output.

> Or are you trying to speed up packing by skipping the
> canonicalization part?

Wrong; we're trying to speed up reading.  Packing may go slower,
especially during the first conversion of v2->v4 for any given
repository, but packing is infrequent so the minor (if any) drop
in performance here is probably worth the reading performance gains.

> Well, if macro usage is adhered to one wouldn't have to worry,
> since the macro can just be rewritten with a function later (if,
> for example, translation or some such happens to be required).
> Older code linking to a newer library would work (assuming the
> size of the commit object doesn't change anyway),

You are assuming too much magic.  If the older ABI used a macro and
the newer one (which supports pack v4) organized struct commit
differently and the user upgrades libgit2.so, the older applications
just broke, horribly.

We know we want to do pack v4 in the near future.  Or at least
experiment with it and see if it works.
If it does, we don't want to have to
cause a major ABI breakage across all those newly installed
libgit2s... yikes.

I'm really in favor of accessor functions for the first version of
the library.  They can always be converted to macros once someone
shows that their git visualizer program saves 10 ms on an 8,000 ms
render operation by avoiding accessor functions.  I'd rather spend
our brain cycles optimizing the runtime and the in-core data so we
spend less time in our tight revision traversal loops.

Seriously.  We make at least 10 or 11 function calls *per commit*
that comes out of get_revision().  If the formatting application
is really suffering from its 4 or 5 accessor function calls in
order to get that returned data, we probably should also be looking
at how we can avoid function calls in the library.

Oh, and even with 4 or 5 accessor functions per commit in the
application that is *still* better than the 10 or so calls the
application probably makes today scraping "git log --format=raw"
off a pipe and segmenting it into the different fields it needs.

Unless pipes in Linux somehow allow negative time warping with CPU
counters.  Though on dual-core systems they might, since the two
processes can run on different cores.  But oh, you didn't want to
worry about threading support too much in libgit2, so I guess you
also don't want to use multi-core systems.

--
Shawn.
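
To make the accessor-function argument above concrete, here is a
minimal sketch in C.  Every name in it (example_commit,
example_commit_new, example_commit_author, example_commit_time) is
hypothetical and invented for illustration; none of this is the real
libgit2 API.  The idea it tries to show: the public header only
declares an opaque type plus functions, so the library could later
reorder or re-encode the struct (say, for a pack v4 style in-core
layout) without already-compiled applications noticing.

```c
#include <stdlib.h>
#include <string.h>

/* ---- What a public header might expose (hypothetical names) ---- */

/* Opaque type: applications never see the field layout. */
typedef struct example_commit example_commit;

example_commit *example_commit_new(const char *author, long commit_time);
void example_commit_free(example_commit *c);
const char *example_commit_author(const example_commit *c);
long example_commit_time(const example_commit *c);

/* ---- Private to the library; free to change between releases ---- */

struct example_commit {
    long commit_time;   /* could be reordered or re-encoded later */
    char *author;
};

/* Portable string copy (strdup is not in ISO C before C23). */
static char *copy_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *p = malloc(n);
    if (p != NULL)
        memcpy(p, s, n);
    return p;
}

example_commit *example_commit_new(const char *author, long commit_time)
{
    example_commit *c = malloc(sizeof(*c));
    if (c == NULL)
        return NULL;
    c->author = copy_string(author);
    if (c->author == NULL) {
        free(c);
        return NULL;
    }
    c->commit_time = commit_time;
    return c;
}

void example_commit_free(example_commit *c)
{
    if (c != NULL) {
        free(c->author);
        free(c);
    }
}

/* Accessors: the only way an application reads commit data. */
const char *example_commit_author(const example_commit *c)
{
    return c->author;
}

long example_commit_time(const example_commit *c)
{
    return c->commit_time;
}
```

Had the header instead shipped a macro such as
"#define EXAMPLE_AUTHOR(c) ((c)->author)", the field offset would be
compiled into every application, and changing the struct in a newer
shared library would silently break those binaries.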