On Thu, Jan 10, 2019 at 01:40:31AM -0500, Jeff King wrote: > On Thu, Jan 10, 2019 at 04:25:46AM +0000, brian m. carlson wrote: > > > There are a small number of places in our codebase where we cast a > > buffer of unsigned char to a struct object_id pointer. When we have > > GIT_MAX_RAWSZ set to 32 (because we have SHA-256), one of these places > > (the buffer for tree objects) can lead to us copying too much data when > > using SHA-1 as the hash, since there are only 20 bytes to read. > > > > This was not expected to be a problem before future code was introduced, > > but due to a combination of series the issue became noticeable. > > > > This series introduces a refactor to avoid referencing the struct > > object_id directly from a buffer and instead storing an additional > > struct object_id (and an int) in struct name_entry and referring to > > that. > > I think this is really the only safe and sane solution. We resisted it > because of the cost of the extra copies (especially the > update_tree_entry() one). But I don't know that anybody actually > measured it. Do you have any performance numbers before/after this > series? Unfortunately, I don't. I'm not really sure in what situations we hit this code path a lot, so I'm not sure what exactly we should performance test. If you have suggestions, I can set up some perf tests. I will say that I resisted writing this series for a long time, since I knew it was going to be a difficult slog to get everything working (and it was). If there had been a way to avoid it, I would have done it. However, writing this series led to about ten tests in my SHA-256 work suddenly passing where they hadn't before. -- brian m. carlson: Houston, Texas, US OpenPGP: https://keybase.io/bk2204
Attachment:
signature.asc
Description: PGP signature