On 9/6/2022 7:01 PM, Jeff King wrote: > I'm sorry, I know the argument here is really hand-wavy. But I really > think this isn't making anything much less safe. I agree with you that this is the safest way to move forward here. > I was actually tempted to rip out the blob hash-check entirely by > default! Anybody who really cares about checking the bits can do so > with read_object_file(). That's what fsck does, and we could pretty > easily convert "rev-list --verify-objects" to do so, too. So this is the > less extreme version of the patch. ;) A quick search shows many uses of parse_object() across the codebase. It would certainly be nice if they all suddenly got faster by avoiding this hashing, but I also suppose that most of the calls are using parse_object() only because they are unsure if they are parsing a commit or a tag and would never parse a large blob. I think this approach of making parse_object_with_flags() is the best way to incrementally approach things here. If we decide that we need the _with_flags() version specifically to avoid this hash check, then we could probably take the second approach: remove the hash check from parse_object() and swap the places that care to use read_object_file() instead. My guess is that in the long term there will be fewer swaps to read_object_file() than to parse_object_with_flags(). However, this is a good first step to make progress without doing the time-consuming audit of every caller to parse_object(). Thanks, -Stolee