On Wed, Nov 27, 2013 at 10:57:14AM -0800, Junio C Hamano wrote: > > Yes, I think it is a reasonable addition to the streaming API. However, > > I do not think there are any callsites which would currently want it. > > All of the current users of stream_blob_to_fd use read_sha1_file as > > their alternative, and not parse_object. So we are not verifying the > > sha1 in either case (we may want to change that, of course, but that is > > a bigger decision than just trying to bring streaming and non-streaming > > code-paths into parity). > > True. I am not offhand sure if we want to make read_sha1_file() to > rehash, but I agree that it is a question different from what we are > asking in this discussion. I'm torn on that. Having git verify everything all the time is kind of cool. But it _does_ have a performance impact, and the vast majority of the time nothing got corrupted since the last time we looked at the object. It seems like periodically running "git fsck" is a smarter way of doing periodic checks. We already are careful when objects are coming into the repository, and I think that is a sensible boundary (and I am increasingly of the opinion that running with transfer.fsckobjects off is not a good idea). The checks in parse_object seem hack-ish to me, because they catch some random subset of the times we access objects (e.g., calling parse_object on a commit sha1 will check, but calling parse_commit on an unparsed commit struct will not). If anything, I'd suggest moving the checking down to read_sha1_file, which would add it fairly consistently everywhere, and then tying it to a config option (off for high performance, on for slower-but-meticulous). -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html