On Tue, Feb 28, 2017 at 11:53:28AM -0800, Junio C Hamano wrote: > Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> writes: > > > That > > actually can clean up some code, because right now we have duplicate > > interfaces for some things that take an oid pointer and others take a > > "const unsigned char *sha1", and that duplication essentially can go > > away. > > ... I understand that this is an eventual goal of "struct object_id" > effort. Yes, it is. > > ..., and the lines are generally shorter, eg > > > > - const unsigned char *mb = result->item->object.oid.hash; > > - if (!hashcmp(mb, current_bad_oid->hash)) { > > > > turns into > > > > + const hash_t *mb = &result->item->object.oid; > > + if (!hashcmp(mb, current_bad_oid)) { > > Hmph. I somehow thought the longer term directio for the above code > would be to turn it into > > if (!oidcmp(&result->item->object.oid, ¤t_bad_oid)) > > instead. It is. The duplication is temporary. We've actually removed some of the original SHA-1 functions in favor of the object_id versions. > Having said all that, I do not offhand see a huge benefit of the > current layout that has one layer between the hash (i.e. oid.hash) > and the object name (i.e. oid) over "there is no need for oid.hash; > oid is just a hash", which you seem to be doing. > > Except for cases where we need to be prepared to see two or more > kinds of object names (in which case struct object_id would have an > enum that tells you if oid.hash is SHA-1 or SHA-3-256), that is. Of > course we can encode that in hash_t itself (e.g. multihash). Right, this is indeed a benefit. The bigger issue is the assumptions in the code base that assume a given hash size. These assumptions are in a bunch of places, so pretty much any time you see a number, you end up having to question what it means. We have a 64 that means 40 (SHA-1 hex) plus 24 other values, for example. I've sent various series because I think that's the way that the Git development community has most welcomed my contributions. I have another 69 patches in two series (which are as of yet untested). I can probably have almost all of the transition actually complete in another couple weeks (because I have a day job which does not involve working on Git). I've hesitated to send additional patches to the list when my existing changes haven't even hit next yet, since they're received little actual testing. I'm unsure if sending one enormous patch is actually going to be welcome; I feel Junio and most of the regulars are probably not going to be very excited about the idea, mostly because the reviewing task is enormous, and the risk for breakage is also pretty high. We want to move away from SHA-1 as quickly as possible, yes, but we don't want Git to become unusably buggy. I guess what I'm saying is that I'm fine if we want to have a more aggressive timeline, and I can probably make that happen, at least if we can wait a few weeks. If Junio and the other regulars are comfortable merging one omnibus patch to get it moving even more quickly, I won't stand in the way. -- brian m. carlson / brian with sandals: Houston, Texas, US +1 832 623 2791 | https://www.crustytoothpaste.net/~bmc | My opinion only OpenPGP: https://keybase.io/bk2204
Attachment:
signature.asc
Description: PGP signature