On Tue, Jun 12, 2018 at 03:29:47AM -0400, Eric Sunshine wrote: > On Mon, Jun 11, 2018 at 9:05 PM, brian m. carlson > <sandals@xxxxxxxxxxxxxxxxxxxx> wrote: > > test_oid would be fine. One note is that this doesn't always produce > > OIDs; sometimes it will produce other values, but as long as you don't > > think that's too confusing, I'm fine with it. > > It was surprising to see it used for non-OID's (such as hash > characteristics), but not hard to deal with. > > One could also view this as a generic key/value cache (not specific to > OID's) with overriding super-key (the hash algorithm, in this case), > which would allow for more generic name than test_oid(), but we don't > have to go there presently. It is essentially that. I'm happy with the test_oid name provided others are. My only concern is that it would be confusing. I opted to use the same mechanism for hash characteristics because it seemed better than creating a lot of one-off functions that might have duplicative implementations. But I'm open to arguments that having test_oid_rawsz, test_oid_hexsz, etc. is better. > > I agree perl would be expensive if it were invoked frequently, but > > excepting SHA1-prerequisite tests, this function is invoked 32 times in > > the entire testsuite. > > > > One of the reasons I chose perl was because we have a variety of cases > > where we'll need spaces in values, and those tend to be complex in > > shell. > > Can you give examples of cases in which values will contain spaces? It > wasn't obvious from this patch series that such a need would arise. > > Are these values totally free-form? If not, some character (such as > "_", "-", ".", etc.) could act as a stand-in for space. That shouldn't > be too hard to handle. t6030, which tests the bisect porcelain, is sensitive to the hash algorithm because hash values are used as a secondary sort for the closest commit. Without totally gutting the test and redoing it, some solution to produce something resembling a sane commit message would be helpful. We will also have cases where we need to provide strings to printf(1), such as in some of the pack tests. I have a minor modification to your code which handles that at the cost of restricting us to one hash-key-value tuple on a line, which I think is an acceptable tradeoff. > > Using shell variables like this does have the downside that we're > > restricted to only characters allowed in shell variables. That was > > something I was trying to avoid, but it certainly isn't fatal. > > Is that just a general concern or do you have specific "weird" keys in mind? I had originally thought of providing only the SHA-1 value instead of a named key, which would have necessitated allowing arbitrary inputs, but I ultimately decided that named keys were better. I also tend to prefer dashes in inputs over underscores, since it's a bit easier to type, but that's really a secondary concern. I think the benefits of an implementation closer to your outweigh the downsides. > My original version of test_oid_cache() actually allowed for that by > caching _all_ information from the tables rather than only values > corresponding to $test_hash_algo. It looked like this: > > --- >8 --- > test_oid_cache () { > while read tag rest > do > case $tag in \#*) continue ;; esac > > for x in $rest > do > eval "test_oid_${tag}_${x%:*}=${x#*:}" > done > done > } > --- >8 --- > > The hash algorithm is incorporated into the cache variable name like > this: "test_oid_hexsz_sha256" Yeah, I basically reimplemented something similar to that based off your code. -- brian m. carlson: Houston, Texas, US OpenPGP: https://keybase.io/bk2204
Attachment:
signature.asc
Description: PGP signature