On Wed, Nov 07 2018, brian m. carlson wrote: > On Mon, Nov 05, 2018 at 12:39:14PM +0100, Ævar Arnfjörð Bjarmason wrote: >> On Sun, Nov 04 2018, brian m. carlson wrote: >> > + { >> > + "sha256", >> > + /* "s256", big-endian */ >> >> The existing entry/comment for sha1 is: >> >> "sha1", >> /* "sha1", big-endian */ >> >> So why the sha256/s256 difference in the code/comment? Wondering if I'm >> missing something and we're using "s256" for something. > > Ah, good question. The comment refers to the format_id field which > follows this comment. The value is the big-endian representation of > "s256" as a 32-bit value. I picked that over "sha2" to avoid confusion > in case we add SHA-512 in the future, since that's also an SHA-2 > algorithm. > > Config files and command-line interfaces will use "sha1" or "sha256", > and binary formats will use those 32-bit values ("sha1" or "s256"). Okey. >> > const char *empty_tree_oid_hex(void) >> > diff --git a/sha256/block/sha256.c b/sha256/block/sha256.c >> > [...] >> >> I had a question before about whether we see ourselves perma-forking >> this implementation based off libtomcrypt, as I recall you said yes. > > Yes. > >> Still, I think it would be better to introduce this in at least two-four >> commits where the upstream code is added as-is, then trimmed down to >> size, then adapted to our coding style, and finally we add our own >> utility functions. > > At this point, the only code that's actually used from libtomcrypt is > the block transform. The upstream code is split over multiple files in > multiple directories and won't compile in our codebase without many > files and a lot of work, so I don't feel good about either including > code that doesn't compile or including large numbers of files that don't > meet our coding standards (and that may still not compile because of > platform issues). > >> It'll make it easier to forward-port any future upstream changes. > > I don't foresee many, if any, changes to this code. It either > implements the specification or it doesn't, and it's objectively easy to > determine which. There's not even an argument to port performance > improvements, since almost everyone will be using a crypto library to > provide this code because libraries perform so dramatically better. > I've tried to not make the code perform worse than it did originally, > but that's it. > > Furthermore, the modified code carries a relatively small amount of > resemblance to the original, so if we did port changes forward, we'd > probably have conflicts. > > It seems like you really want to include the upstream code as a separate > commit and I understand where you're coming from with wanting to have > this split out into logical commits, but due to the specific nature of > this code, I see a lot of downsides and not a lot of upsides. Yeah sorry to keep bringing this up. Your way makes sense, and I'd forgotten the details since last time . I'll shut up about it:) >> > + perl -E "for (1..100000) { print q{aaaaaaaaaa}; }" | \ >> > + test-tool sha256 >actual && >> > + grep cdc76e5c9914fb9281a1c7e284d73e67f1809a48a497200e046d39ccc7112cd0 actual && >> > + perl -E "for (1..100000) { print q{abcdefghijklmnopqrstuvwxyz}; }" | \ >> > + test-tool sha256 >actual && >> >> I've been wanting to make use depend on perl >= 5.10 (previous noises >> about that on-list), but for now we claim to support >=5.8, which >> doesn't have the -E switch. > > Good point. I'll fix that. After having written a lot of one-liners, > I always write -E, and this was originally a one-liner. > >> But most importantly you aren't even using -E features here, and this >> isn't very idoimatic Perl. Instead do, respectively: >> >> perl -e 'print q{aaaaaaaaaa} x 100000' >> perl -e "print q{abcdefghijklmnopqrstuvwxyz} x 100000" > > I considered the more idiomatic version originally, but the latter could > allocate a decent amount of memory in one chunk, and I wanted to avoid > that. ~2.5MB for the latter, so trivial. > I think what I'd like to do, actually, is turn on autoflush and > use a postfix for, which would be more idiomatic and could potentially > provide better testing of the chunking code. I'll add a comment to that > effect. Okey. Maybe better to use syswrite() instead, or maybe print with autoflush is more idiomatic, I don't know.