On 09/11/2013 07:20 PM, James Yonan wrote:
On 10/09/2013 12:57, Daniel Borkmann wrote:
There was a similar patch posted some time ago [1] on lkml, where
Florian (CC) made a good point in [2] that future compiler optimizations
could short circuit on this. This issue should probably be addressed in
such a patch here as well.
[1] https://lkml.org/lkml/2013/2/10/131
[2] https://lkml.org/lkml/2013/2/11/381
On 11/09/2013 06:19, Marcelo Cerri wrote:
The discussion that Daniel pointed out has another interesting point
regarding the function name. I don't think it's a good idea to name it
crypto_memcpy since it doesn't have behavior the same way as strcmp.
Florian suggested in the thread names such crypto_mem_equal, which I
think fits better here.
Ok, here's another stab at this:
* Changed the name to crypto_mem_not_equal. The "not_equal" seems to
make more sense because the function returns a nonzero "true" value if
the memory regions are not equal.
Ok, sounds good.
* Good point that a smart optimizer might add instructions to
short-circuit the loop if all bits in ret have been set. One way to
deal with this is to disable optimizations that might increase code
size, since a short-circuit optimization in this case would require
adding instructions.
#pragma GCC optimize ("Os")
The nice thing about using #pragma is that older versions of gcc that
don't recognize it will simply ignore it, and we can probably presume
that older versions of gcc do not support a short-circuit optimization
if the latest one does not. I did a quick test using gcc 3.4.6 at -O2,
and did not see any evidence of a short-circuit optimization.
* Improved performance when CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS is
enabled. This makes the performance roughly on-par with memcmp.
Hm, why don't we take fixed-size versions of Daniel J Bernstein from NaCl
library [1], e.g. for comparing hashes?
E.g. for 16 bytes:
int crypto_verify(const unsigned char *x,const unsigned char *y)
{
unsigned int differentbits = 0;
#define F(i) differentbits |= x[i] ^ y[i];
F(0)
F(1)
F(2)
F(3)
F(4)
F(5)
F(6)
F(7)
F(8)
F(9)
F(10)
F(11)
F(12)
F(13)
F(14)
F(15)
return (1 & ((differentbits - 1) >> 8)) - 1;
}
It will return 0 if x[0], x[1], ..., x[15] are the same as y[0], y[1], ..., y[15],
otherwise it returns -1. That's w/o for loops, so probably more "compiler-proof" ...
[1] http://nacl.cr.yp.to/
----------------
#pragma GCC optimize ("Os")
noinline unsigned long crypto_mem_not_equal(const void *a, const void *b, size_t size)
{
unsigned long ret = 0;
#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
#if BITS_PER_LONG == 64
while (size >= 8) {
ret |= *(unsigned long *)a ^ *(unsigned long *)b;
a += 8;
b += 8;
size -= 8;
}
if (!size)
return ret;
#endif /* BITS_PER_LONG == 64 */
if (sizeof(unsigned int) == 4) {
while (size >= 4) {
ret |= *(unsigned int *)a ^ *(unsigned int *)b;
a += 4;
b += 4;
size -= 4;
}
if (!size)
return ret;
}
#endif /* CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS */
while (size > 0) {
ret |= *(unsigned char *)a ^ *(unsigned char *)b;
a += 1;
b += 1;
size -= 1;
}
return ret;
}
James
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html