[PATCH v2] hex2bin: make the function hex_to_bin constant-time

Mikulas Patocka <mpatocka@xxxxxxxxxx> · Mon, 25 Apr 2022 08:07:48 -0400 (EDT)

On Sun, 24 Apr 2022, Linus Torvalds wrote:

> On Sun, Apr 24, 2022 at 1:54 PM Mikulas Patocka <mpatocka@xxxxxxxxxx> wrote:
> >
> > + *
> > + * Explanation of the logic:
> > + * (ch - '9' - 1) is negative if ch <= '9'
> > + * ('0' - 1 - ch) is negative if ch >= '0'
> 
> True, but...
> 
> Please, just to make me happier, make the sign of 'ch' be something
> very explicit. Right now that code uses 'char ch', which could be
> signed or unsigned.

OK, I fixed it, here I'm sending the second version.

> Finally, for the same reason - please don't use ">> 8".  Because I do
> not believe that bit 8 is well-defined in your arithmetic. The *sign*

We are subtracting values that are in the 0 ... 255 range. So, the result 
will be in the -255 ... 255 range. So, the bits 8 to 31 of the result are 
equivalent.

> bit will be, but I'm not convinced bit 8 is.
> 
> So use ">> 31" or similar.

We can pick any number between 8 and 28. 31 won't work because the C 
standard doesn't specify that the right shift keeps the sign bit.

To make it standard-compliant, I added a cast that casts the int value to 
unsigned. We have an unsigned value with bits 8 to 31 equivalent. When we 
shift it 8 bits to the right, we have either 0 or 0xffffff - and this 
value is suitable for masking.

> Also, I do worry that this is *exactly* the kind of trick that a
> compiler could easily turn back into a conditional. Usually compilers
> tend to go the other way (ie turn conditionals into arithmetic if
> possible), but..

Some old version that I tried used "(ch - '0' + 1) * ((unsigned)(ch - '0') 
<= 9)" - it worked with gcc, but clang was too smart and turned it into a 
cmov when compiling for i686 and to a conditional branch when compiling 
for i586.

Another version used "-(c - '0' + 1) * (((unsigned)(c - '0') >= 10) - 1)" 
- it almost worked, except that clang still turned it into a conditional 
jump on sparc32 and powerpc32.

So, I came up with this version that avoids comparison operators at all 
and tested it with gcc and clang on all architectures that I could try.

Mikulas

>                     Linus
> 


From: Mikulas Patocka <mpatocka@xxxxxxxxxx>

The function hex2bin is used to load cryptographic keys into device mapper
targets dm-crypt and dm-integrity. It should take constant time
independent on the processed data, so that concurrently running
unprivileged code can't infer any information about the keys via
microarchitectural convert channels.

This patch changes the function hex_to_bin so that it contains no branches
and no memory accesses.

Note that this shouldn't cause performance degradation because the size of
the new function is the same as the size of the old function (on x86-64) -
and the new function causes no branch misprediction penalties.

I compile-tested this function with gcc on aarch64 alpha arm hppa hppa64
i386 ia64 m68k mips32 mips64 powerpc powerpc64 riscv sh4 s390x sparc32
sparc64 x86_64 and with clang on aarch64 arm hexagon i386 mips32 mips64
powerpc powerpc64 s390x sparc32 sparc64 x86_64 to verify that there are no
branches in the generated code.

Signed-off-by: Mikulas Patocka <mpatocka@xxxxxxxxxx>
Cc: stable@xxxxxxxxxxxxxxx

---
 include/linux/kernel.h |    2 +-
 lib/hexdump.c          |   32 +++++++++++++++++++++++++-------
 2 files changed, 26 insertions(+), 8 deletions(-)

Index: linux-2.6/lib/hexdump.c
===================================================================

--- linux-2.6.orig/lib/hexdump.c	2022-04-24 18:51:20.000000000 +0200
+++ linux-2.6/lib/hexdump.c	2022-04-25 13:15:26.000000000 +0200
@@ -22,15 +22,33 @@ EXPORT_SYMBOL(hex_asc_upper);
  *
  * hex_to_bin() converts one hex digit to its actual value or -1 in case of bad
  * input.
+ *
+ * This function is used to load cryptographic keys, so it is coded in such a
+ * way that there are no conditions or memory accesses that depend on data.
+ *
+ * Explanation of the logic:
+ * (ch - '9' - 1) is negative if ch <= '9'
+ * ('0' - 1 - ch) is negative if ch >= '0'
+ * we "and" these two values, so the result is negative if ch is in the range
+ *	'0' ... '9'
+ * we are only interested in the sign, so we do a shift ">> 8"; note that right
+ *	shift of a negative value is implementation-defined, so we cast the
+ *	value to (unsigned) before the shift --- we have 0xffffff if ch is in
+ *	the range '0' ... '9', 0 otherwise
+ * we "and" this value with (ch - '0' + 1) --- we have a value 1 ... 10 if ch is
+ *	in the range '0' ... '9', 0 otherwise
+ * we add this value to -1 --- we have a value 0 ... 9 if ch is in the range '0'
+ *	... '9', -1 otherwise
+ * the next line is similar to the previous one, but we need to decode both
+ *	uppercase and lowercase letters, so we use (ch & 0xdf), which converts
+ *	lowercase to uppercase
  */
-int hex_to_bin(char ch)
+int hex_to_bin(unsigned char ch)
 {
-	if ((ch >= '0') && (ch <= '9'))
-		return ch - '0';
-	ch = tolower(ch);
-	if ((ch >= 'a') && (ch <= 'f'))
-		return ch - 'a' + 10;
-	return -1;
+	unsigned char cu = ch & 0xdf;
+	return -1 +
+		((ch - '0' +  1) & (unsigned)((ch - '9' - 1) & ('0' - 1 - ch)) >> 8) +
+		((cu - 'A' + 11) & (unsigned)((cu - 'F' - 1) & ('A' - 1 - cu)) >> 8);
 }
 EXPORT_SYMBOL(hex_to_bin);
 
Index: linux-2.6/include/linux/kernel.h
===================================================================
--- linux-2.6.orig/include/linux/kernel.h	2022-04-21 17:31:39.000000000 +0200
+++ linux-2.6/include/linux/kernel.h	2022-04-25 07:33:43.000000000 +0200
@@ -285,7 +285,7 @@ static inline char *hex_byte_pack_upper(
 	return buf;
 }
 
-extern int hex_to_bin(char ch);
+extern int hex_to_bin(unsigned char ch);
 extern int __must_check hex2bin(u8 *dst, const char *src, size_t count);
 extern char *bin2hex(char *dst, const void *src, size_t count);