On 7/20/21 10:26 PM, Vladislav Bolkhovitin wrote: > Hi, > > Great to see those patches coming! After some review, they look to be > very well done. Some comments/suggestions below. > > 1. I strongly recommend to implement DH exponentials reuse (g x mod p / > g y mod p as well as g xy mod p) as specified in section 8.13.5.7 > "DH-HMAC-CHAP Security Requirements". When I was working on TP 8006 I > had a prototype that demonstrated that DH math has quite significant > latency, something like (as far as I remember) 30ms for 4K group and few > hundreds of ms for 8K group. For single connection it is not a big deal, > but imagine AMD EPYC with 128 cores. Since all connections are created > sequentially, even with 30 ms per connection time to complete full > remote device connection would become 128*30 => almost 4 seconds. With > 8K group it might be more than 10 seconds. Users are unlikely going to > be happy with this, especially in cases, when connecting multiple of > NVMe-oF devices is a part of a server or VM boot sequence. > Oh, indeed, I can confirm that. FFDHE calculations are quite time-consuming. But incidentally, ECDH and curve25519 are reasonably fast, so maybe there _is_ a value in having a TPAR asking for them to be specified, too ... > If DH exponential reuse implemented, for all subsequent connections the > DH math is excluded, so authentication overhead becomes pretty much > negligible. > > In my prototype I implemented DH exponential reuse as a simple > per-host/target cache that keeps DH exponentials (including g xy mod p) > for up to 10 seconds. Simple and sufficient. > Frankly, I hadn't looked at exponential reuse; this implementation really is just a first step to get feedback from people if this is a direction they want to go. > Another, might be ever more significant reason why DH exponential reuse > is important is that without it x (or y on the host side) must always be > randomly generated each time a new connection is established. Which > means, for instance, for 8K groups for each connection 1KB of random > bytes must be taken from the random pool. With 128 connections it is now > 128KB. Quite a big pressure on the random pool that DH exponential reuse > mostly avoids. > > Those are the 2 reasons why we added this DH exponential reuse sentence > in the spec. In the original TP 8006 there was a small informative piece > explaining reasonings behind that, but for some reasons it was removed > from the final version. > Thanks for the hint. I'll be adding exponential reuse to the code. > 2. What is the status of this code from perspective of stability in face > of malicious host behavior? Seems implementation is carefully done, but, > for instance, at the first look I was not able to find a code to clean > up if host in not acting for too long in the middle of exchange. Other > observation is that in nvmet_execute_auth_send() > nvmet_check_transfer_len() does not check if tl size is reasonable, > i.e., for instance, not 1GB. > That is true; exchange timeouts are missing. Will be adding them, of course. And haven't thought of checking for tl size overflows; will be adding them, too. > For sure, we don't want to allow remote hosts to hang or crash target. > For instance, because of OOM conditions that happened, because malicious > host asked target to allocate too much memory or open to many being > authenticated connections in which the host is not going to reply in the > middle of exchange. > This is something I'll need to look at, anyway. What we do not want is a userspace application chipping in and send a 'negotiate' command without any subsequent steps, thereby invalidating the existing authentication. > Asking, because don't want to go in my review too far ahead from the > author ;) > > In this regard, it would be great if you add in your test application > ability to perform authentication with random parameters and randomly > stop responding. Overnight running of such test would give us good > degree of confidence that it will always work as expected. > That indeed would be good; let me think on how something like that can be implemented. Cheers, Hannes -- Dr. Hannes Reinecke Kernel Storage Architect hare@xxxxxxx +49 911 74053 688 SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg HRB 36809 (AG Nürnberg), GF: Felix Imendörffer