Again, the example was merely intended to explain what the analysis you presented was missing.
Regarding the actual overhead, there's no need to recalculate since it's in the paper ;)
Specifically, if you look at Figure 13, this is the *empirical* load on *netspeed=1/1000* servers for the strawman solution in which Khronos watchdog queries are performed at the same time granularity as NTPv4. So this number should be divided by F. As you can see, the actual load will be orders of magnitude lower than your calculation. Higher choices of F (or a lower bound on the netspeed of the participating servers) would decrease the load even further while still obtaining much better security than NTPv4.
On Wed, Jul 12, 2023 at 12:47 PM Miroslav Lichvar <mlichvar@xxxxxxxxxx> wrote:
On Wed, Jul 12, 2023 at 12:19:18PM +0300, Michael Schapira wrote:
> I believe that the issue is that your analysis does not take into account
> how many servers of each type there are. To illustrate this point, let's
> revisit your example. Suppose (for illustration purposes only) that there
> are 1000 servers with netspeed 1/1000 of the maximum and a single server
> with the maximum netspeed. Let's denote the total load by X. With NTPv4,
> each of the "slow" servers should experience (1/2000)X load while the
> "fast" server should experience X/2 load. As you said, if F=10 (the
> frequency of Khronos watchdog queries), the increase in load induced by
> Khronos here is ~0.3X. This increase is distributed equally, and so each
> server now carries ~0.3X/1000 of the additional load (around 60% increase
> on the slow servers).
>
> In the NTP pool, the fraction of servers with low netspeed (say <50)
> constitutes a very considerable fraction of the overall pool, which is why
> the disastrous scenario you mentioned does not occur (see Figure 12).
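The arithmetic in the illustrative example above can be checked with a short sketch. It assumes, as the example does, that NTPv4 load is shared in proportion to netspeed and that the extra Khronos traffic (0.3X at F=10) is spread evenly over all servers. The function name `relative_increase` is hypothetical, introduced only for this illustration.

```python
def relative_increase(netspeeds, extra_fraction):
    """Per-server load increase relative to each server's NTPv4 baseline.

    netspeeds: pool weights, one per server (baseline share = weight / sum).
    extra_fraction: added global traffic as a fraction of the baseline total,
        assumed here to be spread evenly over all servers.
    """
    total = sum(netspeeds)
    extra_per_server = extra_fraction / len(netspeeds)
    return [extra_per_server / (w / total) for w in netspeeds]


# Illustrative pool from the message: 1000 slow servers and one fast server.
pool = [1 / 1000] * 1000 + [1.0]
increase = relative_increase(pool, extra_fraction=0.3)

print(f"slow server: {increase[0]:.0%}")   # slow server: 60%
print(f"fast server: {increase[-1]:.2%}")  # fast server: 0.06%
```

The same function accepts any netspeed list, so other pool compositions can be tried by changing `pool`.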
According to Figure 12, the fraction of fastest servers is about
20%, which is about 200 times more than in your calculation, which gets only
a 60% increase. As a conservative estimate, if there were 800 servers
with a netspeed of 3 and 200 servers with a netspeed of 1000, each of
the 800 slower servers would be getting only 3/202400 of the global
traffic. A 30% increase in the global traffic divided by 1000 servers
is 0.03% per server, which is an increase of about 2024% for the slower
servers.
If we used the whole distribution from Figure 12, the increase
would be even higher.
The Khronos interval needs to be 1000x longer than the NTP interval to
get that number to acceptable levels. This needs to be explained in
the draft.
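To see how the per-server increase depends on the Khronos interval, here is a rough sketch computed directly from the stated pool composition (800 servers at netspeed 3, 200 at netspeed 1000). It assumes the extra global traffic is inversely proportional to F and equals 30% at F=10, and that it is spread evenly over all servers. The function name `slow_server_increase` and its parameters are hypothetical; this is a simplified model, not something taken from the draft.

```python
def slow_server_increase(f, n_slow=800, w_slow=3, n_fast=200, w_fast=1000,
                         extra_at_f10=0.3):
    """Relative load increase on one slow server for Khronos factor f.

    Assumptions (not from the draft): extra global traffic scales as 1/f,
    calibrated to extra_at_f10 at f=10, and is divided evenly among servers.
    """
    total_weight = n_slow * w_slow + n_fast * w_fast   # 202400 by default
    baseline_share = w_slow / total_weight             # NTPv4 share of one slow server
    extra_global = extra_at_f10 * 10 / f               # extra traffic as fraction of total
    extra_per_server = extra_global / (n_slow + n_fast)
    return extra_per_server / baseline_share


for f in (10, 100, 1000, 10_000):
    print(f"F={f:>6}: slow-server load increase = {slow_server_increase(f):.1%}")
```

Under this model the relative increase scales as 1/F, so lengthening the Khronos interval by a factor of 1000 divides the F=10 figure by 1000.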
--
Miroslav Lichvar