Re: F30: System-Wide Change proposal: DNF UUID

Lennart Poettering <mzerqung@xxxxxxxxxxx> · Tue, 8 Jan 2019 09:43:01 +0100

On Mo, 07.01.19 16:58, Stephen John Smoogen (smooge@xxxxxxxxx) wrote:

> > I wonder if it is worth introducing an entirely new tracking concept
> > here if you actually don't want to track but just count. The NTP
> > approach has the benefit that you introduce no new tracking concept at
> > all, but you just use the data that is pretty much generated
> > anyway. It also makes this all feel less one-sided, after all you
> > provide them with a deal: fedora gives the user correct time, the user
> > is therefore counted.
>
> The problems with NTP are the following:
> 1. The administrative headaches of regular
> blocks/takedowns/ill-advised security emails because our servers are
> attacking someone's box on port 123. [As funny as this sounds, getting
> regular angry emails from some site whose security tool has decided
> that 123/{tcp,udp} is a major threat still occurs. ]

This is not realistic. NTP is not really just an option, it's pretty
much a must-have in today's Internet. You cannot properly validate
SSL certs if you don't have correct time, which shuts you out of a
good part of the Internet, including typical download servers that do
https.

If a system doesn't do NTP then it will certainly encounter a lot more
problems than are created by switching from NTP pool servers to Fedora
servers by default.

Moreover, afair we install and enable NTP clients by default on all
our installations, no? Just like pretty much any other OS these days
does... Counting by NTP mostly just means switching from NTP pool
servers to Fedora's own servers.

> 2. NTP bandwidth while small per system grows a lot as you wrack up
> servers randomly checking in. Having a pool of servers around the
> world would require us to get NTP GPS clocks, getting the datacenters
> to put the antenae out and a bunch of other items. [The budget for
> this is non-zero.]

Nah, that's not how NTP works, you don't have to have a "GPS clock",
you can simply replicate the time of a set of upstream servers, that's
totally OK.

I am pretty sure Ubuntu doesn't have any fancy hw for this either,
they just provide some servers that propagate NTP pool time I figure.

> 3. Logging NTP does not cover the problem the UUID is trying to help
> solve.. there are two places where we undercount and overcount
> systems.
>  a. systems behind nat firewalls all show up as 1 ip address. ntp or
> yum or gnome-hotspot ask multiple times during a day.. but not a set
> number. Just looking at my 3 home systems I see around 1 to 80
> connections depending on what i have done that day.

The amount of traffic within a time window is linear in the number of
hosts behind that IP address. It's relatively easy to estimate that
there are 5 clients behind an IP address if you get 5 NTP request
datagrams within one protocol iteration instead of just one...
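
To illustrate the estimate, here is a minimal sketch in Python. It
assumes every client sends exactly one request per poll interval; the
interval value, the log format and the function name are made up for
the example and are not anything Fedora actually runs.

```python
from collections import Counter

# Assumption: each client sends one request per poll interval.
# The value is hypothetical; real clients adapt their interval.
POLL_INTERVAL = 1024  # seconds


def estimate_hosts(requests, window_start, poll_interval=POLL_INTERVAL):
    """Estimate how many clients sit behind each source IP.

    requests: iterable of (timestamp_seconds, source_ip) tuples.
    Only datagrams inside [window_start, window_start + poll_interval)
    are counted, so under the one-request-per-interval assumption the
    per-IP count equals the number of clients behind that IP.
    """
    window_end = window_start + poll_interval
    counts = Counter(
        ip for ts, ip in requests if window_start <= ts < window_end
    )
    return dict(counts)
```

Summing the per-IP counts over one window gives a total that does not
depend on how the clients are spread over addresses.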

>  b. systems on short lived dhcp ranges. multiple major isps use
> various methods which make a system look like multiple boxes. The
> system will show up as 123.45.67.89 and 2 minutes later the same
> system will be 89.76.54.123 [made up ip addresses.. but various
> carriers seem to do this.]

Well, this breaks TCP, hence sure, systems will do that, but not
constantly. And all Fedora needs are estimates, and if you break
things down to some time window granularity you should be able to deal
with such IP renumbering games just fine.

> 4. NTP is a high security problem when you concentrate it to a set of
> servers. These become servers that everyone wants to hack even more
> than build systems. These problems range from DDOS to active hacks.

Uh, well, the major NTP servers tend to be pretty well tested and
fuzzed these days, and they can be sandboxed efficiently, since they
involve no big stack but only trivial SOCK_DGRAM traffic. I see no
reason whatsoever for them to be less secure than a hand-written HTTP
service that only Fedora runs and doesn't get all the validation love
the NTP servers get...

> 5. Which leads to us being in charge of the security of every kerberos
> and SSL session which relies on our clocks to be available and in
> sync. That leads to other administrative headaches where sites will
> complain that our servers broke one of those because the services was
> DDOS'd, ASN rerouted, off by N amount, UDP replayed etc.

Well, people generally already rely on entirely random people
participating in the NTP pool project to run the servers for them. If
people can depend on that they should easily be willing to depend on
Fedora for this too... I mean, they have to trust Fedora a lot more
*anyway*, since we provide them with their frickin compiled
programs...

I mean, I can see reasons why doing the NTP thing is not a good idea
for Fedora (for example: nobody willing to maintain NTP servers but
enough people commit to maintaining some other solution), but I doubt
the technical points you raise above are really valid...

> > BTW, iirc intel used to count installations through the http ping
> > check in their captive portal detection. Fedora runs a similar service
> > which is used by NM, no? maybe that's a nicer solution too: add a http
> > header field to the ping check that each client sets to "1" on one of
> > these ping checks a day, and "0" all other times. Then you count how
> > many non-zero ping checks you get within a 24h window and you have a
> > really good idea how many users you have. All without any explicit
> > tracking. And again this appears to me is a much better deal to me
> > than the uuid/dnf check that has been proposed, as you can say "we
> > provide you with ping check functionality therefore we count you":
> > both sides get something out of it.
>
> We do this but have I have found it to have problems with the NAT over
> NAT.. where we know a system should show up 288 times in a day.. but
> have seen multiple class C where every IP address shows up 1-8 times,
> but spread over a day. Are these groups of  255 systems only on for a
> short time (which could be true with certain CI environments) or is it
> 1 system getting a short ip address lifetime?

Again, if you teach each system to only send one "1" ping per 24h and
only "0" pings otherwise it doesn't matter how many systems are behind
a single IP, because the numbers of "1" pings per 24h window tell you
pretty exactly how many systems there were...
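
A rough sketch of that scheme in Python, under the assumption that
the client keeps a tiny bit of local state (the day it last sent a
"1"). The class and function names and the log format are hypothetical
and only serve the illustration.

```python
from datetime import date


class PingClient:
    """Client side: remembers the last day on which it sent a "1"."""

    def __init__(self):
        self.last_counted = None

    def header_value(self, today):
        # First ping check of the day carries "1", all later ones "0".
        if self.last_counted != today:
            self.last_counted = today
            return "1"
        return "0"


def count_systems(pings, day):
    """Server side: count the "1" pings seen on the given day.

    pings: iterable of (day, source_ip, header_value) tuples.
    The source IP is ignored on purpose, so NAT and IP renumbering
    do not affect the result.
    """
    return sum(1 for d, _ip, value in pings if d == day and value == "1")
```

Three systems behind one NAT address that ping 1, 8 and 288 times on
the same day are counted as three, regardless of how often each one
checks in.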

Lennart

--
Lennart Poettering, Red Hat