> On Mon, 5 Nov 2018 at 09:56, Anatoli Babenia <anatoli(a)rainforce.org> wrote:
>
> I think the page should be archived/removed. Mainly because a lot of
> the questions people want answers for usually also get in the way of
> people wanting privacy.

I agree that the page in its current state is not useful, but why do you
propose to censor information about how Fedora handles privacy instead
of explaining it on a case-by-case basis?

Without statistics, people have very little to align their view of the
world on before taking joint action. For example, with qdigidoc stats we
could try to get some funding for Fedora development from the EU. It
could also be an opt-in feature like https://popcon.debian.org/

I am not saying that the stats reflect anything accurately, but with
some adjustments they can still be useful.

> Currently there is no way to know what
> packages are being installed/downloaded the most. yum and dnf
> downloads do not provide those answers on purpose (it would require more
> computational power on the servers than we have and it can't be easily
> made anonymous). The data we can get is only basic information like
> 'what version of yum/dnf used', 'what arch was asked for', 'what was
> the version of Fedora/EPEL wanted' and 'what was the public ip
> address'. This loses all kinds of additional information and masks
> things like proxies, mock builds, etc which inflate/deflate numbers in
> different ways.

Just a hypothesis. If HTTPS/SSH and the dnf protocol use fixed-size
packets, and encryption increases the size proportionally, then I can
guess the combination of packages being installed from the time of the
request and the request size, so encryption doesn't help to hide that.

Recording IPs is a big deal on its own. But for stats it can be replaced
with just an incrementing counter. And you also forgot to mention
virtual machines and containers, which also inflate the numbers.
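
To make the size hypothesis a bit more concrete, here is a minimal
sketch of the guess I have in mind. Everything in it is an assumption
for illustration: the package names and byte sizes are made up, the
`candidate_sets` helper is hypothetical, and it ignores protocol and
encryption overhead apart from an optional tolerance. It only shows
that an observer who knows the public package sizes can brute-force
which combinations add up to an observed transfer size.

```python
from itertools import combinations

# Hypothetical package names and download sizes in bytes.
# These numbers are invented for illustration, not taken from any repo.
PACKAGE_SIZES = {
    "pkg-a": 120_000,
    "pkg-b": 340_500,
    "pkg-c": 87_250,
    "pkg-d": 512_000,
}


def candidate_sets(observed_bytes, sizes, tolerance=0):
    """Return all package combinations whose summed size is within
    `tolerance` bytes of the observed transfer size."""
    names = sorted(sizes)
    matches = []
    # Brute force over every non-empty subset; fine for a toy example,
    # exponential in the number of packages.
    for r in range(1, len(names) + 1):
        for combo in combinations(names, r):
            total = sum(sizes[name] for name in combo)
            if abs(total - observed_bytes) <= tolerance:
                matches.append(combo)
    return matches


if __name__ == "__main__":
    # An observer sees a transfer of 207250 bytes and asks which
    # package sets could explain it.
    print(candidate_sets(207_250, PACKAGE_SIZES))
```

With a real repository the search space is much larger and many
combinations would collide, so this says nothing about how practical
the guess is, only why hiding the payload alone may not hide the size.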
I don't believe that anybody currently has an incentive to inflate the
usage numbers for `qdigidoc` above real usage. Even if somebody did, the
guys on the other side can validate the data against the number of
sessions with unique ID cards coming from Fedora to their servers.
That's the whole point of it: making the first step and passing the
ball to the other side.

Also, for file-serving mirrors I'd expect the bottleneck to be
bandwidth, not processing power. Storing IPs for each request can be
inefficient, but can we get some statistics about that? I could not find
any example mirror at https://nagios.fedoraproject.org/nagios/

> That page is even older than the one you pointed to and should also be
> archived/removed. We are probably on Statistics 5.0
>
>
> I am sorry but there is no way to answer that question.

I want to believe, but because you touched my paranoia from the start:
is there a dump of a client-server session, with logs, that could be
used for a proper privacy audit? Now I need to feed the lawyer inside. :D