On Sun, Jan 13, 2019 at 11:54 AM Stephen John Smoogen <smooge@xxxxxxxxx> wrote:
>
> On Sat, 12 Jan 2019 at 22:25, Nico Kadel-Garcia <nkadel@xxxxxxxxx> wrote:
>>
>> On Fri, Jan 11, 2019 at 4:37 PM Roberto Ragusa <mail@xxxxxxxxxxxxxxxx> wrote:
>> >
>> > On 1/8/19 4:22 PM, Lennart Poettering wrote:
>> >
>> > > If all you want to do is count, then it should be entirely sufficient
>> > > to do it like this:
>> > >
>> > > GET /metalink?repo=fedora-28&arch=x86_64&edition=<blah>&countme=1 HTTP/1.1
>> > >
>> > > the first time within each one-week window and a simple
>> > >
>> > > GET /metalink?repo=fedora-28&arch=x86_64&edition=<blah> HTTP/1.1
>> > >
>> > > all other times.
>> >
>> > As an additional improvement, is it really needed to count every machine?
>> > We can subsample a lot, and only let some specific machines to show
>> > up for counting.
>>
>> The difficulty is not the counting. Requiring safe counting and
>> aggregation by the server is a requirement that no server or
>> intermediate server or proxy needs to follow, and would require
>> configuration or filtering control of a server that is outside of
>> client hands. It's not legally or technologically mandated. The great
>> use for the data is tracking hosts, metadata that is saleable and
>> likely to help provide a new form of tracking information.
>>
>> Writing this into the dnf behavior is typical, but it's not
>> beneficial to the clients. It's beneficial to the mirrors, who are
>> likely to sell the data. While it may be that infamous problem, a
>> "Simple Matter Of Programming(tm)" to sanitize the data, there are
>> strong motivations to collect it and sell it, and I'd expect various
>> mirrors to start doing so within moments of the activation of the
>> feature.
>
> 1. The mirrors do not see this.

If it's not available to the mirrors, then anyone who hardcodes a
mirror's URL into the local "baseurl" settings is not going to be
counted this way, and we're back at the "we don't know how many clients
there are" problem. That holds if only the "mirrorlist" hosts see the
UUID, "countme", or any other such client ID.

> 2. We aren't talking about UUIDs anymore and just a countme variable
> being sent periodically. If a countme is going to be too much data to
> send, then clients are probably already sending way too much data
> already.

Then can we change the title of the thread?

If the "countme" variable is unique and sent only to the host providing
the mirrorlist, it's tracking data. That host becomes responsible for
anonymization, and it is *too late* unless the data is encrypted at the
client, say with the GPG key of the relevant repository, and that starts
requiring GPG private keys on the host providing the mirrorlist. If it's
going across the wire, even with SSL, man-in-the-middle is an old, old
problem. Whether or not the mirrorlist back-end software is promised to
be sanitized, it's tracking data.

Sadly, I've been through this in other venues. The data was considered
"safe" because it was "anonymized". Except that the original web traffic
was tappable, along with IP addresses and unique client information. A
subpoena, a Patriot Act request, or even a foreign worker with an H-1B
visa reporting back to foreign intelligence or a technology competitor
could obtain a great deal of trackable data.

Am I paranoid? Yes. Am I paranoid *enough*? I'm not so sure: we've seen
assembly of pseudonymous data and metadata throughout the history of
intelligence work. Demanding it, and handling it safely, is often an
exercise in people claiming "no one would do that!" and "no one would
bother to investigate that", and in people misusing it as a matter of
course. Given such concerns, I'd suggest it's not even worth the effort
to demand or to collect.
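
For reference, the counting scheme quoted at the top of this message can
be sketched roughly as below. This is only an illustration in Python, not
dnf's actual implementation: the function names, the choice of fixed
epoch-aligned one-week windows, and the "workstation" edition value are
all assumptions (the quoted proposal leaves the edition as <blah>).

```python
from urllib.parse import urlencode

WEEK_SECONDS = 7 * 24 * 60 * 60


def week_window(timestamp):
    """Index of the fixed one-week window containing `timestamp` (epoch seconds).

    Assumption: windows are aligned to the Unix epoch. The quoted proposal
    only says "each one-week window" without fixing the alignment.
    """
    return int(timestamp // WEEK_SECONDS)


def metalink_request(repo, arch, edition, now, last_counted_window):
    """Build the metalink request path for one repository refresh.

    Returns (path, new_last_counted_window). `countme=1` is appended only
    on the first request made within the current one-week window.
    """
    params = [("repo", repo), ("arch", arch), ("edition", edition)]
    current = week_window(now)
    if last_counted_window != current:
        params.append(("countme", "1"))
        last_counted_window = current
    return "/metalink?" + urlencode(params), last_counted_window


# Hypothetical client state: no request has been counted yet.
state = None
first, state = metalink_request("fedora-28", "x86_64", "workstation", 0, state)
second, state = metalink_request("fedora-28", "x86_64", "workstation", 3600, state)
print(first)
print(second)
```

Note that only the host answering the metalink request ever sees the
countme parameter in this sketch. A client configured with a hardcoded
"baseurl" never builds this request at all, which is the undercounting
problem described above.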

Nico Kadel-Garcia <nkadel@xxxxxxxxx>