On Thu, 17 Jun 2021 at 12:27, Justin Forbes <jmforbes@xxxxxxxxxxx> wrote:
>
> On Wed, Jun 16, 2021 at 3:23 PM Matthew Miller <mattdm@xxxxxxxxxxxxxxxxx> wrote:
> >
> > On Wed, Jun 16, 2021 at 02:57:17PM +0200, Vitaly Zaitsev via devel wrote:
> > > >We'll at least gather information about capabilities of Fedora
> > > >users hardware.
> > > Telemetry is evil. It must not be allowed.
> >
> > Well, that's certainly A Position. I don't think it's anything nearly so
> > absolute, though, and depends on what, who, how, why, and a host of other
> > things. And "it can help us answer questions like this for our community" is
> > a pretty non-evil "why".
>
> I think there can be a lot of benefit in anonymized hardware data (not
> mandatory). It does help answer questions like this, but more
> importantly, it would make a lot of the kernel work a bit easier, or
> at least more focused. It answers questions like, "should we enable
> these drivers as they are likely to be used?" or "can we disable these
> drivers because no one is using them?". It is also very helpful in
> working out bug priority in drivers. A lot of people never bother
> filing bugs, and are happy to keep booting a known good kernel since
> we allow parallel installs. If we get a few users chiming in, and
> realize the hardware in question is used by a significant chunk of
> users, it would tell me that perhaps that should take priority over a
> bug which impacts hardware with considerably fewer users. Yes, you
> have to be extremely careful about what data you collect, and how that
> data is handled, but if done correctly, there are a lot of benefits.
>

The major problem is that 'anonymized' data does not exist. Pretty much
every method that claims to 'anonymize' data does not, and can still
lead to a strong 'fingerprint' back to an individual or group. The only
methods that truly seem to stop this add so much random noise to the
data that it is 'useless' for whatever analysis you wanted to do.
[AKA you might as well just tell /dev/urandom to give you a couple of
gigabytes of answers.]

This means that any program you use to collect information needs to
assume that its data will have to be regularly purged/cleaned/etc. You
will only be able to take snapshots of very high-level data to compare
against other timeframes. It also has to assume that there are enough
people who do not want to be watched (even if you ask them to volunteer
info) that they will feed you bad data. This is why we had more PDP-11s
in smolt than we had of some valid architectures we shipped.

Once you start finding these limits, you realize you don't want to mix
it with data you need for continual operation of a service. If you do,
you will also lose that data regularly, may be told to turn it off, or
find yourself spending too many resources to keep it. This is the reason
I object to adding too much to mirrorlist data. That is a service which
we need to keep up, and we need to keep some history for general
operations. [We need to know how many resources we are servicing and how
long it takes to respond. Having multiple years of data helped show when
the mirror program began to fall over from too many customers, and that
the change in certain yum cron jobs was increasingly causing issues.]

--
Stephen J Smoogen.
I've seen things you people wouldn't believe. Flame wars in
sci.astro.orion. I have seen SPAM filters overload because of Godwin's
Law. All those moments will be lost in time... like posts on BBS... time
to reboot.
_______________________________________________
devel mailing list -- devel@xxxxxxxxxxxxxxxxxxxxxxx
List Archives: https://lists.fedoraproject.org/archives/list/devel@xxxxxxxxxxxxxxxxxxxxxxx