On Fri, 18 Jun 2021 at 01:51, Gerd Hoffmann <kraxel@xxxxxxxxxx> wrote:
>
>   Hi,
>
> > The problem with this is that we are taking a fairly fuzzy data set
> > and making it much easier to track individual users in ways seen as
> > problematic by various laws and regulations.
>
> Well, depends on how you store the data.  You can store one record per
> machine (with all properties in there), or you can store one record per
> property per machine.
>
> With the latter you basically kill queries on subgroups (like "how many
> x86_64-v3 machines use UEFI?") because that grouping information is gone
> if you store each and every little piece of information in its own
> record.  But it'll also be much harder to do fingerprinting on such a
> database ...
>
> Standard disclaimer: IANAL.
>

The problem with IANAL is that we all come up with great solutions
which seem to match the single document we read. However, the law is an
interpreted language where every court is a slightly different
architecture with different libraries, which have to be slowly
interpreted and patched at the top level. This means you end up finding
out that the document and 2500 years of legal rulings have to be
interpreted together.

The way things are currently interpreted, it doesn't matter that you
stored it differently; it matters that you collected it, mainly because
there is a long history of people finding ways to de-anonymize data,
people lying about anonymizing it, and people somehow collecting the
data in the middle. Because of that, you end up having to delete all the
data when someone asks to be deleted, because you can't prove whether
this record/count was their system or not.

In general, we computer people like to dive in, just collect data, and
go about doing analysis. The various privacy laws are written to make
us do a LOT of hard work before we start doing that.
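
To make the two storage layouts from the quoted message concrete, here is a minimal Python sketch. The property names, the sample reports, and the helper functions are all made up for illustration; none of this comes from an actual Fedora tool. It only shows the query trade-off Gerd describes, and says nothing about whether either layout is acceptable legally.

```python
from collections import Counter

# Hypothetical reports; property names and values are illustrative only.
reports = [
    {"arch": "x86_64-v3", "firmware": "UEFI"},
    {"arch": "x86_64-v3", "firmware": "BIOS"},
    {"arch": "x86_64-v2", "firmware": "UEFI"},
    {"arch": "x86_64-v3", "firmware": "UEFI"},
]


def store_per_machine(reports):
    """Layout 1: one record per machine, all properties kept together."""
    return [dict(r) for r in reports]


def store_per_property(reports):
    """Layout 2: one record per property per machine, with no machine link.

    Only per-value counts survive; which values occurred together is lost.
    """
    counts = Counter()
    for r in reports:
        for key, value in r.items():
            counts[(key, value)] += 1
    return counts


def count_subgroup(per_machine, **wanted):
    """Count machines matching every wanted property (needs layout 1)."""
    return sum(
        all(rec.get(k) == v for k, v in wanted.items()) for rec in per_machine
    )


per_machine = store_per_machine(reports)
per_property = store_per_property(reports)

# Layout 1 can answer "how many x86_64-v3 machines use UEFI?" directly.
v3_uefi = count_subgroup(per_machine, arch="x86_64-v3", firmware="UEFI")

# Layout 2 only yields the separate totals, not their overlap.
v3_total = per_property[("arch", "x86_64-v3")]
uefi_total = per_property[("firmware", "UEFI")]
```

With the second layout, the separate totals for `x86_64-v3` and `UEFI` are available, but the overlap between them cannot be recovered from the counts alone. That is the lost grouping information mentioned in the quote, and also why fingerprinting gets harder.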
You end up spending a lot of time with lawyers versed in European,
Brazilian, and various other countries' laws, regulations, and past
history to figure out what you can collect, how you can collect it, how
you are going to delete it, and how you are going to inform people that
things are happening, and to set up clear processes that are followed.
Then you can start writing the code. While doing that, you have to keep
reviewing the code to make sure it still meets current rulings. [Doing
it another way ends up with you writing code and then either finding you
have to delete it all or waiting months for an approval before rolling
it out.]

> take care,
>   Gerd
> _______________________________________________
> devel mailing list -- devel@xxxxxxxxxxxxxxxxxxxxxxx
> To unsubscribe send an email to devel-leave@xxxxxxxxxxxxxxxxxxxxxxx
> Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
> List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
> List Archives: https://lists.fedoraproject.org/archives/list/devel@xxxxxxxxxxxxxxxxxxxxxxx
> Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure

--
Stephen J Smoogen.
I've seen things you people wouldn't believe. Flame wars in
sci.astro.orion. I have seen SPAM filters overload because of Godwin's
Law. All those moments will be lost in time... like posts on BBS...
time to reboot.