Re: F40 Change: Privacy-preserving Telemetry for Fedora Workstation (System-Wide)

Hi,


On 7/6/23 11:10, Aoife Moloney wrote:
Important process note: we are experimenting with using Fedora
(trimming stuff because this proposal is huge)


We intend to deploy the Endless OS metrics system.
[https://blogs.gnome.org/wjjt/2023/07/05/endless-oss-privacy-preserving-metrics-system/
This blog post] contains a description of how the system works. We do
not plan to deploy the eos-phone-home component in Fedora.

So, the following is just _my_ opinion; don't read more into it than that:


Having finally had a chance to look at the list of collected metrics, I'm a bit worried about just how much information is being, or can be, gathered by the project, as well as how frequently it is gathered.

Personally, I think it would benefit Fedora if questions such as "is anyone actually using this hardware/driver/package" could be answered. OTOH, the metrics presented above go far beyond that. I'm not sure why it's necessary to know how many times, or for how long, a particular application is used.



=== How will data collection be approved? ===

The proposal owners feel it is essential to ensure the Fedora
community has ultimate oversight over metrics collection. Community
control is required to maintain user trust. If this change proposal is
approved, then we'll need new policies and procedures to ensure
community oversight over metrics collection and ensure Fedora users
can be confident that our metrics collection does not violate their
privacy.

So, I would suggest that the intended metrics, as well as the collection interval, be included as part of this proposal, and that they not be changed without further community approval. Doing this would go a long way toward convincing me, and likely others, that it's not worth the effort to manually rip the entire subsystem out of Fedora on my machines at the first chance.

If there is to be a "process" for changing them, then I think that needs to be documented here too, rather than hand-waved away.


We can say "we would never collect personally-identifiable data" and
write software that really doesn't collect any such data, but this
alone will never be enough to ensure user confidence. We will need a
metrics collection policy that describes what sort of data may be
collected by Fedora (anonymous, non-invasive), and what sort of data
may not be collected. Such a policy does not exist currently. We will
also want to ensure the Fedora community has ultimate control over
which particular metrics are collected. One option is that each metric
to be collected should be separately approved by FESCo. Collection of
particular metrics in a particular data format is ultimately an
engineering decision, and therefore FESCo seems like an appropriate
approval point. Because FESCo members are elected regularly by the
Fedora community, this also provides the community with ultimate
control over metrics collection via the election process. But other
oversight and approval structures would work too.

=== What data might we collect? ===

We are not proposing to collect any of these particular metrics just
yet, because a process for Fedora community approval of metrics to be
collected does not yet exist. That said, in the interests of maximum
transparency, we wish to give you an idea of what sorts of metrics we
might propose to collect in the future.

One of the main goals of metrics collection is to analyze whether Red
Hat is achieving its goal to make Fedora Workstation the premier
developer platform for cloud software development. Accordingly, we
want to know things like which IDEs are most popular among our users,
and which runtimes are used to create containers using Toolbx.


IMHO, the data shouldn't be collected more frequently than every 6 months or so, which would allow each collection to be presented to the user rather than just being uploaded in the background. Nor should it track _user_ actions, which I would differentiate from machine state (BIOS machine type, RAM, installed packages, application crashes, failed suspend/resume, that kind of thing).


But given coarse-grained tracking, why isn't it part of Server/IoT/etc. as well, other than the current focus on GNOME? Surely knowing that only one user is running $APPLICATION on a server is useful too.


Metrics can also be used to inform user interface design decisions.
For example, we want to collect the clickthrough rate of the
recommended software banners in GNOME Software to assess which banners
are actually useful to users. We also want to know how frequently
panels in gnome-control-center are visited to determine which panels
could be consolidated or removed, because there are other settings we
want to add, but our usability research indicates that the current
high quantity of settings panels already makes it difficult for users
to find commonly-used settings.

(trimming)

=== User control ===

A new metrics collection setting will be added to the privacy page in
gnome-initial-setup and also to the privacy page in
gnome-control-center. This setting will be a toggle that will enable
or disable metrics collection for the entire system. We want to ensure
that metrics are never submitted to Fedora without the user's
knowledge and consent, so the underlying setting will be off by
default in order to ensure metrics upload is not unexpectedly turned
on when upgrading from an older version of Fedora. However, we also
want to ensure that the data we collect is meaningful, so
gnome-initial-setup will default to displaying the toggle as enabled,
even though the underlying setting will initially be disabled. (The
underlying setting will not actually be enabled until the user
finishes the privacy page, to ensure users have the opportunity to
disable the setting before any data is uploaded.) This is to ensure
the system is opt-out, not opt-in. This is essential because we know
that opt-in metrics are not very useful. Few users would opt in, and
these users would not be representative of Fedora users as a whole. We
are not interested in opt-in metrics.


I also think it's useful here to describe _exactly_ how to disable/remove the component, as well as where the opt-in/out setting is stored in the filesystem, how to change it, and where the log of reported data for a given machine can be retrieved.




To make this a little more confusing, metrics collection is actually
separate from uploading. Collection is always initially enabled, while
uploading is always initially disabled. The graphical toggle enables
or disables both at the same time. That is, a newly-installed Fedora
system will always collect metrics locally at first, but the collected
metrics will be deleted and never submitted to Fedora if the user
disables the metrics collection toggle on the privacy page. If the
user leaves the toggle enabled, then the collected metrics may be
submitted only after finishing the privacy page.
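
As I read it, the collection/upload behaviour described above amounts to roughly the following. This is only a sketch in Python under my own assumptions: the class, field, and method names are made up for illustration and are not taken from the Endless OS metrics code or from anything in the proposal.

```python
from dataclasses import dataclass, field


@dataclass
class MetricsState:
    """Hypothetical model of the described behaviour, not the real implementation."""

    collection_enabled: bool = True    # collection is always initially enabled
    upload_enabled: bool = False       # uploading is always initially disabled
    toggle_shown_enabled: bool = True  # what the privacy page displays by default
    local_events: list = field(default_factory=list)
    uploaded_events: list = field(default_factory=list)

    def record(self, event: str) -> None:
        """Store an event locally, but only while collection is enabled."""
        if self.collection_enabled:
            self.local_events.append(event)

    def set_toggle(self, enabled: bool) -> None:
        """The user flips the graphical toggle on the privacy page."""
        self.toggle_shown_enabled = enabled

    def finish_privacy_page(self) -> None:
        """Apply the toggle: it drives collection and upload together."""
        self.collection_enabled = self.toggle_shown_enabled
        self.upload_enabled = self.toggle_shown_enabled
        if not self.toggle_shown_enabled:
            # Opting out deletes whatever was collected so far.
            self.local_events.clear()

    def try_upload(self) -> int:
        """Submit pending events if permitted; return how many were sent."""
        if not self.upload_enabled:
            return 0
        sent = len(self.local_events)
        self.uploaded_events.extend(self.local_events)
        self.local_events.clear()
        return sent
```

In this model nothing leaves the machine before finish_privacy_page() is called, and a user who switches the toggle off ends up with an empty local store and no uploads. Whether the real daemon behaves exactly this way is what I would like the proposal to spell out.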


(trimmed rest)

Thanks for getting this far.