On Tue, Dec 14, 2021 at 09:48:58AM -0500, Carlos O'Donell wrote:
> The Fedora SOS reports are ~30MiB today, and this exceeds the
> Bugzilla attachment limit of 19.5MiB.

Rather than going straight to raising BZ limits, I think we have some
more basic questions that should be considered first.

First, is it reasonable for the Fedora SOS reports to be such a large
size by default?

Second, is Fedora set up to handle these user SOS reports (which are
likely to contain large amounts of sensitive PII - Personally
Identifiable Information) in a way that respects data protection rules
or best practices? Bear in mind that the Fedora bug tracker defaults to
everything being public unless you remember to tick the 'private' box,
and that an attachment, once uploaded, never expires.

TL;DR, I'm sceptical that Fedora maintainers should be handling
sosreports at all in bugzilla right now.

I just generated a sosreport on my Fedora host and it indeed came out
at the kind of scale you mention - for me 20 MB compressed, and 375 MB
uncompressed.

Mine had 100 MB of 'journalctl' output dating back over 6 months. I
struggle to come up with any common scenarios in which debugging is
going to require 6 months of logs from my server. IMHO most bugs where
logs are relevant would only need 24 hours' worth of data to diagnose.
In the rare cases where more is needed, the user could be asked for
that separately from a sosreport. IOW that 100 MB could easily be a
mere few MB instead.

In another directory it then has another 35 MB of journalctl output,
split across 2 files which are identical, dating back ~1 month to the
last time I booted. Again this feels very excessive. I can understand
wanting to get kernel boot up messages, which might be from quite some
time ago, but that does not imply we need everything in between the
initial boot up and today.

IOW the journal logs alone are 1/3 of the total data size in my
sosreport!

Looking at another random large file, 'ss_-peaonmi.tailed', at 25 MB in
size.
It seems to be listing open socket information. For some bizarre reason
every line in it starts with ~4KB of whitespace, and then about 200
bytes of actual data. IOW, that 25 MB could be less than 0.5 MB if all
the extraneous whitespace wasn't there. At least whitespace compresses
well, but still...

Overall, sosreport sizes feel pretty excessive and have scope for more
tailored data collection, without terribly compromising the usefulness.

On the second question, there is a significant amount of data in the
sosreports that is likely to be sensitive, and this raises questions
about data protection / PII handling policies if we request it from
users in a public facing bug tracker.

The 'sos' tool prints a warning message telling you to analyse the
contents of the report before making it available. This suggestion
doesn't really feel credible when the report contains nearly 400 MB of
data to look at. IMHO we have to assume that any sosreport we receive
will not have been scrubbed and will thus be full of sensitive
information.

Some might say that we already ask users to attach sensitive info to
bugs routinely, so how is this different? To some extent that is
correct. As maintainers we often ask for logs, config files, and so on
that can contain sensitive information. The difference though, IMHO, is
the scale.

A single requested file is easily examined by the bug reporter to check
whether they'd be exposing something sensitive. It is also a
proportionate amount of information to request for the task of
investigating a bug.

A sosreport contains a tonne of useful info, but for any single bug the
vast majority is irrelevant. So it is much harder to argue that
requesting a sosreport is proportionate for solving bugs from Fedora
users, especially when attachments default to public and never expire.
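To put a number on the whitespace padding in a file like
'ss_-peaonmi.tailed', something along these lines would do. This is
only a rough sketch: padding_savings is a made-up helper name, not
anything that ships with sos, and the sample data is synthetic, built
from the approximate figures above (~4KB of padding, ~200 bytes of
data per line).

```python
def padding_savings(lines):
    """Return (original_bytes, stripped_bytes) for an iterable of byte lines.

    Hypothetical helper for illustration; not part of the sos tool.
    """
    original = 0
    stripped = 0
    for line in lines:
        original += len(line)
        # Drop only leading spaces/tabs; keep the data and the line ending.
        stripped += len(line.lstrip(b" \t"))
    return original, stripped


# Synthetic input: 4096 bytes of padding in front of 200 bytes of data
# (199 payload bytes plus the newline), repeated for three lines.
sample = [b" " * 4096 + b"x" * 199 + b"\n"] * 3
before, after = padding_savings(sample)
print(before, after)
```

On a real report the same function can be fed an open file, e.g.
padding_savings(open(path, "rb")), since iterating a binary file yields
its lines as bytes.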
Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|