On 16/02/2021 17:05, Peter Robinson wrote:
>> On 15/02/2021 19:47, Gary Buhrmaster wrote:
>>> On Mon, Feb 15, 2021 at 6:39 PM Dan Horák <dan@xxxxxxxx> wrote:
>>>
>>>> The open question still is whether we should try to keep 64k as default
>>>> as it would allow to find the remaining bugs and offer 4k kernel variant
>>>> (COPR for ppc64le should be coming back soon), similar for the
>>>> installer (a new remix/spin). After BTRFS removes the page size
>>>> dependency, switching the kernels shouldn't cause any issues for users.
>>>
>>> I think it may be instructive to look at the enabling IPv6
>>> had on the entire ecosystem (and going to ipv6-first
>>> networking). Which definitely broke things (and there
>>> remain, in the greater world, lots of things still broken
>>> when IPv6 is enabled). However, if we still used ipv4-first
>>> networking even more would almost certainly still be
>>> broken, because no one would experience or report
>>> the issues with IPv6.
>>>
>>> If you agree that fixing the 64K bugs are important
>>> (and I personally think they are), you need to go
>>> 64K first to get the reports, and get the fixes.
>>
>>
>> The problem is that not all ppc64le bugs are related to page size
>
> Welcome to the world of non x86 architectures.

Welcome or welcome back... I started on a Color Computer 3 with a
Motorola 6809.

>> I was recently looking at ffmpeg issues[1] that happen on any page size,
>> that is now fixed and it also fixes issues in Blender.
>>
>> Going to 4k page size, we effectively drain the swamp to the half-water
>> mark. Some bugs will go away, other bugs will still be there.
>
> It doesn't drain the swap at all, it just changes the water from one to another.

My personal impression is that the combination of the Btrfs page size
dependency and GPUs not working was the darker water. To get around
that, I had to not only compile a kernel but also build a custom
installer image containing that kernel. I'm happy to share both with
other users.

>> The volume of workstation bugs is actually quite intimidating. Even for
>> somebody with a lot of experience, it takes away a certain amount of
>> energy. Some users and maybe even some developers will spend so much
>> time on hacks and workarounds that they have no time or energy left to
>> report the bugs, bisect them or even fix them.
>
> At least you don't have to deal with big endian bugs in there too, and
> a bunch of us that have been working on non x86 architectures for
> years have no doubt solved a number of the problems already. aarch64
> had a mix of 4K (Fedora) and 64K (RHEL) and we've dealt with 100s of
> these already, of course that doesn't rule out POWER specific ones.

Actually, as an upstream developer, when politics doesn't prevent me
from uploading packages to every distribution, I would carefully check
unit test results from all architectures on both Fedora and Debian. If
builds failed on a specific architecture or on big endian, I would make
the effort to support it. But I understand some people spend a far
greater percentage of their time on that than I do, and I'm glad so many
things already just work on POWER9.

>> But I do agree that we can't avoid 64k indefinitely. If there is a way
>> to support both page sizes and run unit tests for all packages on both
>> that would be really useful. In addition to unit tests, it would be
>> useful to have a manual check on Firefox, Thunderbird, LibreOffice, etc
>> before each major release on 64k.
>
> No easily, apparently having something you can set via a kernel
> command line for this stuff isn't straight forward, I started asking
> for that functionality for pages sizes back in the early days of
> aarch64 and I'm still waiting.

I wasn't really thinking about a runtime option. I was thinking about
two completely parallel environments, each with its own copy of
userland compiled on a kernel with the corresponding page size.

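As a rough sketch of how a test run in either environment could record
which page size it ran under, and of the kind of hard-coded 4k
assumption that tends to break on 64k, something like the following
might do. The function names are made up for illustration and are not
taken from any Fedora tooling; only os.sysconf and mmap.PAGESIZE are
standard Python.

```python
import mmap
import os


def host_page_size() -> int:
    """Return the page size of the running kernel in bytes."""
    # os.sysconf exists on Linux and other Unix systems;
    # mmap.PAGESIZE is used as a fallback where it is missing.
    try:
        return os.sysconf("SC_PAGE_SIZE")
    except (AttributeError, ValueError, OSError):
        return mmap.PAGESIZE


def round_up_to_page(nbytes: int, page_size: int) -> int:
    """Round nbytes up to a multiple of page_size (must be a power of two).

    Code that hard-codes 4096 here instead of asking the kernel is a
    typical source of 64k-only failures.
    """
    return (nbytes + page_size - 1) & ~(page_size - 1)


def environment_label(page_size: int) -> str:
    """Hypothetical label for tagging test results, e.g. '4k' or '64k'."""
    return f"{page_size // 1024}k"


if __name__ == "__main__":
    size = host_page_size()
    print(f"page size: {size} bytes ({environment_label(size)})")
```

Tagging every test report with that label would at least make it
obvious which of the two environments a failure came from.
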
Beyond the unit tests, it would also be interesting to use reproducible
builds methods to compare userland binaries and see whether they vary
depending on the page size of the host where they were built. This
could flush out more problems.

>> 64k issues for ppc64le will also get more attention when other
>> architectures go 64k, then we won't have all the pressure on ppc64le
>> users. IPv6 was for every architecture so the effort was spread a lot
>> more widely.
>
> aarch64 also has 64K page sizes so it's already a shared problem, and
> I've dealt with and fixed more bug around that I care to remember but
> you're not the only one that's had to deal with it.

Yes, and I've seen more reports from people trying that platform too,
for example, the recent blog post[1] about the HoneyComb.

Are there other ways we can collaborate on this, for example, with a
wiki page about known 64k issues?

Does anybody want to capture anything from this thread in the wiki page
for the change[2], or is there any other place where it would be useful
to have a summary?

Regards,

Daniel

1. https://fedoramagazine.org/fedora-aarch64-on-the-solidrun-honeycomb-lx2k/
2. https://fedoraproject.org/wiki/Changes/Power4kPageSize