Hi Matthew,

On Sun, Feb 19, 2023 at 04:31:33AM +0000, Matthew Wilcox wrote:
>
> I think an overhaul of the swap code is long overdue. I appreciate
> you're very much focused on zswap, but there are many other problems.
> For example, swap does not work on zoned devices. Swap readahead is
> generally physical (ie optimised for spinning discs) rather than logical
> (more appropriate for SSDs). Swap's management of free space is crude
> compared to real filesystems. The way that swap bypasses the filesystem
> when writing to swap files is awful. I haven't even started to look at

Can you expand a bit on that? I assume you want the swap file to behave
more like a normal file system and reuse more of the readpage() and
writepage() paths.

> what changes need to be made to swap in order to swap out arbitrary-order
> folios (instead of PMD-sized + PTE-sized).

When the page fault happens, does the whole folio get swapped in, or
does it get broken into smaller pages?

> I'm probably not a great person to participate in the design of a
> replacement system. I don't know nearly enough about anonymous memory.
> I'd be sitting in the back shouting unhelpful things like, "Can't you
> see an anon_vma is the exact same thing as an inode?" and "Why don't
> we steal the block allocation functions from XFS?" and "Why do tmpfs

I notice the swap_map has one byte per swap entry even if the swap is
not used.

> pages have to move to the swap cache; can't we just leave them in the
> page cache and pass them to the swap code directly?"

All great suggestions and I am very interested in that.

Chris