On Thu 07-03-24 13:06:27, Jared Hulbert wrote:
> On Thu, Mar 7, 2024 at 9:35 AM Jan Kara <jack@xxxxxxx> wrote:
> >
> > Well, but then if you fill in space of a particular order and need to swap
> > out a page of that order what do you do? Return ENOSPC prematurely?
> >
> > Frankly as I'm reading the discussions here, it seems to me you are trying
> > to reinvent a lot of things from the filesystem space :) Like block
> > allocation with reasonably efficient fragmentation prevention, transparent
> > data compression (zswap), hierarchical storage management (i.e., moving
> > data between different backing stores), efficient way to get from
> > VMA+offset to the place on disk where the content is stored. Sure you still
> > don't need a lot of things modern filesystems do like permissions,
> > directory structure (or even more complex namespacing stuff), all the stuff
> > achieving fs consistency after a crash, etc. But still what you need is a
> > notable portion of what filesystems do.
> >
> > So maybe it would be time to implement swap as a proper filesystem? Or even
> > better we could think about factoring out these bits out of some existing
> > filesystem to share code?
>
> Yes. Thank you. I've been struggling to communicate this.
>
> I'm thinking you can just use existing filesystems as a first step
> with a modest glue layer. See the branch of this thread where I'm
> babbling on to Chris about this.
>
> "efficient way to get from VMA+offset to place on the disk where
> content is stored"
> You mean treat swapped pages like they were mmap'ed files and use the
> same code paths? How big of a project is that? That seems either
> deceptively easy or really hard... I've been away too long and was
> never really good enough to have a clear vision of the scale.

Well, conceptually it is easy to consider anonymous VMA as a mapping of some
file (anon pages are the page cache of this file) and swapout is just
writeback event.
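
As a rough illustration of that analogy only, here is a toy user-space model
in Python. It is not kernel code, and every name in it (BackingFile,
AnonMapping, reclaim, PAGE_SIZE) is hypothetical. Resident anonymous pages
are kept in a per-mapping "page cache", swapout is writeback of a dirty page
followed by eviction, and swap-in is an ordinary cache miss:

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only


class BackingFile:
    """Stands in for the 'file' behind an anonymous VMA (i.e. swap space)."""

    def __init__(self):
        self.blocks = {}  # page offset -> bytes stored "on disk"

    def write_page(self, offset, data):
        self.blocks[offset] = bytes(data)

    def read_page(self, offset):
        # A page that was never written back reads as zeroes.
        return self.blocks.get(offset, bytes(PAGE_SIZE))


class AnonMapping:
    """An anonymous VMA viewed as a mapping of a BackingFile."""

    def __init__(self, backing):
        self.backing = backing
        self.page_cache = {}  # page offset -> bytearray (resident pages)
        self.dirty = set()    # offsets modified since the last writeback

    def write(self, offset, data):
        self.page_cache[offset] = bytearray(data)
        self.dirty.add(offset)

    def read(self, offset):
        # "Swap-in" is just a page cache miss served from the backing file.
        if offset not in self.page_cache:
            self.page_cache[offset] = bytearray(self.backing.read_page(offset))
        return bytes(self.page_cache[offset])

    def reclaim(self, offset):
        """Swapout: write the page back if dirty, then drop it from the cache."""
        if offset in self.dirty:
            self.backing.write_page(offset, self.page_cache[offset])
            self.dirty.discard(offset)
        self.page_cache.pop(offset, None)
```

The model deliberately ignores everything that makes the real problem hard:
where on disk a page lands, locking, and memory allocation during reclaim.
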
But I suspect the details are going to get hairy with this concept -
filesystems are generally optimized for handling large contiguous blocks
whereas anon memory is much more random access. Also, filesystem code does
not expect to be run from reclaim context, so locking and memory demands
might be a problem. So although the unification of anon and file backed
memory is intriguing, I didn't mean to go *this* far :) I rather meant we
could export some functional blocks, like the block allocator, from some
filesystem as a library that swap code could use.

								Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR