On Fri, Feb 02, 2024 at 09:09:49AM +0000, David Howells wrote:
> Hi,
>
> The topic came up in a recent discussion about how to deal with large folios
> when it comes to swap as a swap device is normally considered a simple array
> of PAGE_SIZE-sized elements that can be indexed by a single integer.
>
> With the advent of large folios, however, we might need to change this in
> order to be better able to swap out a compound page efficiently. Swap
> fragmentation raises its head, as does the need to potentially save multiple
> indices per folio. Does swap need to grow more filesystem features?

The "file-based swap" infrastructure needs to be converted to use
filesystem direct IO methods. It should not cache the extent list
and do raw direct-to-device IO itself, it should just build an iov
that points to the pages and submit that to the filesystem DIO
read/write path to do the mapping and submission to disk.

If we tell the dio subsystem that it is IOCB_SWAP IO, then we can do
things like ignore unwritten bits in the extent mappings so we don't
have to do transactions to avoid unwritten conversion on write or do
timestamp updates on the inode...

-Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx