On Fri, Feb 26, 2021 at 08:39:47AM -0800, Eric Biggers wrote:
> On Fri, Feb 26, 2021 at 10:36:53AM +0100, David Sterba wrote:
> > On Thu, Feb 25, 2021 at 10:50:56AM -0800, Eric Biggers wrote:
> > > On Thu, Feb 25, 2021 at 02:26:47PM +0100, David Sterba wrote:
>
> Okay so you have 128K to compress, but not in a virtually contiguous buffer, so
> you need the algorithm to support streaming of 4K chunks.  And the LZ4
> implementation doesn't properly support that.  (Note that this is a property of
> the LZ4 *implementation*, not the LZ4 *format*.)
>
> How about using vm_map_ram() to get a contiguous buffer, like what f2fs does?
> Then you wouldn't need streaming support.
>
> There is some overhead in setting up page mappings, but it might actually turn
> out to be faster (also for the other algorithms, not just LZ4) since it avoids
> the overhead of streaming, such as the algorithm having to copy all the data
> into an internal buffer for matchfinding.

Yes, the mapping allows compressing the buffer in one go, but the overhead is
not small. I had it in one of the prototypes back then too, but did not finish
it because it would mean updating the on-disk compression container format.

vm_map_ram() needs to be called twice per buffer (once for compression and once
for decompression), there is some additional data allocated, and the virtual
aliases have to be flushed each time, which could be costly as I'm told (TLB
shootdowns, IPIs).

Also, vm_map_ram() is deadlock prone because it unconditionally allocates with
GFP_KERNEL, so the scoped NOFS protection has to be in place. And F2FS does not
do that, but that's fixable.
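
For illustration only, a minimal sketch (my own wording, not the f2fs code or
any btrfs prototype; the helper names are made up) of what mapping a cluster's
pages with vm_map_ram() under the scoped NOFS protection would look like:

	#include <linux/vmalloc.h>
	#include <linux/sched/mm.h>

	/*
	 * Map the pages of a compression cluster into one contiguous
	 * virtual buffer so the compressor can run in a single pass.
	 */
	static void *map_cluster(struct page **pages, unsigned int nr_pages)
	{
		unsigned int nofs_flags;
		void *addr;

		/*
		 * vm_map_ram() allocates with GFP_KERNEL internally, so keep
		 * the scoped NOFS protection around it to avoid recursing
		 * into the filesystem from the writeback path.
		 */
		nofs_flags = memalloc_nofs_save();
		addr = vm_map_ram(pages, nr_pages, -1);
		memalloc_nofs_restore(nofs_flags);

		return addr;	/* NULL on failure */
	}

	static void unmap_cluster(void *addr, unsigned int nr_pages)
	{
		/*
		 * Tearing down the mapping is where the virtual aliases get
		 * flushed, i.e. the per-buffer TLB shootdown/IPI cost.
		 */
		vm_unmap_ram(addr, nr_pages);
	}

This would have to be done once on the compression side and once on the
decompression side, which is the "twice per buffer" overhead mentioned above.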