On Wed, Nov 27, 2024 at 6:04 PM Sergey Senozhatsky
<senozhatsky@xxxxxxxxxxxx> wrote:
>
> On (24/11/27 09:31), Barry Song wrote:
> > On Tue, Nov 26, 2024 at 11:53 PM Sergey Senozhatsky
> > <senozhatsky@xxxxxxxxxxxx> wrote:
> > >
> > > On (24/11/26 14:09), Sergey Senozhatsky wrote:
> > > > > swap-out time(ms)     68711     49908
> > > > > swap-in time(ms)      30687     20685
> > > > > compression ratio     20.49%    16.9%
> > >
> > > I'm also sort of curious if you'd use zstd with pre-trained user
> > > dictionary [1] (e.g. based on a dump of your swap-file under most
> > > common workloads) would it give you desired compression ratio
> > > improvements (on current zram, that does single page compression).
> > >
> > > [1] https://github.com/facebook/zstd?tab=readme-ov-file#the-case-for-small-data-compression
> >
> > Not yet, but it might be worth trying. A key difference between servers and
> > Android phones is that phones have millions of different applications
> > downloaded from the Google Play Store or other sources.
>
> Maybe yes maybe not, I don't know. It could be that that 99% of users
> use the same 1% apps out of those millions.
>
> > In this case, would using a dictionary be a feasible approach? Apologies
> > if my question seems too naive.
>
> It's a good question, and there is probably only one way to answer
> it - through experiments, it's data dependent, so it's case-by-case.

Sure, we may collect data on the most popular apps (e.g., the top 100)
and train zstd using their anonymous data to identify patterns. We’ll
follow up with you afterward.

>
> > On the other hand, the advantage of a pre-trained user dictionary
> > doesn't outweigh the benefits of large block compression? Can’t both
> > be used together?
>
> Well, so far the approach has many unmeasured unknowns and corner
> cases, I don't think I personally even understand all of them to begin

I agree we can make an effort to dig deeper and collect more data,
analyzing as many corner cases as possible but many unknowns are a
common characteristic of new things :-)

> with. Not sure if I have a way to measure and analyze, that mTHP
> swapout seems like a relatively new thing and it also seems that you
> are still fixing some of its issues/shortcomings.

A challenge is determining how to make mTHP fully transparent (e.g.,
not dependent on sysfs controls for enabling/disabling) across various
workloads. The default policy may not always be optimal for all
workloads. Despite that, there are certainly benefits we can gain from
mTHP within zsmalloc/zram.

Thanks
Barry