Re: [PATCH v7 2/2] mm: support large folios swap-in for sync io devices

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Sep 23, 2024 at 11:22:30AM +0100, Usama Arif wrote:
> On 23/09/2024 00:57, Barry Song wrote:
> > On Thu, Sep 5, 2024 at 7:36 AM Yosry Ahmed <yosryahmed@xxxxxxxxxx> wrote:
> >>>> On the other hand, if you read the code of zRAM, you will find zRAM has
> >>>> exactly the same mechanism as zeromap but zRAM can even do more
> >>>> by same_pages filled. since zRAM does the job in swapfile layer, there
> >>>> is no this kind of consistency issue like zeromap.
> >>>>
> >>>> So I feel for zRAM case, we don't need zeromap at all as there are duplicated
> >>>> efforts while I really appreciate your job which can benefit all swapfiles.
> >>>> i mean, zRAM has the ability to check "zero"(and also non-zero but same
> >>>> content). after zeromap checks zeromap, zRAM will check again:
> >>>>
> >>>
> >>> Yes, so there is a reason for having the zeromap patches, which I have outlined
> >>> in the coverletter.
> >>>
> >>> https://lore.kernel.org/all/20240627105730.3110705-1-usamaarif642@xxxxxxxxx/
> >>>
> >>> There are usecases where zswap/zram might not be used in production.
> >>> We can reduce I/O and flash wear in those cases by a large amount.
> >>>
> >>> Also running in Meta production, we found that the number of non-zero filled
> >>> complete pages were less than 1%, so essentially its only the zero-filled pages
> >>> that matter.
> >>>
> >>> I believe after zeromap, it might be a good idea to remove the page_same_filled
> >>> check from zram code? Its not really a problem if its kept as well as I dont
> >>> believe any zero-filled pages should reach zram_write_page?
> >>
> >> I brought this up before and Sergey pointed out that zram is sometimes
> >> used as a block device without swap, and that use case would benefit
> >> from having this handling in zram. That being said, I have no idea how
> >> many people care about this specific scenario.
> > 
> > Hi Usama/Yosry,
> > 
> > We successfully gathered page_same_filled data for zram on Android.
> > Interestingly,
> > our findings differ from yours on zswap.
> > 
> > Hailong discovered that around 85-86% of the page_same_filled data
> > consists of zeros,
> > while about 15% are non-zero. We suspect that on Android or similar
> > systems, some
> > graphics or media data might be duplicated at times, such as a red
> > block displayed
> > on the screen.
> > 
> > Does this suggest that page_same_filled could still provide some
> > benefits in zram
> > cases?
> 
> Hi Barry,
> 
> Thanks for the data, its very interesting to know this from mobile side.
> Eventhough its not 99% that I observed, I do feel 85% is still quite high.

Would it be possible to benchmark Android with zram only optimizing
zero pages?

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index c3d245617083..f6ded491fd00 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -211,6 +211,9 @@ static bool page_same_filled(void *ptr, unsigned long *element)
 	page = (unsigned long *)ptr;
 	val = page[0];
 
+	if (val)
+		return false;
+
 	if (val != page[last_pos])
 		return false;
 
My take is that, if this is worth optimizing for, then it's probably
worth optimizing for in the generic swap layer too. It makes sense to
maintain feature parity if we one day want Android to work with zswap.

> The 2 main reasons for the patcheset to store zero pages to be
> swapped out in a bitmap were for applications that use swap but
> not zswap/zram (reducing I/O and flash wear), and simplifying zswap
> code. It also meant fewer zswap_entry structs in memory which would
> offset the memory usage by bitmap.
> 
> Yosry mentioned that Sergey pointed out that zram is sometimes
> used as a block device without swap, and that use case would benefit
> from having this handling in zram. Will that case also not go through
> swap_writepage and therefore be takencare of by swap_info_struct->zeromap?

No, it won't.

However, the write bios sent from swap have REQ_SWAP set. Zram could
make its same-filled optimization conditional on that flag if it wants
to maintain it for the raw block device use case.




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux