On Thu, Sep 5, 2024 at 11:23 AM Usama Arif <usamaarif642@xxxxxxxxx> wrote:
>
> On 05/09/2024 00:10, Barry Song wrote:
> > On Thu, Sep 5, 2024 at 9:30 AM Usama Arif <usamaarif642@xxxxxxxxx> wrote:
> >>
> >> On 03/09/2024 23:05, Yosry Ahmed wrote:
> >>> On Tue, Sep 3, 2024 at 2:36 PM Barry Song <21cnbao@xxxxxxxxx> wrote:
> >>>>
> >>>> On Wed, Sep 4, 2024 at 8:08 AM Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
> >>>>>
> >>>>> On Tue, 3 Sep 2024 11:38:37 -0700 Yosry Ahmed <yosryahmed@xxxxxxxxxx> wrote:
> >>>>>
> >>>>>>> [ 39.157954] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000007
> >>>>>>> [ 39.158288] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
> >>>>>>> [ 39.158634] R13: 0000000000002b9a R14: 0000000000000000 R15: 00007ffd619d5518
> >>>>>>> [ 39.158998] </TASK>
> >>>>>>> [ 39.159226] ---[ end trace 0000000000000000 ]---
> >>>>>>>
> >>>>>>> After reverting this or Usama's "mm: store zero pages to be swapped
> >>>>>>> out in a bitmap", the problem is gone. I think these two patches may
> >>>>>>> have some conflict that needs to be resolved.
> >>>>>>
> >>>>>> Yup. I saw this conflict coming and specifically asked for this
> >>>>>> warning to be added in Usama's patch to catch it [1]. It served its
> >>>>>> purpose.
> >>>>>>
> >>>>>> Usama's patch does not handle large folio swapin, because at the time
> >>>>>> it was written we didn't have it. We expected Usama's series to land
> >>>>>> sooner than this one, so the warning was to make sure that this series
> >>>>>> handles large folio swapin in the zeromap code. Now that they are both
> >>>>>> in mm-unstable, we are gonna have to figure this out.
> >>>>>>
> >>>>>> I suspect Usama's patches are closer to land so it's better to handle
> >>>>>> this in this series, but I will leave it up to Usama and
> >>>>>> Chuanhua/Barry to figure this out :)
> >>>>
> >>>> I believe handling this in swap-in might violate layer separation.
> >>>> `swap_read_folio()` should be a reliable API to call, regardless of
> >>>> whether `zeromap` is present. Therefore, the fix should likely be
> >>>> within `zeromap` but not this `swap-in`. I’ll take a look at this with
> >>>> Usama :-)
> >>>
> >>> I meant handling it within this series to avoid blocking Usama
> >>> patches, not within this code. Thanks for taking a look, I am sure you
> >>> and Usama will figure out the best way forward :)
> >>
> >> Hi Barry and Yosry,
> >>
> >> Is the best (and quickest) way forward to have a v8 of this with
> >> https://lore.kernel.org/all/20240904055522.2376-1-21cnbao@xxxxxxxxx/
> >> as the first patch, and using swap_zeromap_entries_count in alloc_swap_folio
> >> in this support large folios swap-in patch?
> >
> > Yes, Usama. i can actually do a check:
> >
> > zeromap_cnt = swap_zeromap_entries_count(entry, nr);
> >
> > /* swap_read_folio() can handle inconsistent zeromap in multiple entries */
> > if (zeromap_cnt > 0 && zeromap_cnt < nr)
> >         try next order;
> >
> > On the other hand, if you read the code of zRAM, you will find zRAM has
> > exactly the same mechanism as zeromap but zRAM can even do more
> > by same_pages filled. since zRAM does the job in swapfile layer, there
> > is no this kind of consistency issue like zeromap.
> >
> > So I feel for zRAM case, we don't need zeromap at all as there are duplicated
> > efforts while I really appreciate your job which can benefit all swapfiles.
> > i mean, zRAM has the ability to check "zero"(and also non-zero but same
> > content). after zeromap checks zeromap, zRAM will check again:
> >
>
> Yes, so there is a reason for having the zeromap patches, which I have outlined
> in the coverletter.
>
> https://lore.kernel.org/all/20240627105730.3110705-1-usamaarif642@xxxxxxxxx/
>
> There are usecases where zswap/zram might not be used in production.
> We can reduce I/O and flash wear in those cases by a large amount.
>
> Also running in Meta production, we found that the number of non-zero filled
> complete pages were less than 1%, so essentially its only the zero-filled pages
> that matter.

I don't have data on Android phones. I'd like to see whether phones show
the same ratio, i.e. whether non-zero same-filled pages are just as rare
there.

>
> I believe after zeromap, it might be a good idea to remove the page_same_filled
> check from zram code? Its not really a problem if its kept as well as I dont
> believe any zero-filled pages should reach zram_write_page?
>
> > static int zram_write_page(struct zram *zram, struct page *page, u32 index)
> > {
> >         ...
> >
> >         if (page_same_filled(mem, &element)) {
> >                 kunmap_local(mem);
> >                 /* Free memory associated with this sector now. */
> >                 flags = ZRAM_SAME;
> >                 atomic64_inc(&zram->stats.same_pages);
> >                 goto out;
> >         }
> >         ...
> > }
> >
> > So it seems that zeromap might slightly impact my zRAM use case. I'm not
> > blaming you, just pointing out that there might be some overlap in effort
> > here :-)
> >
> >>
> >> Thanks,
> >> Usama
> >

Thanks
Barry
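
For reference, here is a rough user-space sketch of the two checks discussed
in this thread. It is not kernel code and not the actual patch. Every
sketch_* name is hypothetical, the zeromap is modelled as a plain bitmap, and
aligning the entry down to the folio size is my assumption about how the
range would be chosen. The first part mirrors the "try next order" idea:
a range is only acceptable for one large folio if none or all of its
entries are marked zero-filled. The second part imitates a same-filled check
in the spirit of zram's page_same_filled(), to show that zero-filled is just
the special case where the repeated word is 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE      4096
#define SKETCH_WORDS_PER_PAGE (SKETCH_PAGE_SIZE / sizeof(unsigned long))
#define SKETCH_BITS_PER_WORD  (8 * sizeof(unsigned long))

/* Hypothetical stand-in for the per-swapfile zeromap: one bit per swap entry. */
static bool sketch_test_bit(const unsigned long *map, size_t bit)
{
	return (map[bit / SKETCH_BITS_PER_WORD] >> (bit % SKETCH_BITS_PER_WORD)) & 1UL;
}

static void sketch_set_bit(unsigned long *map, size_t bit)
{
	map[bit / SKETCH_BITS_PER_WORD] |= 1UL << (bit % SKETCH_BITS_PER_WORD);
}

/* Count how many of the nr entries starting at start are marked zero-filled. */
static size_t sketch_zeromap_count(const unsigned long *map, size_t start, size_t nr)
{
	size_t i, cnt = 0;

	for (i = 0; i < nr; i++)
		cnt += sketch_test_bit(map, start + i);
	return cnt;
}

/* Consistent means: no entry in the range is zero-filled, or all of them are. */
static bool sketch_range_is_consistent(const unsigned long *map, size_t start, size_t nr)
{
	size_t cnt;

	assert(nr > 0);
	cnt = sketch_zeromap_count(map, start, nr);
	return cnt == 0 || cnt == nr;
}

/*
 * Walk from max_order down and return the first order whose naturally
 * aligned range around entry is consistent. Order 0 is a single entry,
 * so it is always acceptable.
 */
static unsigned int sketch_pick_order(const unsigned long *map, size_t entry,
				      unsigned int max_order)
{
	unsigned int order;

	for (order = max_order; order > 0; order--) {
		size_t nr = (size_t)1 << order;
		size_t start = entry & ~(nr - 1);	/* assumed alignment */

		if (sketch_range_is_consistent(map, start, nr))
			return order;
	}
	return 0;
}

/*
 * Simplified same-filled check: true if every word of the page equals the
 * first word, which is then returned through element. A zero-filled page
 * is the case where *element == 0.
 */
static bool sketch_page_same_filled(const unsigned long *page, unsigned long *element)
{
	size_t i;

	for (i = 1; i < SKETCH_WORDS_PER_PAGE; i++)
		if (page[i] != page[0])
			return false;
	*element = page[0];
	return true;
}
```

With a mixed range the sketch keeps dropping the order until the range is
uniform, so a single zero-marked entry inside an otherwise non-zero range
forces a fallback to order 0 for that entry.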