On Thu, Nov 7, 2024 at 11:25 PM Barry Song <21cnbao@xxxxxxxxx> wrote:
>
> On Thu, Nov 7, 2024 at 5:23 AM Usama Arif <usamaarif642@xxxxxxxxx> wrote:
> >
> >
> > On 22/10/2024 00:28, Barry Song wrote:
> > >> From: Tangquan Zheng <zhengtangquan@xxxxxxxx>
> > >>
> > >> +static int zram_bvec_write_multi_pages(struct zram *zram, struct bio_vec *bvec,
> > >> +                                       u32 index, int offset, struct bio *bio)
> > >> +{
> > >> +	if (is_multi_pages_partial_io(bvec))
> > >> +		return zram_bvec_write_multi_pages_partial(zram, bvec, index, offset, bio);
> > >> +	return zram_write_page(zram, bvec->bv_page, index);
> > >> +}
> > >> +
> >
> > Hi Barry,
> >
> > I started reviewing this series just to get a better idea of whether we can do
> > something similar for zswap. I haven't looked at the zram code before, so this
> > might be a basic question: how would you end up in
> > zram_bvec_write_multi_pages_partial if using zram for swap?
>
> Hi Usama,
>
> There's a corner case where, for instance, a 32KiB mTHP is swapped out. If
> userspace then performs MADV_DONTNEED on the 0~16KiB portion of this original
> mTHP, it now consists of 8 swap entries (the mTHP has been released and
> unmapped). With swap0-swap3 released due to DONTNEED, those slots become
> available for reallocation, and other folios may be swapped out to them. The
> result is a mix of the new, smaller folios and the original 32KiB mTHP.

Sorry, I forgot to mention that the assumption is ZSMALLOC_MULTI_PAGES_ORDER=3,
so data is compressed in 32KiB blocks.

With Chris' and Kairui's new swap optimization, this should be minor, as each
cluster has its own order. However, I recall that order-0 can still steal swap
slots from other orders' clusters when swap space is limited, by scanning all
slots? Please correct me if I'm wrong, Kairui and Chris.

>
> >
> > We only swap out whole folios. If ZCOMP_MULTI_PAGES_SIZE=64K, any folio
> > smaller than 64K will end up in zram_bio_write_page. Folios greater than or
> > equal to 64K would be dispatched by zram_bio_write_multi_pages to
> > zram_bvec_write_multi_pages in 64K chunks, so e.g. a 128K folio would end up
> > calling zram_bvec_write_multi_pages twice.
>
> In v2, I changed the default order to 2, allowing all anonymous mTHP to
> benefit from this feature.
>
> >
> > Or is this for the case when you are using zram not for swap? In that case,
> > I probably don't need to consider the zram_bvec_write_multi_pages_partial
> > write case for zswap.
> >
> > Thanks,
> > Usama
>

Thanks
Barry
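
P.S. To make the corner case above concrete, here is a minimal userspace sketch
(not part of the series; it only assumes MADV_PAGEOUT is available, 32KiB anon
mTHP is enabled under /sys/kernel/mm/transparent_hugepage, and zram is the swap
backend). After the partial MADV_DONTNEED, any new folio later swapped out into
the freed slots is what would hit the zram_bvec_write_multi_pages_partial()
path discussed above.

/*
 * Minimal sketch, assuming MADV_PAGEOUT (Linux 5.4+), 32KiB anon mTHP
 * enabled, and zram as the swap device. The aligned allocation only makes
 * an mTHP allocation likely; it does not guarantee one.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>

int main(void)
{
	const size_t sz = 32 * 1024;	/* one order-3 mTHP worth of data */
	char *buf;

	if (posix_memalign((void **)&buf, sz, sz)) {
		perror("posix_memalign");
		return 1;
	}

	/* Fault in the whole 32KiB, ideally as a single mTHP. */
	memset(buf, 0xaa, sz);

	/* Ask for reclaim: with the series applied, the 32KiB is compressed as one block. */
	if (madvise(buf, sz, MADV_PAGEOUT))
		perror("madvise(MADV_PAGEOUT)");

	/*
	 * Drop only the first half: swap slots 0-3 of the original block become
	 * reusable while the tail 16KiB still lives inside the 32KiB object.
	 * Other folios swapped out to those freed slots later are the partial writes.
	 */
	if (madvise(buf, sz / 2, MADV_DONTNEED))
		perror("madvise(MADV_DONTNEED)");

	/* Touching the tail needs a partial read of the multi-page object. */
	printf("tail byte: 0x%x\n", (unsigned char)buf[sz - 1]);

	free(buf);
	return 0;
}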
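
And on the dispatch Usama describes: a throwaway helper (illustrative only, not
the zram code; ZCOMP_MULTI_PAGES_SIZE=64K is just the value used in the example)
that restates the size math, i.e. sub-64K folios take the single-page path while
a 128K folio produces two multi-page writes.

#include <stdio.h>

#define ZCOMP_MULTI_PAGES_SIZE	(64 * 1024)	/* assumed value for this example */

/* Returns 0 when the folio takes the single-page write path. */
static unsigned int multi_page_writes(size_t folio_bytes)
{
	if (folio_bytes < ZCOMP_MULTI_PAGES_SIZE)
		return 0;
	return folio_bytes / ZCOMP_MULTI_PAGES_SIZE;
}

int main(void)
{
	const size_t sizes[] = { 16 * 1024, 64 * 1024, 128 * 1024 };

	for (unsigned int i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++)
		printf("%zu KiB folio -> %u multi-page write(s)\n",
		       sizes[i] / 1024, multi_page_writes(sizes[i]));
	return 0;
}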