2014-12-18 7:19 GMT+08:00 Seth Jennings <sjennings@xxxxxxxxxxxxxx>:
> On Tue, Dec 02, 2014 at 11:49:41AM +0900, Minchan Kim wrote:
>> Recently, there was an issue with zsmalloc fragmentation, and
>> I got a report from Juno that a new fork failed although there
>> were plenty of free pages in the system.
>> His investigation revealed that zram is one of the culprits causing
>> heavy fragmentation, so there was no contiguous 16K page left
>> for the pgd to fork on ARM.
>>
>> This patchset implements *basic* zsmalloc compaction support,
>> and zram utilizes it so the admin can do
>> "echo 1 > /sys/block/zram0/compact"
>>
>> Actually, the ideal is that the mm migration code is aware of zram
>> pages and migrates them out automatically, without the admin's manual
>> operation, when the system is out of contiguous pages. However, we
>> need more thinking before adding more hooks to migrate.c. Even if we
>> implement that, we need a manual trigger mode too, so I hope we can
>> enhance the zram migration stuff based on these primitive functions
>> in the future.
>>
>> I have tested it only on x86, so it needs more testing on other arches.
>> Additionally, I should get numbers for the zsmalloc regression
>> caused by the indirection layer. Unfortunately, I don't have any
>> ARM test machine on my desk. I will get one soon and test it.
>> Anyway, before further work, I'd like to hear opinions.
>>
>> Patchset is based on v3.18-rc6-mmotm-2014-11-26-15-45.
>
> Hey Minchan, sorry it has taken a while for me to look at this.
>
> I have prototyped this for zbud too and I see you face some of the same
> issues, some of them much worse for zsmalloc, like the large number of
> objects to move to reclaim a page (with zbud, the max is 1).
>
> I see you are using zsmalloc itself for allocating the handles. Why not
> kmalloc()? Then you wouldn't need to track the handle_class stuff and
> adjust the class sizes (just in the interest of changing only what is
> needed to achieve the functionality).
>
> I used kmalloc() but that is not without issue, as the handles can be
> allocated from many slabs and any slab that contains a handle can't be
> freed, basically resulting in the handles themselves needing to be
> compacted, which they can't be because the user handle is a pointer to
> them.
>
> One way to fix this, but it would be some amount of work, is to have the
> user (zswap/zbud) provide the space for the handle to zbud/zsmalloc.
> The zswap/zbud layer knows the size of the device (i.e. handle space)
> and could allocate a statically sized vmalloc area for holding handles
> so they don't get spread all over memory. I haven't fully explored this
> idea yet.
>
> It is pretty limiting having the user trigger the compaction. Can we
> have a work item that periodically does some amount of compaction?
> Maybe also have something analogous to direct reclaim that, when
> zs_malloc fails to secure a new page, it will try to compact to get one?
> I understand this is a first step. Maybe too much.

Yes, the user does not know when to run the compaction. Really, it is
zsmalloc's responsibility to keep fragmentation at a low level.
How about dynamically monitoring the fragmentation and doing the
compaction when there is too much of it?

I am working on another patch to collect statistics on zsmalloc
objects. Maybe that will be helpful for this.

Thanks.

>
> Also worth pointing out that the fullness groups are very coarse.
> Combining the objects from a ZS_ALMOST_EMPTY zspage and ZS_ALMOST_FULL
> zspage might not result in very tight packing. In the worst case, the
> destination zspage would be slightly over 1/4 full (see
> fullness_threshold_frac).
>
> It also seems that you start with the smallest size classes first.
> Seems like if we start with the biggest first, we move fewer objects and
> reclaim more pages.
>
> It does add a lot of code :-/ Not sure if there is any way around that
> though if we want this functionality for zsmalloc.
>
> Seth
>
>>
>> Thanks.
>>
>> Minchan Kim (6):
>>   zsmalloc: expand size class to support sizeof(unsigned long)
>>   zsmalloc: add indrection layer to decouple handle from object
>>   zsmalloc: implement reverse mapping
>>   zsmalloc: encode alloced mark in handle object
>>   zsmalloc: support compaction
>>   zram: support compaction
>>
>>  drivers/block/zram/zram_drv.c |  24 ++
>>  drivers/block/zram/zram_drv.h |   1 +
>>  include/linux/zsmalloc.h      |   1 +
>>  mm/zsmalloc.c                 | 596 +++++++++++++++++++++++++++++++++++++-----
>>  4 files changed, 552 insertions(+), 70 deletions(-)
>>
>> --
>> 2.0.0
>>
>> --
>> To unsubscribe, send a message with 'unsubscribe linux-mm' in
>> the body to majordomo@xxxxxxxxx. For more info on Linux MM,
>> see: http://www.linux-mm.org/ .
>> Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
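
The idea raised in this reply (monitor fragmentation and compact only once
it gets too high) could look roughly like the sketch below. It is a
user-space model only, not zsmalloc code: struct class_stat, wasted_pct(),
should_compact() and the threshold value are hypothetical names chosen for
illustration, and it assumes per-class counts of allocated and used object
slots are available from some statistics interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-size-class statistics; not an existing zsmalloc struct. */
struct class_stat {
	unsigned long obj_allocated;	/* object slots backed by pages */
	unsigned long obj_used;		/* slots currently holding data */
};

/* Percentage of allocated slots that are unused (0 if nothing is allocated). */
static unsigned int wasted_pct(const struct class_stat *st)
{
	assert(st->obj_used <= st->obj_allocated);
	if (st->obj_allocated == 0)
		return 0;
	/* Sketch only: the multiplication can overflow for huge counts. */
	return (unsigned int)((st->obj_allocated - st->obj_used) * 100 /
			      st->obj_allocated);
}

/* Policy: compact when any class wastes more than threshold_pct of its slots. */
static int should_compact(const struct class_stat *stats, size_t nr,
			  unsigned int threshold_pct)
{
	size_t i;

	for (i = 0; i < nr; i++)
		if (wasted_pct(&stats[i]) > threshold_pct)
			return 1;
	return 0;
}
```

A periodic work item could run a check like this and start compaction only
when it returns nonzero. The threshold is an assumption and would need
tuning against real workloads.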