Hi Mel, On 08/16/2013 01:12 AM, Mel Gorman wrote: > On Thu, Aug 15, 2013 at 03:58:20AM +0900, Minchan Kim wrote: >>> <SNIP> >>> >>> I do not believe this is a problem for zram as such because I do not >>> think it ever writes back to disk and is immune from the unpredictable >>> performance characteristics problem. The problem for zram using zsmalloc >>> is OOM killing. If it's used for swap then there is no guarantee that >>> killing processes frees memory and that could result in an OOM storm. >>> Of course there is no guarantee that memory is freed with zbud either but >>> you are guaranteed that freeing 50%+1 of the compressed pages will free a >>> single physical page. The characteristics for zsmalloc are much more severe. >>> This might be managable in an applicance with very careful control of the >>> applications that are running but not for general servers or desktops. >> >> Fair enough but let's think of current usecase for zram. >> As I said in description, most of user for zram are embedded products. >> So, most of them has no swap storage and hate OOM kill because OOM is >> already very very slow path so system slow response is really thing >> we want to avoid. We prefer early process kill to slow response. >> That's why custom low memory killer/notifier is popular in embedded side. >> so actually, OOM storm problem shouldn't be a big problem under >> well-control limited system. >> > > Which zswap could also do if > > a) it had a pseudo block device that failed all writes > b) zsmalloc was pluggable > I'll take a try soon! > I recognise this sucks because zram is already in the field but if zram > is promoted then zram and zswap will continue to diverge further with no > reconcilation in sight. > > Part of the point of using zswap was that potentially zcache could be > implemented on top of it and so all file cache could be stored compressed > in memory. AFAIK, it's not possible to do the same thing for zram because > of the lack of writeback capabilities. Maybe it could be done if zram > could be configured to write to an underlying storage device but it may > be very clumsy to configure. I don't know as I never investigated it and > to be honest, I'm struggling to remember how I got involved anywhere near > zswap/zcache/zram/zwtf in the first place. > >>> If it's used for something like tmpfs then it becomes much worse. Normal >>> tmpfs without swap can lockup if tmpfs is allowed to fill memory. In a >>> sane configuration, lockups will be avoided and deleting a tmpfs file is >>> guaranteed to free memory. When zram is used to back tmpfs, there is no >>> guarantee that any memory is freed due to fragmentation of the compressed >>> pages. The only way to recover the memory may be to kill applications >>> holding tmpfs files open and then delete them which is fairly drastic >>> action in a normal server environment. >> >> Indeed. >> Actually, I had a plan to support zsmalloc compaction. The zsmalloc exposes >> handle instead of pure pointer so it could migrate some zpages to somewhere >> to pack in. Then, it could help above problem and OOM storm problem. >> Anyway, it's a totally new feature and requires many changes and experiement. >> Although we don't have such feature, zram is still good for many people. >> > > And is zsmalloc was pluggable for zswap then it would also benefit. > >>> These are the sort of reason why I feel that zram has limited cases where >>> it is safe to use and zswap has a wider range of applications. At least >>> I would be very unhappy to try supporting zram in the field for normal >>> servers. zswap should be able to replace the functionality of zram+swap >>> by backing zswap with a pseudo block device that rejects all writes. I >> >> One of difference between zswap and zram is asynchronous I/O support. > > As zram is not writing to disk, how compelling is asynchronous IO? If > zswap was backed by the pseudo device is there a measurable bottleneck? > >> I guess frontswap is synchronous by semantic while zram could support >> asynchronous I/O. >> >>> do not know why this never happened but guess the zswap people never were >>> interested and the zram people never tried. Why was the pseudo device >>> to avoid writebacks never implemented? Why was the underlying allocator >>> not made pluggable to optionally use zsmalloc when the user did not care >>> that it had terrible writeback characteristics? >> >> I remember you suggested to make zsmalloc with pluggable for zswap. >> But I don't know why zswap people didn't implement it. >> >>> >>> zswap cannot replicate zram+tmpfs but I also think that such a configuration >>> is a bad idea anyway. As zram is already being deployed then it might get >> >> It seems your big concern of zsmalloc is fragmentaion so if zsmalloc can >> support compaction, it would mitigate the concern. >> > > Even if it supported zsmalloc I would still wonder why zswap is not using > it as a pluggable option :( > >>> promoted anyway but personally I think compressed memory continues to be >> >> I admit zram might have limitations but it has helped lots of people. >> It's not an imaginary scenario. >> > > I know. > >> Please, let's not do get out of zram from kernel tree and stall it on staging >> forever with preventing new features. >> Please, let's promote, expose it to more potential users, receive more >> complains from them, recruit more contributors and let's enhance. >> > > As this is already used heavily in the field and I am not responsible > for maintaining it I am not going to object to it being promoted. I can > always push that it be disabled in distribution configs as it is not > suitable for general workloads for reason already discussed. > > However, I believe that the promotion will lead to zram and zswap diverging > further from each other, both implementing similar functionality and > ultimately cause greater maintenance headaches. There is a path that makes Agree! I prefer this way too! > zswap a functional replacement for zram and I've seen no good reason why > that path was not taken. Zram cannot be a functional replacment for zswap > as there is no obvious sane way writeback could be implemented. Continuing > to diverge will ultimately bite someone in the ass. > -- Regards, -Bob -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>