Hi Dave,

Thanks for the long explanation.

> > Secondly, you misparsed the "avoid direct reclaim" paragraph. We aren't
> > talking about "avoid direct reclaim even if system memory is not
> > enough"; we are talking about "avoid direct reclaim by preparing before".
>
> I don't think I misparsed it. I am addressing the "avoid direct
> reclaim by preparing before" principle directly. The problem with it
> is that just enlarging the free memory pool doesn't guarantee future
> allocation success when there are other concurrent allocations
> occurring. IOWs, if you don't _reserve_ the free memory for the
> critical area in advance then there is no guarantee it will be
> available when needed by the critical section.

Right. In fact, I wrote per-task memory reservation code many years ago,
when I was working on embedded systems. So there are some design choices
here: best effort, as Christoph described, or a per-thread or
RT-thread-specific reservation.

> A simple example: the radix tree node preallocation code to
> guarantee inserts succeed while holding a spinlock. If just relying
> on free memory was sufficient, then GFP_ATOMIC allocations are all
> that is necessary. However, even that isn't sufficient as even the
> GFP_ATOMIC reserved pool can be exhausted by other concurrent
> GFP_ATOMIC allocations. Hence preallocation is required before
> entering the critical section to guarantee success in all cases.
>
> And to state the obvious: doing allocation before the critical
> section will trigger reclaim if necessary so there is no need to
> have the application trigger reclaim.

Yes and no. Preallocation is the core piece, yes. But almost all
syscalls call kmalloc() implicitly, so mlock() alone is not sufficient
preallocation. Almost all applications, except HPC ones, can't avoid
making syscalls. I think that is why finance people repeatedly request
this feature from us.

Thanks!
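
P.S. To make the preallocation pattern above concrete, here is a rough
userspace sketch. It is only an analogy, not the kernel's radix tree
code: a pthread mutex stands in for the spinlock, malloc() stands in for
the allocator, and every name (struct node, struct list, node_prealloc,
list_insert_prealloc, list_insert) is hypothetical. The point it tries
to show is that the only step that can fail or block happens before the
critical section, so the insert under the lock cannot fail.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical node type; it stands in for a radix tree node. */
struct node {
    int key;
    struct node *next;
};

/* Hypothetical container protected by a lock. */
struct list {
    pthread_mutex_t lock;
    struct node *head;
    size_t len;
};

void list_init(struct list *l)
{
    pthread_mutex_init(&l->lock, NULL);
    l->head = NULL;
    l->len = 0;
}

/*
 * Allocate outside the critical section. This is the only step that
 * may fail (or, in the kernel analogy, trigger reclaim).
 */
struct node *node_prealloc(void)
{
    return malloc(sizeof(struct node));
}

/*
 * Insert using a node the caller already owns. No allocation happens
 * while the lock is held, so this step cannot fail for lack of memory.
 */
void list_insert_prealloc(struct list *l, struct node *n, int key)
{
    n->key = key;
    pthread_mutex_lock(&l->lock);
    n->next = l->head;
    l->head = n;
    l->len++;
    pthread_mutex_unlock(&l->lock);
}

/* Convenience wrapper: reserve first, then enter the critical section. */
bool list_insert(struct list *l, int key)
{
    struct node *n = node_prealloc();

    if (n == NULL)
        return false;   /* failure is reported before taking the lock */
    list_insert_prealloc(l, n, key);
    return true;
}
```

Note that this sketch only covers memory the caller reserves explicitly.
It says nothing about allocations made implicitly on the caller's
behalf, which is the syscall/kmalloc() concern raised above.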