On Fri, Apr 09, 2021 at 07:18:42PM -0400, Waiman Long wrote:
> Most kmem_cache_alloc() calls are from user context. With instrumentation
> enabled, the measured amount of kmem_cache_alloc() calls from non-task
> context was about 0.01% of the total.
>
> The irq disable/enable sequence used in this case to access content
> from object stock is slow. To optimize for user context access, there
> are now two object stocks for task context and interrupt context access
> respectively.
>
> The task context object stock can be accessed after disabling preemption
> which is cheap in non-preempt kernel. The interrupt context object stock
> can only be accessed after disabling interrupt. User context code can
> access interrupt object stock, but not vice versa.
>
> The mod_objcg_state() function is also modified to make sure that memcg
> and lruvec stat updates are done with interrupted disabled.
>
> The downside of this change is that there are more data stored in local
> object stocks and not reflected in the charge counter and the vmstat
> arrays. However, this is a small price to pay for better performance.

I agree, the extra memory space is not a significant concern.
I'd be more worried about the code complexity, but the result looks
nice to me!

Acked-by: Roman Gushchin <guro@xxxxxx>

Btw, it seems that the mm tree ran a bit off, so I had to apply this
series on top of Linus's tree to review. Please, rebase.

Thanks!
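
For readers following along without the patch at hand, here is a rough
userspace model of the two-stock idea described in the quoted commit
message. It is only a sketch: struct cpu_stock, enum guard,
get_obj_stock() and consume_obj_stock() as written here are hypothetical
names for illustration and are not taken from the patch, and the real
code would take the guard with preemption or interrupt disabling where
the comments say so.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: bytes cached locally and not yet reflected
 * in the charge counter. */
struct obj_stock {
	unsigned int nr_bytes;
};

/* One pair of stocks per CPU (assumed layout, for illustration). */
struct cpu_stock {
	struct obj_stock task_obj;	/* task context: preemption disabled */
	struct obj_stock irq_obj;	/* interrupt context: irqs disabled */
};

/* Which protection the caller would need for the returned stock. */
enum guard {
	GUARD_PREEMPT,	/* cheap on non-preempt kernels */
	GUARD_IRQ,	/* the slower irq disable/enable sequence */
};

/* Pick the stock that matches the calling context. */
static struct obj_stock *get_obj_stock(struct cpu_stock *s, bool in_task,
				       enum guard *g)
{
	assert(s != NULL && g != NULL);

	if (in_task) {
		*g = GUARD_PREEMPT;
		return &s->task_obj;
	}
	*g = GUARD_IRQ;
	return &s->irq_obj;
}

/* Try to satisfy a charge of nr_bytes from the local stock only. */
static bool consume_obj_stock(struct cpu_stock *s, bool in_task,
			      unsigned int nr_bytes)
{
	enum guard g;
	struct obj_stock *stock = get_obj_stock(s, in_task, &g);

	/* A kernel implementation would take the guard 'g' here. */
	if (stock->nr_bytes < nr_bytes)
		return false;	/* caller falls back to the slow path */

	stock->nr_bytes -= nr_bytes;
	return true;
}
```

The point of the split is that the common case (task context) only pays
for the cheap guard, while the rare non-task callers keep using the irq
guard on their own stock.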