On Wed, 12 Jun 2019 21:59:03 -0700 Nadav Amit <namit@xxxxxxxxxx> wrote: > For efficient search of resources, as needed to determine the memory > type for dax page-faults, introduce a cache of the most recently used > top-level resource. Caching the top-level should be safe as ranges in > that level do not overlap (unlike those of lower levels). > > Keep the cache per-cpu to avoid possible contention. Whenever a resource > is added, removed or changed, invalidate all the resources. The > invalidation takes place when the resource_lock is taken for write, > preventing possible races. > > This patch provides relatively small performance improvements over the > previous patch (~0.5% on sysbench), but can benefit systems with many > resources. > --- a/kernel/resource.c > +++ b/kernel/resource.c > @@ -53,6 +53,12 @@ struct resource_constraint { > > static DEFINE_RWLOCK(resource_lock); > > +/* > + * Cache of the top-level resource that was most recently use by > + * find_next_iomem_res(). > + */ > +static DEFINE_PER_CPU(struct resource *, resource_cache); A per-cpu cache which is accessed under a kernel-wide read_lock looks a bit odd - the latency getting at that rwlock will swamp the benefit of isolating the CPUs from each other when accessing resource_cache. On the other hand, if we have multiple CPUs running find_next_iomem_res() concurrently then yes, I see the benefit. Has the benefit of using a per-cpu cache (rather than a kernel-wide one) been quantified? > @@ -262,9 +268,20 @@ static void __release_child_resources(struct resource *r) > } > } > > +static void invalidate_resource_cache(void) > +{ > + int cpu; > + > + lockdep_assert_held_exclusive(&resource_lock); > + > + for_each_possible_cpu(cpu) > + per_cpu(resource_cache, cpu) = NULL; > +} All the calls to invalidate_resource_cache() are rather a maintainability issue - easy to miss one as the code evolves. Can't we just make find_next_iomem_res() smarter? For example, start the lookup from the cached point and if that failed, do a full sweep? > + invalidate_resource_cache(); > + invalidate_resource_cache(); > + invalidate_resource_cache(); > + invalidate_resource_cache(); > + invalidate_resource_cache(); > + invalidate_resource_cache(); > + invalidate_resource_cache(); > + invalidate_resource_cache(); > + invalidate_resource_cache(); > + invalidate_resource_cache(); > + invalidate_resource_cache(); > + invalidate_resource_cache(); > + invalidate_resource_cache(); > + invalidate_resource_cache(); Ow. I guess the maintainability situation can be improved by renaming resource_lock to something else (to avoid mishaps) then adding wrapper functions. But still. I can't say this is a super-exciting patch :(