On Wed, Dec 11, 2019 at 5:29 PM Jens Axboe <axboe@xxxxxxxxx> wrote:
>
> I'd very much argue that it IS a bug, maybe just doesn't show on your
> system.

Oh, I agree.

But I also understand why people hadn't noticed, and I don't think
it's all that critical - because if you do 1M iops cached, you're
doing something really really strange.

I too can see xas_create using 30% CPU time, but that's when I do a
perf record on just kswapd - and when I actually look at it on a
system level, it looks nowhere near that bad.

So I think people should look at this. Part of it might be for Willy:
does that xas_create() need to be that expensive? I hate how "perf"
callchains work, but it does look like it is probably

    page_cache_delete -> xas_store -> xas_create

that is the bulk of the cost there.

Replacing the real page with the shadow entry shouldn't be that big of
a deal, I would really hope.

Willy, that used to be a

    __radix_tree_lookup -> __radix_tree_replace

thing, is there perhaps some simple optimization that could be done on
the XArray case here?

                Linus