On Tue, Sep 21, 2021 at 09:06:35AM +1000, Dave Chinner wrote:
> FWIW, an example of avoidable runtime calculation overhead of
> constants is xlog_calc_unit_res(). These values are actually
> constant for a given transaction reservation, but at 1.6 million
> transactions a second it shows up at #20 on the flat profile of
> functions using the most CPU:
>
>   0.71%  [kernel]  [k] xlog_calc_unit_res
>
> 0.71% of 32 CPUs for 1.6 million calculations a second of the same
> constants is a non-trivial amount of CPU time to spend doing
> unnecessary repeated calculations.
>
> Even though the btree cursor constant calculations are simpler than
> the log res calculations, they are more frequent. Hence on general
> principles of efficiency, I don't think we want to be replacing high
> frequency, low overhead slab/zone based allocations with heap
> allocations that require repeated constant calculations and
> size->slab redirection....

FWIW, I have another example: xlog_grant_push_threshold(). I don't
have profiles for it right now because I didn't record them in the
patch series that ends up pre-calculating the AIL push target.

This threshold is largely a fixed value ahead of the current log tail
(push when >75% of the physical log space is consumed). We do that
calculation more often than we call xlog_calc_unit_res(). Because
xlog_grant_push_threshold() accesses contended atomic variables, it
ends up consuming 1-2% of total CPU time when transaction rates reach
the million/s ballpark.

I've currently replaced it with a fixed push threshold calculated at
mount time, and I let the AIL calculate the LSN of the push target
itself when it needs it. The result is a substantial reduction in the
CPU usage of the hot xfs_log_reserve() path, which also happens to be
the same hot path xlog_calc_unit_res() is called from...

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
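
To illustrate the mount-time precalculation described above, here is a
minimal sketch in C. It is a toy model, not the XFS code: struct
toy_log, toy_log_mount() and toy_push_target() are hypothetical names,
the log is measured in plain block numbers rather than LSNs, and the
75% figure is simply taken from the text above.

```c
#include <stdint.h>

/* Hypothetical toy log; not the real XFS log structures. */
struct toy_log {
	uint64_t size_blocks;	/* physical log size, fixed at mount */
	uint64_t push_distance;	/* precomputed once: 75% of the log */
};

/*
 * Runs once at mount time. This is the only place the constant
 * threshold is calculated, so the hot reservation path never has to
 * repeat the arithmetic.
 */
static void toy_log_mount(struct toy_log *log, uint64_t size_blocks)
{
	log->size_blocks = size_blocks;
	log->push_distance = size_blocks - (size_blocks >> 2);
}

/*
 * Called by the pusher only when it actually needs a target: the
 * current tail plus the fixed distance, wrapped around the circular
 * log.
 */
static uint64_t toy_push_target(const struct toy_log *log,
				uint64_t tail_block)
{
	return (tail_block + log->push_distance) % log->size_blocks;
}
```

With a 1024 block log, toy_log_mount() stores a push distance of 768
blocks, and a tail at block 512 gives a push target of block 256 after
wrapping. The point of the split is that the per-transaction path only
reads a stored constant, and the tail-dependent part is computed by
the consumer on demand.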