On Wed, Sep 11, 2024 at 7:28 PM David Wang <00107082@xxxxxxx> wrote:
>
> At 2024-07-02 05:58:50, "Kent Overstreet" <kent.overstreet@xxxxxxxxx> wrote:
> >On Mon, Jul 01, 2024 at 10:23:32AM GMT, David Wang wrote:
> >> Hi Suren,
> >>
> >> At 2024-07-01 03:33:14, "Suren Baghdasaryan" <surenb@xxxxxxxxxx> wrote:
> >> >On Mon, Jun 17, 2024 at 8:33 AM David Wang <00107082@xxxxxxx> wrote:
> >> >>
> >> >> An accumulated call counter can be used to evaluate the rate of
> >> >> memory allocation via delta(counters)/delta(time). This metric can
> >> >> help analyze performance behaviour, e.g. when tuning cache sizes.
> >> >
> >> >Sorry for the delay, David.
> >> >IIUC with this counter you can identify the number of allocations ever
> >> >made from a specific code location. Could you please clarify the usage
> >> >a bit more? Is the goal to see which locations are the most active and
> >> >the rate at which allocations are made there? How will that
> >> >information be used?
> >>
> >> Cumulative counters can be sampled together with a timestamp: say at T1
> >> a monitoring tool reads a value V1, and one sampling interval later, at
> >> T2, it reads a value V2. The average allocation rate can then be
> >> evaluated as (V2-V1)/(T2-T1). (The accuracy depends on the sampling
> >> interval.)
> >>
> >> This information may help identify where memory allocation is
> >> unnecessarily frequent, so some performance could be gained by
> >> allocating less often. The performance "gain" is just a guess; I do
> >> not have a valid example.
> >
> >Easier to just run perf...
>
> Hi,
>
> To Kent:
> It is oddly fitting to be replying to this while I was debugging a
> performance issue in bcachefs :)
>
> Yes, it is true that performance bottlenecks can be identified with perf,
> but normally perf is not running continuously (although there are some
> continuous-profiling projects out there). Also, memory allocation is
> usually not the biggest bottleneck, so its impact may not be easily
> picked up by perf.
>
> Well, in the case of
> https://lore.kernel.org/lkml/20240906154354.61915-1-00107082@xxxxxxx/,
> the memory allocation was picked up by perf, though. But with this patch
> it is easier to spot that the allocation behavior is quite different:
> when performance was bad, the average rate for
> "fs/bcachefs/io_write.c:113 func:__bio_alloc_page_pool" was 400k/s,
> while when performance was good, the rate was less than 200/s.
>
> (I have a sampling tool that collects /proc/allocinfo; the data is stored
> in Prometheus, and the rate is calculated and plotted via the Prometheus
> expression:
> irate(mem_profiling_count_total{file=~"fs/bcachefs.*", func="__bio_alloc_page_pool"}[5m]))
>
> I hope this is a valid example demonstrating the usefulness of cumulative
> memory allocation counters for performance issues.

Hi David,
I agree with Kent that this feature should be behind a kconfig flag. We
don't want to impose the overhead on users who do not need this feature.
Thanks,
Suren.

>
> Thanks
> David
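
For reference, a minimal sketch of the kind of /proc/allocinfo sampler David
describes: it reads the file twice and prints per-callsite allocation rates
as delta(counter)/delta(time), i.e. the same (V2-V1)/(T2-T1) calculation
quoted above. The data-line layout ("<bytes> <calls> <tag>") and the column
holding the proposed cumulative call counter are assumptions here, not taken
from the patch; adjust COUNTER_COLUMN to whatever the patched file exposes.

#!/usr/bin/env python3
# Sketch: sample /proc/allocinfo twice and print per-callsite allocation
# rates as delta(counter)/delta(time).  The column index of the cumulative
# call counter is an assumption -- adjust it for the actual file format.
import time

ALLOCINFO = "/proc/allocinfo"
COUNTER_COLUMN = 1   # assumed position of the cumulative call counter
INTERVAL = 5.0       # sampling interval in seconds

def sample():
    """Return {tag: counter}, skipping header/comment lines."""
    counters = {}
    with open(ALLOCINFO) as f:
        for line in f:
            parts = line.split()
            # Skip lines that do not have a numeric field in the counter column.
            if len(parts) <= COUNTER_COLUMN or not parts[COUNTER_COLUMN].isdigit():
                continue
            # Tag is everything after the counter column,
            # e.g. "fs/bcachefs/io_write.c:113 func:__bio_alloc_page_pool".
            tag = " ".join(parts[COUNTER_COLUMN + 1:])
            counters[tag] = int(parts[COUNTER_COLUMN])
    return counters

t1, v1 = time.time(), sample()
time.sleep(INTERVAL)
t2, v2 = time.time(), sample()

rates = {tag: (v2[tag] - v1.get(tag, 0)) / (t2 - t1) for tag in v2}
for tag, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True)[:20]:
    print(f"{rate:12.1f} allocs/s  {tag}")

For longer-running monitoring, the same per-tag counters can instead be
exported to Prometheus, as David does, and the rate derived with irate()
rather than computing the delta by hand.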