Hi.

On Sat, Mar 12, 2022 at 07:07:15PM +0000, Shakeel Butt <shakeelb@xxxxxxxxxx> wrote:
> So, I will focus on the error rate in this email.

(OK, I'll stick to the error estimate (for the long term) in this message
and will send another about the current patch.)

> [...]
> > The benefit this was traded for was the greater accuracy, the possible
> > error is:
> > - before
> >   - O(nr_cpus * nr_cgroups(subtree) * MEMCG_CHARGE_BATCH)	(1)
>
> Please note that (1) is the possible error for each stat item and
> without any time bound.

I agree (I forgot to highlight that this can get stuck forever).

> > - after
> >   O(nr_cpus * MEMCG_CHARGE_BATCH)	// sync. flush
>
> The above is across all the stat items.

Can it be used to argue about the error? E.g.

	nr_cpus * MEMCG_CHARGE_BATCH / nr_counters

looks appealing, but that is IMO too optimistic. The individual item
updates are correlated, so in practice a single item would see a lower
error than my first relation, but without delving too much into the
correlations the upper bound is independent of nr_counters.

> I don't get the reason of breaking 'cr' into individual stat item or
> counter. What is the benefit? We want to keep the error rate decoupled
> from the number of counters (or stat items).

It's just a model; it should capture that every stat item (change)
contributes to the common error estimate. (So it moves more towards the
nr_cpus * MEMCG_CHARGE_BATCH / nr_counters per-item error, but here
we're asking about processing time.)

> [...]
> My main reason behind trying NR_MEMCG_EVENTS was to reduce flush_work by
> reducing nr_counters and I don't think nr_counters should have an impact
> on Δt.

The higher the number of items that are changing, the sooner they
accumulate the target error, no?

(Δt is not the periodic flush period, it's the variable time between two
sync flushes.)

Michal
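
To make the magnitudes above concrete, here is a minimal userspace sketch
(not kernel code) that plugs hypothetical numbers into the three bounds
being discussed: the per-item "before" error (1), the all-items "after"
error at a sync flush, and the optimistic per-item split by nr_counters.
The values of nr_cpus, nr_cgroups and nr_counters are made up for
illustration, and MEMCG_CHARGE_BATCH = 32 is assumed here rather than
taken from the kernel headers.

	/*
	 * Illustrative sketch only: compares the error bounds from the
	 * discussion above using hypothetical machine/subtree sizes.
	 */
	#include <stdio.h>

	#define MEMCG_CHARGE_BATCH 32UL	/* assumed per-cpu batch size */

	int main(void)
	{
		unsigned long nr_cpus = 64;	/* hypothetical CPU count */
		unsigned long nr_cgroups = 100;	/* hypothetical subtree size */
		unsigned long nr_counters = 40;	/* rough, illustrative number of stat items */

		/* (1) before: possible error per stat item, with no time bound */
		unsigned long before = nr_cpus * nr_cgroups * MEMCG_CHARGE_BATCH;

		/* after: possible error across all stat items at a sync flush */
		unsigned long after_total = nr_cpus * MEMCG_CHARGE_BATCH;

		/*
		 * The "appealing but too optimistic" per-item estimate: it
		 * assumes the error spreads evenly over the counters, which
		 * correlated updates do not guarantee.
		 */
		unsigned long after_per_item = after_total / nr_counters;

		printf("before, per stat item:          %lu\n", before);
		printf("after, all stat items combined: %lu\n", after_total);
		printf("after, per item (optimistic):   %lu\n", after_per_item);
		printf("after, per item (upper bound):  %lu\n", after_total);
		return 0;
	}

With these made-up numbers the "before" bound is two orders of magnitude
larger than the "after" bound, which is the accuracy trade-off under
discussion; the division by nr_counters is shown only to illustrate why
the per-item split looks appealing even though correlated updates keep
the safe per-item upper bound at the all-items figure.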