On Tue, Jun 25, 2024 at 3:35 PM Christoph Lameter (Ampere) <cl@xxxxxxxxx> wrote:
>
> On Tue, 25 Jun 2024, Yosry Ahmed wrote:
>
> >> In my reply above, I am not arguing to go back to the older
> >> stats_flush_ongoing situation. Rather I am discussing what should be the
> >> best eventual solution. From the vmstats infra, we can learn that
> >> frequent async flushes along with no sync flush, users are fine with the
> >> 'non-determinism'. Of course cgroup stats are different from vmstats
> >> i.e. are hierarchical but I think we can try out this approach and see
> >> if this works or not.
> >
> > If we do not do sync flushing, then the same problem that happened
> > with stats_flush_ongoing could occur again, right? Userspace could
> > read the stats after an event, and get a snapshot of the system before
> > that event.
> >
> > Perhaps this is fine for vmstats if it has always been like that (I
> > have no idea), or if no users make assumptions about this. But for
> > cgroup stats, we have use cases that rely on this behavior.
>
> vmstat updates are triggered initially as needed by the shepherd task and
> there is no requirement that this is triggered simultaneously. We
> could actually randomize the intervals in vmstat_update() a bit if this
> will help.

The problem is that for cgroup stats, the behavior has been that a
userspace read triggers a flush (i.e. propagates pending updates). We
have use cases that depend on this. If we switch to the vmstat model,
where updates are triggered independently of user reads, that is a
behavioral change.
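
To make the difference concrete, here is a toy userspace model of the two
read semantics. This is only a sketch, not kernel code: every name in it
(toy_stats, toy_update, toy_flush, toy_read_sync, toy_read_nosync) is
made up for illustration, and it ignores the hierarchy, locking, and
flush thresholds entirely.

```c
/* Toy userspace model; all names here are hypothetical, not kernel APIs. */
#include <assert.h>

#define NR_CPUS 4

struct toy_stats {
	long percpu_delta[NR_CPUS];	/* updates not yet propagated */
	long total;			/* value visible to readers */
};

/* Record an event on one CPU; only the per-CPU delta is touched. */
static void toy_update(struct toy_stats *s, int cpu, long delta)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	s->percpu_delta[cpu] += delta;
}

/* Propagate all pending per-CPU deltas into the visible total. */
static void toy_flush(struct toy_stats *s)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		s->total += s->percpu_delta[cpu];
		s->percpu_delta[cpu] = 0;
	}
}

/* Flush-on-read: the reader always sees events that happened before it. */
static long toy_read_sync(struct toy_stats *s)
{
	toy_flush(s);
	return s->total;
}

/* No sync flush: the reader sees whatever the last async flush left. */
static long toy_read_nosync(const struct toy_stats *s)
{
	return s->total;
}
```

With toy_read_nosync(), an update followed immediately by a read returns
the pre-update total until something else calls toy_flush(). That is the
"snapshot of the system before that event" case described above, and it
is what userspace would start seeing if reads stopped flushing.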