On Fri, Nov 27, 2020 at 5:40 PM Vlastimil Babka <vbabka@xxxxxxx> wrote:
>
> On 11/25/20 3:15 PM, Mohamed Alzayat wrote:
> > Hi Everyone,
> >
> > I have noticed a change in the synchrony of updating the soft-dirty
> > bits in recent kernel versions (5.6+). More precisely, up to kernel
> > v5.5, the soft-dirty bits as parsed from /proc/pid/pagemap accurately
> > capture the dirtied pages. Recently, I started testing on kernels v5.6
> > - v5.9, and I noticed that the soft-dirty bits are not immediately
> > updated.
> >
> > I have prepared a short test that repeatedly causes at least one
> > memory page to be dirtied, then scans /proc/pid/pagemap counting the
> > soft-dirty bits. The test fails if this count is zero. In my
> > observation, this test fails once in every 10-20 trials. The test
> > defaults to 100 trials and can be found at
> > https://gitlab.mpi-sws.org/-/snippets/1696
> >
> > Is this non-synchronous propagation of soft dirty bits intended? If
>
> AFAIK, not. The tracking is done by write-protecting the pages to cause a page
> fault, so it should be quite synchronous update of page table entries, and
> reading pagemap is a page table walk of those very entries.
>
> But as you have the test, it should be possible to git bisect it? Just do enough
> trials to be sure enough that no fail means indeed a "good" kernel.

Thanks for confirming, Vlastimil!

The first bad commit is:

0758cd8304942292e95a0f750c374533db378b32
asm-generic/tlb: avoid potential double flush
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0758cd8304942292e95a0f750c374533db378b32

Reverting this commit solves the problem, but this might not be the
right way of fixing it.

>
> > yes, is there a way to force the soft-dirty bits to be propagated to
> > the page map entries immediately, or is there an alternative interface
> > that has the synchronous behavior?
> >
> > Thanks in advance,
> > Mohamed Alzayat
> >
> >
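
For readers following the thread, here is a rough sketch of the kind of check the test performs: clear the soft-dirty bits, dirty a page, then scan /proc/pid/pagemap and count the entries with the soft-dirty bit set. This is not the code from the linked snippet. The helper names (count_soft_dirty, read_pagemap, clear_soft_dirty) are made up for illustration. The layout assumed here (one 64-bit entry per virtual page, soft-dirty in bit 55, and writing "4" to clear_refs to reset the bits) follows the kernel's pagemap and soft-dirty documentation.

```python
import mmap
import struct

PAGEMAP_ENTRY_SIZE = 8      # one 64-bit entry per virtual page
SOFT_DIRTY_BIT = 1 << 55    # bit 55: PTE is soft-dirty


def count_soft_dirty(raw: bytes) -> int:
    """Count the entries in a raw pagemap buffer that have soft-dirty set."""
    n = len(raw) // PAGEMAP_ENTRY_SIZE
    # Entries are native-endian unsigned 64-bit integers.
    entries = struct.unpack(f"={n}Q", raw[: n * PAGEMAP_ENTRY_SIZE])
    return sum(1 for entry in entries if entry & SOFT_DIRTY_BIT)


def read_pagemap(pid: int, start_addr: int, num_pages: int) -> bytes:
    """Read the pagemap entries covering num_pages pages from start_addr."""
    # The entry for a virtual address sits at (address / page size) * 8.
    offset = (start_addr // mmap.PAGESIZE) * PAGEMAP_ENTRY_SIZE
    with open(f"/proc/{pid}/pagemap", "rb") as f:
        f.seek(offset)
        return f.read(num_pages * PAGEMAP_ENTRY_SIZE)


def clear_soft_dirty(pid: int) -> None:
    """Reset the soft-dirty bits of every page of the given process."""
    with open(f"/proc/{pid}/clear_refs", "w") as f:
        f.write("4")
```

One trial would then call clear_soft_dirty(pid), write to a page in the target range, and check that count_soft_dirty(read_pagemap(pid, addr, npages)) is non-zero. With the behaviour reported above for v5.6 - v5.9, that count would occasionally come back as zero.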