Re: Potential bug in soft-dirty bits (with test case)

On 11/30/20 11:37 AM, Mohamed Alzayat wrote:
On Fri, Nov 27, 2020 at 5:40 PM Vlastimil Babka <vbabka@xxxxxxx> wrote:

On 11/25/20 3:15 PM, Mohamed Alzayat wrote:
> Hi Everyone,
>
> I have noticed a change in the synchrony of updating the soft-dirty
> bits in recent kernel versions (5.6+). More precisely, up to kernel
> v5.5, the soft-dirty bits as parsed from /proc/pid/pagemap accurately
> capture the dirtied pages. Recently, I started testing on kernels v5.6
> - v5.9, and I noticed that the soft-dirty bits are not immediately
> updated.
>
> I have prepared a short test that repeatedly causes at least one
> memory page to be dirtied, then scans /proc/pid/pagemap counting the
> soft-dirty bits. The test fails if this count is zero. In my
> observation, this test fails once in every 10-20 trials. The test
> defaults to 100 trials and can be found at
> https://gitlab.mpi-sws.org/-/snippets/1696
>
> Is this non-synchronous propagation of soft dirty bits intended? If

AFAIK, no. The tracking is done by write-protecting the pages so that a write
causes a page fault, so the page table entries should be updated synchronously,
and reading pagemap is a page table walk of those very entries.
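
For readers following along, here is a minimal sketch of one trial of the mechanism described above. It is not the test from the snippet linked in the original mail. It assumes a Linux kernel built with CONFIG_MEM_SOFT_DIRTY and relies on the documented interfaces: writing "4" to /proc/PID/clear_refs clears the soft-dirty bits, and bit 55 of each 64-bit /proc/PID/pagemap entry is the soft-dirty flag. The function names are made up for illustration.

```python
import ctypes
import mmap
import struct

SOFT_DIRTY_BIT = 55  # soft-dirty flag in a 64-bit pagemap entry


def is_soft_dirty(entry: int) -> bool:
    """Return True if the soft-dirty bit is set in a pagemap entry."""
    return bool((entry >> SOFT_DIRTY_BIT) & 1)


def pagemap_entry(vaddr: int) -> int:
    """Read the pagemap entry covering virtual address vaddr of this process."""
    with open("/proc/self/pagemap", "rb", buffering=0) as f:
        f.seek((vaddr // mmap.PAGESIZE) * 8)  # one 8-byte entry per page
        return struct.unpack("=Q", f.read(8))[0]


def clear_soft_dirty() -> None:
    """Clear soft-dirty bits (and write-protect the pages) for this process."""
    with open("/proc/self/clear_refs", "w") as f:
        f.write("4")


def run_trial() -> bool:
    """Dirty one page after clearing; report whether pagemap shows it dirty."""
    buf = mmap.mmap(-1, mmap.PAGESIZE)      # one anonymous page
    view = ctypes.c_char.from_buffer(buf)   # only used to learn the address
    vaddr = ctypes.addressof(view)
    buf[0] = 1                              # fault the page in
    clear_soft_dirty()
    buf[0] = 2                              # write after clearing: should fault
    dirty = is_soft_dirty(pagemap_entry(vaddr))
    del view                                # release the buffer before closing
    buf.close()
    return dirty


if __name__ == "__main__":
    misses = sum(not run_trial() for _ in range(100))
    print(f"trials where the written page was not soft-dirty: {misses}")
```

If the update is synchronous as expected, the count printed at the end should be zero.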

But since you have the test, it should be possible to git bisect it? Just run
enough trials to be reasonably sure that no failure really means a "good" kernel.
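
As a rough guide to "enough trials", here is a small sketch of the arithmetic. It assumes trials are independent and reads "fails once in every 10-20 trials" as a per-trial failure probability between 1/20 and 1/10 on a bad kernel. Both are assumptions, not something established in this thread. The function names are hypothetical.

```python
import math


def miss_probability(p_fail: float, trials: int) -> float:
    """Chance that a bad kernel passes every one of `trials` independent trials."""
    return (1.0 - p_fail) ** trials


def trials_needed(p_fail: float, max_miss: float) -> int:
    """Smallest trial count for which a bad kernel slips through with
    probability at most max_miss."""
    return math.ceil(math.log(max_miss) / math.log(1.0 - p_fail))


if __name__ == "__main__":
    for p in (1 / 20, 1 / 10):
        n = trials_needed(p, 0.001)
        print(f"p_fail={p:.2f}: {n} trials, "
              f"miss probability {miss_probability(p, n):.5f}")
```

Under these assumptions, the pessimistic rate of 1/20 needs about 135 clean trials per bisect step to keep the chance of mislabelling a bad kernel as good below 0.1%. The default of 100 trials leaves that chance at roughly 0.6%.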

Thanks for confirming, Vlastimil!

The first bad commit is: 0758cd8304942292e95a0f750c374533db378b32
asm-generic/tlb: avoid potential double flush
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0758cd8304942292e95a0f750c374533db378b32

Reverting this commit solves the problem, but this might not be the
right way of fixing it.

Thanks for bisecting! Let's CC the people involved in that commit. Everything
important should be in the quoted conversation above.

Vlastimil



> yes, is there a way to force the soft-dirty bits to be propagated to
> the page map entries immediately, or is there an alternative interface
> that has the synchronous behavior?
>
> Thanks in advance,
> Mohamed Alzayat
>







