On 20/04/18 16:52, Jason Andryuk wrote:
> On Fri, Apr 20, 2018 at 11:42 AM, Jan Beulich <JBeulich@xxxxxxxx> wrote:
>>>>> On 20.04.18 at 17:25, <andrew.cooper3@xxxxxxxxxx> wrote:
>>> On 20/04/18 16:20, Jason Andryuk wrote:
>>>> Adding xen-devel and the Linux Xen maintainers.
>>>>
>>>> Summary: Some Xen users (and maybe others) are hitting a BUG in
>>>> __radix_tree_lookup() under do_swap_page() - example backtrace is
>>>> provided at the end. Matthew Wilcox provided a band-aid patch that
>>>> prints errors like the following instead of triggering the bug.
>>>>
>>>> Skylake 32bit PAE Dom0:
>>>> Bad swp_entry: 80000000
>>>> mm/swap_state.c:683: bad pte d3a39f1c(8000000400000000)
>>>>
>>>> Ivy Bridge 32bit PAE Dom0:
>>>> Bad swp_entry: 40000000
>>>> mm/swap_state.c:683: bad pte d3a05f1c(8000000200000000)
>>>>
>>>> Other 32bit DomU:
>>>> Bad swp_entry: 4000000
>>>> mm/swap_state.c:683: bad pte e2187f30(8000000200000000)
>>>>
>>>> Other 32bit:
>>>> Bad swp_entry: 2000000
>>>> mm/swap_state.c:683: bad pte ef3a3f38(8000000100000000)
>>>>
>>>> The Linux bugzilla has more info
>>>> https://bugzilla.kernel.org/show_bug.cgi?id=198497
>>>>
>>>> This may not be exclusive to Xen Linux, but most of the reports are on
>>>> Xen. Matthew wonders if Xen might be stepping on the upper bits of a
>>>> pte.
>>> Yes - Xen does use the upper bits of a PTE, but only 1 in release
>>> builds, and a second in debug builds. I don't understand where you're
>>> getting the 3rd bit in there.
>> The former supposedly is _PAGE_GUEST_KERNEL, which we use for 64-bit
>> guests only. Above talk is of 32-bit guests only.
>>
>> In addition both this and _PAGE_GNTTAB are used on present PTEs only,
>> while above talk is about swap entries.
> This hits a BUG going through do_swap_page, but it seems like users
> don't think they are actually using swap at the time. One reporter
> didn't have any swap configured. Some of this information was further
> down in my original message.
>
> I'm wondering if somehow we have a PTE that should be empty and should
> be lazily filled. For some reason, the entry has some bits set and is
> causing the trouble. Would Xen mess with the PTEs in that case?

Any PTE with the present bit clear will be accepted and used unmodified.

That said, I believe there is some batching of updates for efficiency
reasons in the PVops layer of the kernel, which might end up causing a
disconnect between what the swap system thinks and what the actual PTEs
show when read.

~Andrew