On 20/04/18 21:20, Boris Ostrovsky wrote:
> On 04/20/2018 12:02 PM, Jan Beulich wrote:
>>>>> On 20.04.18 at 17:52, <jandryuk@xxxxxxxxx> wrote:
>>> On Fri, Apr 20, 2018 at 11:42 AM, Jan Beulich <JBeulich@xxxxxxxx> wrote:
>>>>>>> On 20.04.18 at 17:25, <andrew.cooper3@xxxxxxxxxx> wrote:
>>>>> On 20/04/18 16:20, Jason Andryuk wrote:
>>>>>> Adding xen-devel and the Linux Xen maintainers.
>>>>>>
>>>>>> Summary: Some Xen users (and maybe others) are hitting a BUG in
>>>>>> __radix_tree_lookup() under do_swap_page() - example backtrace is
>>>>>> provided at the end. Matthew Wilcox provided a band-aid patch that
>>>>>> prints errors like the following instead of triggering the bug.
>>>>>>
>>>>>> Skylake 32bit PAE Dom0:
>>>>>> Bad swp_entry: 80000000
>>>>>> mm/swap_state.c:683: bad pte d3a39f1c(8000000400000000)
>>>>>>
>>>>>> Ivy Bridge 32bit PAE Dom0:
>>>>>> Bad swp_entry: 40000000
>>>>>> mm/swap_state.c:683: bad pte d3a05f1c(8000000200000000)
>>>>>>
>>>>>> Other 32bit DomU:
>>>>>> Bad swp_entry: 4000000
>>>>>> mm/swap_state.c:683: bad pte e2187f30(8000000200000000)
>>>>>>
>>>>>> Other 32bit:
>>>>>> Bad swp_entry: 2000000
>>>>>> mm/swap_state.c:683: bad pte ef3a3f38(8000000100000000)
>>>>>>
>>>>>> The Linux bugzilla has more info
>>>>>> https://bugzilla.kernel.org/show_bug.cgi?id=198497
>>>>>>
>>>>>> This may not be exclusive to Xen Linux, but most of the reports are on
>>>>>> Xen. Matthew wonders if Xen might be stepping on the upper bits of a
>>>>>> pte.
>>>>> Yes - Xen does use the upper bits of a PTE, but only 1 in release
>>>>> builds, and a second in debug builds. I don't understand where you're
>>>>> getting the 3rd bit in there.
>>>> The former supposedly is _PAGE_GUEST_KERNEL, which we use for 64-bit
>>>> guests only. Above talk is of 32-bit guests only.
>>>>
>>>> In addition both this and _PAGE_GNTTAB are used on present PTEs only,
>>>> while above talk is about swap entries.
>>> This hits a BUG going through do_swap_page, but it seems like users
>>> don't think they are actually using swap at the time. One reporter
>>> didn't have any swap configured. Some of this information was further
>>> down in my original message.
>>>
>>> I'm wondering if somehow we have a PTE that should be empty and should
>>> be lazily filled. For some reason, the entry has some bits set and is
>>> causing the trouble. Would Xen mess with the PTEs in that case?
>> As said in my previous reply - both of the bits Andrew has mentioned can
>> only ever be set when the present bit is also set (which doesn't appear to
>> be the case here). The set bits above are actually in the range of bits
>> designated to the address, which Xen wouldn't ever play with.
>
>
> The bug description starts with: "On a Xen VM running as pvh"
>
> So is this a PV or a PVH guest?

The stack backtrace suggests PV.

Juergen
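
For reference, a minimal sketch (not part of the original thread) that decodes the four reported PTE values, to make the point about the present bit and the address range concrete. It assumes the architectural x86 PAE entry layout: bit 0 is the present bit, bit 63 is NX, and bits 12 to 51 are the range reserved for the frame address (how many of those a given CPU really implements depends on its physical address width). The helper names `set_bits` and `describe` are made up for illustration.

```python
# Architectural x86 PAE page-table entry layout (assumed here; the number of
# address bits a CPU actually implements depends on its MAXPHYADDR).
PRESENT_BIT = 0          # P: entry maps a page
NX_BIT = 63              # NX/XD: execute-disable
ADDR_LOW, ADDR_HIGH = 12, 51   # bit range reserved for the frame address

# PTE values copied verbatim from the report quoted above.
REPORTED_PTES = {
    "Skylake 32bit PAE Dom0": 0x8000000400000000,
    "Ivy Bridge 32bit PAE Dom0": 0x8000000200000000,
    "Other 32bit DomU": 0x8000000200000000,
    "Other 32bit": 0x8000000100000000,
}


def set_bits(value):
    """Return the positions of all set bits in a 64-bit value, lowest first."""
    return [bit for bit in range(64) if (value >> bit) & 1]


def describe(pte):
    """Split the set bits of a PTE into present, NX, address-range and other."""
    bits = set_bits(pte)
    return {
        "present": PRESENT_BIT in bits,
        "nx": NX_BIT in bits,
        "address_bits": [b for b in bits if ADDR_LOW <= b <= ADDR_HIGH],
        "other_bits": [
            b for b in bits
            if b not in (PRESENT_BIT, NX_BIT) and not ADDR_LOW <= b <= ADDR_HIGH
        ],
    }


if __name__ == "__main__":
    for label, pte in REPORTED_PTES.items():
        info = describe(pte)
        print(f"{label}: pte={pte:#018x} {info}")
```

Under these assumptions, every reported value has bit 63 set, the present bit clear, and exactly one further bit set (34, 33, 33 and 32 respectively), all of which fall inside the address range.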