On 14/02/2018 03:04, Josh Poimboeuf wrote:
> On Sun, Feb 11, 2018 at 02:39:41PM +0100, Marc Haber wrote:
>> Hi,
>>
>> after in total nine weeks of bisecting, broken filesystems, service
>> outages (thankfully on unimportant systems), 4.15 seems to have fixed the
>> issue. After going to 4.15, the crashes never happened again.
>>
>> They have, however, happened with each and every 4.14 release I tried,
>> which I stopped doing with 4.14.15 on Jan 28.
>>
>> This means, for me, that the issue is fixed and that I have just wasted
>> nine weeks of time.
>>
>> For you, this means that you have a crippling, data-eating issue in the
>> current long-term release kernel. I do sincerely hope that I never have
>> to lay my eye on any 4.14 kernel and hope that no major distribution
>> will release with this version.
>
> I saw something similar today, also in kvm_async_pf_task_wait(). I had
> -tip in the guest (based on 4.16.0-rc1) and Fedora
> 4.14.16-300.fc27.x86_64 on the host.

Hi Josh/Marc,

this is fixed by

    commit 2a266f23550be997d783f27e704b9b40c4010292
    Author: Haozhong Zhang <haozhong.zhang@xxxxxxxxx>
    Date:   Wed Jan 10 21:44:42 2018 +0800

        KVM MMU: check pending exception before injecting APF

        For example, when two APF's for page ready happen after one exit and
        the first one becomes pending, the second one will result in #DF.
        Instead, just handle the second page fault synchronously.

        Reported-by: Ross Zwisler <zwisler@xxxxxxxxx>
        Message-ID: <CAOxpaSUBf8QoOZQ1p4KfUp0jq76OKfGY4Uxs-Gg8ngReD99xww@xxxxxxxxxxxxxx>
        Reported-by: Alec Blayne <ab@xxxxxxxxx>
        Signed-off-by: Haozhong Zhang <haozhong.zhang@xxxxxxxxx>
        Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>

and it will be in 4.14.20. Unfortunately I only heard about this issue
last week.

Thanks,

Paolo