On Mon, 2017-01-23 at 12:25 -0500, Theodore Ts'o wrote:
> On Mon, Jan 23, 2017 at 07:10:00AM -0500, Jeff Layton wrote:
> > > > Well, except for QEMU/KVM, Kevin has already confirmed that using
> > > > Direct I/O is a completely viable solution. (And I'll add it solves a
> > > > bunch of other problems, including page cache efficiency....)
> >
> > Sure, O_DIRECT does make this simpler (though it's not always the most
> > efficient way to do I/O). I'm more interested in whether we can improve
> > the error handling with buffered I/O.
>
> I just want to make sure we're designing a solution that will actually
> be _used_, because it is a good fit for at least one real-world use
> case.
>

Exactly. Asking how the QEMU folks would like to be able to interact
with the kernel is not the same as promising to implement said
solution.

Still, I think it's a valid question, and I'll pose it in terms of NFS,
though I think the semantics apply to other situations as well. I'm
mostly just asking to get a better idea of what the KVM folks would
really like to have happen in this situation. I don't think they want
to error out on every network blip, but in the face of a hung mount
that isn't making progress in writeback, what would they like to be
able to do to resolve it?

For instance, with NFS you can generally send a SIGKILL to the process
to make it abandon O_DIRECT writes. But tasks accessing NFS mounts
still seem to get stuck in buffered writeback if the server goes away,
generally waiting on the page bits to clear in uninterruptible sleeps.
Would better handling of SIGKILL when waiting on buffered writeback be
what the QEMU devs would like? That seems like a reasonable thing to
consider.

> Is QEMU/KVM using volumes that are stored over NFS really used in the
> real world? Especially one where you want a huge amount of reliability
> and recovery after some kind of network failure? If we are talking
> about customers who are going to suspend the VM and restart it on
> another server, that presumes a fairly large installation size and
> enough servers; would they *really* want to use a single point of
> failure such as an NFS filer? Even if it was a proprietary,
> purpose-built NFS filer? Why wouldn't they be using RADOS and Ceph
> instead, for example?

There's nothing specific to NFS in what I was asking. I think cephfs
has similar behavior when the client can't reach any of its MDSs, for
instance.

-- 
Jeff Layton <jlayton@xxxxxxxxxxxxxxx>
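
To make the O_DIRECT point concrete, here's a rough userspace sketch
(my own illustration, not code from QEMU or the kernel). With O_DIRECT
the write() is submitted to the server or device directly, so an I/O
failure comes back as -1 with errno set on that very call, instead of
surfacing later during background writeback. The 4096-byte alignment
is just an assumption for the example; the real requirement depends on
the filesystem and the underlying device.

/* Sketch: with O_DIRECT, I/O errors are reported synchronously at write() time. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(int argc, char **argv)
{
	const size_t align = 4096, len = 4096;	/* assumed alignment */
	void *buf;
	int fd;

	if (argc < 2) {
		fprintf(stderr, "usage: %s <file>\n", argv[0]);
		return 1;
	}

	fd = open(argv[1], O_WRONLY | O_CREAT | O_DIRECT, 0644);
	if (fd < 0) {
		perror("open");
		return 1;
	}

	/* O_DIRECT needs a suitably aligned buffer, length and offset. */
	if (posix_memalign(&buf, align, len)) {
		fprintf(stderr, "posix_memalign failed\n");
		return 1;
	}
	memset(buf, 0, len);

	/* Any I/O error is reported here, directly to the caller. */
	if (write(fd, buf, len) < 0)
		perror("write");

	free(buf);
	return close(fd) ? 1 : 0;
}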
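
And for contrast, a companion sketch of the buffered case (again just
my illustration). Here write() normally only dirties the page cache
and returns success; if the server then goes away, the failure or the
hang is only seen when the kernel tries to write the pages back, and
the only point where this program can observe that is fsync(). On a
hung NFS mount it is that fsync(), or a later wait on writeback, that
ends up stuck in an uninterruptible sleep, which is the case the
SIGKILL question above is about.

/* Sketch: with buffered I/O, a writeback error is (at best) visible at fsync(). */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(int argc, char **argv)
{
	char buf[4096];
	int fd, ret = 0;

	if (argc < 2) {
		fprintf(stderr, "usage: %s <file>\n", argv[0]);
		return 1;
	}

	fd = open(argv[1], O_WRONLY | O_CREAT, 0644);
	if (fd < 0) {
		perror("open");
		return 1;
	}

	memset(buf, 0, sizeof(buf));

	/* Almost always "succeeds": the data only went to the page cache. */
	if (write(fd, buf, sizeof(buf)) < 0) {
		perror("write");
		ret = 1;
	}

	/*
	 * First (and often only) chance to hear about a writeback error,
	 * and the call that can block indefinitely if the mount is hung.
	 */
	if (fsync(fd) < 0) {
		perror("fsync");
		ret = 1;
	}

	if (close(fd) < 0) {
		perror("close");
		ret = 1;
	}
	return ret;
}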