On Wed, Jun 28, 2017 at 12:57:59PM -0400, Rik van Riel wrote:
> On Mon, 2017-06-26 at 11:11 -0400, Jeff Moyer wrote:
> > Lukas Czerner <lczerner@xxxxxxxxxx> writes:
> >
> > > > The thing we do is a best effort thing that more or less
> > > > guarantees that if you do say buffered IO and direct IO after
> > > > that, it will work reasonably. However if direct and buffered
> > > > IO can race, bad luck for your data. I don't think we want to
> > > > sacrifice any performance of AIO DIO (and offloading of direct
> > > > IO completion to a workqueue so that we can do invalidation
> > > > costs noticeable amount of performance) for supporting such
> > > > usecase.
> > >
> > > What Jeff proposed would sacrifice performance for the case
> > > where AIO DIO write does race with buffered IO - the situation
> > > we agree is not ideal and should be avoided anyway. For the rest
> > > of AIO DIO this should have no effect right? If true, I'd say
> > > this is a good effort to make sure we do not have disparity
> > > between page cache and disk.
> >
> > Exactly. Jan, are you concerned about impacting performance for
> > mixed buffered I/O and direct writes? If so, we could look into
> > restricting the process context switch further, to just overlapping
> > buffered and direct I/O (assuming there are no locking issues).
> >
> > Alternatively, since we already know this is racy, we don't actually
> > have to defer I/O completion to process context. We could just
> > complete the I/O as we normally would, but also queue up an
> > invalidate_inode_pages2_range work item. It will be asynchronous,
> > but this is best effort, anyway.
> >
> > As Eric mentioned, the thing that bothers me is that we have invalid
> > data lingering in the page cache indefinitely.
> >
> Given that the requirement is that the page cache
> gets invalidated after IO completion, would it be
> possible to defer only the page cache invalidation
> to task context, and handle the rest of the IO
> completion in interrupt context?

Hi,

if I am reading it correctly that's basically how it works now for the
IO that has defer_completion set (filesystems set this to do extent
conversion at the completion). We'd use the same path here for the
invalidation.

-Lukas

>
> --
> All rights reversed
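
For illustration only, the split discussed in this thread (finish the
I/O right away in the completion handler, and push only the page cache
invalidation to a worker that runs later) can be modeled in plain
userspace C. This is a toy sketch under loud assumptions, not kernel
code: every toy_* name is hypothetical, the "page cache" is a small
bool array, and the "workqueue" is a fixed array drained by hand. Real
code would presumably use the kernel workqueue machinery and
invalidate_inode_pages2_range() instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NPAGES  8   /* size of the toy page cache (assumption) */
#define TOY_MAXWORK 16  /* depth of the toy work queue (assumption) */

/* Toy stand-in for a file's page cache: true means a cached copy exists. */
struct toy_mapping {
	bool cached[TOY_NPAGES];
};

/* One deferred invalidation request, analogous to a queued work item. */
struct toy_work {
	struct toy_mapping *mapping;
	size_t first, last;	/* inclusive page range */
};

struct toy_workqueue {
	struct toy_work items[TOY_MAXWORK];
	size_t head, tail;
};

/* The invalidation itself; only ever called from toy_run_work(). */
static void toy_invalidate_range(struct toy_mapping *m,
				 size_t first, size_t last)
{
	for (size_t i = first; i <= last && i < TOY_NPAGES; i++)
		m->cached[i] = false;
}

/*
 * Models the completion handler ("interrupt context"): the I/O is
 * marked done immediately, and the invalidation is only queued.
 * Returns false if the queue is full; the request is then dropped,
 * which matches the best-effort nature of the scheme.
 */
static bool toy_dio_complete(struct toy_workqueue *wq,
			     struct toy_mapping *m,
			     size_t first, size_t last, bool *io_done)
{
	*io_done = true;
	if (wq->tail == TOY_MAXWORK)
		return false;
	wq->items[wq->tail++] = (struct toy_work){ m, first, last };
	return true;
}

/*
 * Models the worker ("process context"): drains every queued request
 * and returns how many were handled.
 */
static size_t toy_run_work(struct toy_workqueue *wq)
{
	size_t handled = 0;

	while (wq->head < wq->tail) {
		struct toy_work *w = &wq->items[wq->head++];

		toy_invalidate_range(w->mapping, w->first, w->last);
		handled++;
	}
	wq->head = wq->tail = 0;	/* queue is empty, reuse the slots */
	return handled;
}
```

Between toy_dio_complete() and toy_run_work() the stale pages are
still marked cached, which is the window in which a buffered read
could still see old data, so the model is best effort in the same
sense as the proposal.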