On Thu, Jun 13, 2019 at 04:30:36PM -1000, Linus Torvalds wrote:
> On Thu, Jun 13, 2019 at 1:56 PM Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> >
> > That said, the page cache is still far, far slower than direct IO,
>
> Bullshit, Dave.
>
> You've made that claim before, and it's been complete bullshit before
> too, and I've called you out on it then too.

Yes, your last run of insulting rants on this topic resulted in me
pointing out your CoC violations because you were unable to listen
or discuss the subject matter in a civil manner. And you've started
right where you left off last time....

> Why do you continue to make this obviously garbage argument?
>
> The key word in the "page cache" name is "cache".
>
> Caches work, Dave.

Yes, they do, I see plenty of cases where the page cache works just
fine because it is still faster than most storage. But that's _not
what I said_.

Indeed, you haven't even bothered to ask me to clarify what I was
referring to in the statement you quoted. IOWs, you've taken _one
single statement_ I made from a huge email about complexities in
dealing with IO concurrency, the page cache and architectural flaws
in the existing code, quoted it out of context, fabricated a
completely new context and started ranting about how I know nothing
about how caches or the page cache work.

Not very professional but, unfortunately, an entirely predictable
and _expected_ response.

Linus, nobody can talk about direct IO without you screaming and
tossing all your toys out of the crib. If you can't be civil or you
find yourself writing some condescending "caching 101" explanation
to someone who has spent the last 15+ years working with filesystems
and caches, then you're far better off not saying anything.
---

So, in the interests of further _civil_ discussion, let me clarify
my statement for you: for a highly concurrent application that is
crunching through bulk data on large files on high throughput
storage, the page cache is still far, far slower than direct IO.

Which comes back to this statement you made:

> Is direct IO faster when you *know* it's not cached, and shouldn't
> be cached? Sure. But that's actually quite rare.

This is where I think you get the wrong end of the stick, Linus.

The world I work in has a significant proportion of applications
where the data set is too large to be cached effectively or is
better cached by the application than the kernel. IOWs, data being
cached efficiently by the page cache is the exception rather than
the rule. Hence, they use direct IO because it is faster than the
page cache. This is common in applications like major enterprise
databases, HPC apps, data mining/analysis applications, etc. and
there's an awful lot of the world that runs on these apps....

-Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx