On Wed, Sep 12, 2018 at 02:11:30PM +0200, Jan Kara wrote:
>
> Yes, I guess you're speaking about the one Chris Mason mentioned [1].
> Essentially it's a priority inversion where jbd2 thread gets blocked behind
> writeback done on behalf of a heavily restricted process. It actually is
> not related to dirty throttling or anything like that. And the solution for
> this priority inversion is to use unwritten extents for writeback
> unconditionally as I wrote in that thread. The core of this is implemented
> and hidden behind dioread_nolock mount option but it needs some serious
> polishing work and testing...
>
> [1] https://marc.info/?l=linux-fsdevel&m=151688776319077

I've actually been considering making dioread_nolock the default when
page_size == block_size.

Arguments in favor:

1) Improves AIO latency in some circumstances
2) Improves parallel DIO read performance
3) Should address the block-cg throttling priority inversion problem

Arguments against:

1) Hasn't seen much usage outside of Google (where it makes a big
   difference for fast flash workloads; see (1) and (2) above)
2) Dioread_nolock only works when page_size == block_size; so this
   implies we would be using a different codepath depending on the
   block size.
3) generic/500 (dm-thin ENOSPC hitter with concurrent discards) fails
   with dioread_nolock, but not in the 4k workload

Liu, can you try out mount -o dioread_nolock and see if this addresses
your problem?  If so, maybe this is the development cycle where we
finally change the default.

					- Ted
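
For anyone wanting to check whether a given mount meets the
page_size == block_size condition before trying the option, here is a
rough sketch.  The function names are made up for illustration, and it
assumes that the f_bsize value reported by statvfs reflects the
filesystem block size, which should be verified for the filesystem in
question.

```python
import os


def dioread_nolock_eligible(page_size: int, block_size: int) -> bool:
    """Hypothetical helper: the condition is simply page_size == block_size."""
    return page_size == block_size


def check_mount_point(path: str) -> bool:
    """Compare the system page size with the block size reported for `path`.

    Assumption: statvfs f_bsize matches the filesystem block size.
    """
    page_size = os.sysconf("SC_PAGE_SIZE")
    block_size = os.statvfs(path).f_bsize
    return dioread_nolock_eligible(page_size, block_size)
```

Calling check_mount_point("/mnt/test") on a hypothetical mount point
returns True when the two sizes match; it only tests the size
condition and says nothing about whether the option is actually
enabled or behaves well under a given workload.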