On Wed, Mar 26, 2014 at 4:11 PM, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
> On Wed, Mar 26, 2014 at 3:35 PM, David Lang <david@xxxxxxx> wrote:
>> On Wed, 26 Mar 2014, Andy Lutomirski wrote:
>>
>>>>> I'm not sure I understand the request queue stuff, but here's an
>>>>> idea.  The block core contains this little bit of code:
>>>>
>>>> I haven't read enough of the code yet to comment intelligently ;)
>>>
>>> My little patch doesn't seem to help.  I'm either changing the wrong
>>> piece of code entirely or I'm penalizing readers and writers too
>>> much.
>>>
>>> Hopefully some real block layer people can comment on whether a
>>> refinement of this idea could work.  The behavior I want is for
>>> writeback to be limited to a smallish fraction of the total request
>>> queue size -- writeback should be able to enqueue enough requests to
>>> get decent sorting performance, but not so many that it prevents the
>>> io scheduler from doing a good job on non-writeback I/O.
>>
>> The thing is, if there are no reads waiting, why not use every bit of
>> disk I/O available to write?  If you can do that reliably while using
>> only part of the queue, fine, but aren't you getting fairly close to
>> just having separate queues for reading and writing with such a
>> restriction?
>
> Hmm.
>
> I wonder what the actual effect of queue length is on throughput.  I
> suspect that using half the queue gives you well over half the
> throughput as long as the queue isn't tiny.
>
> I'm not sure I'd go so far as separate reader and writer queues -- I
> think that small synchronous writes should also not get stuck behind
> large writeback storms, but maybe that can be a secondary goal.  That
> said, separate reader and writer queues might solve the immediate
> problem.  It won't help the case where a small fsync blocks behind
> writeback, though, and that seems to be a very common cause of
> Firefox freezing on my system.
>
> Is there an easy way to do a proof-of-concept?  It would be great if
> there were a ten-line patch that implemented something like this
> correctly enough to see whether it helps.  I don't think I'm the
> right person to write it, because my knowledge of the block layer
> code is essentially nil.

I think it's at least a bit more subtle than that.  cfq distinguishes
SYNC from ASYNC, but very large fsyncs are presumably SYNC, and
deadline pays no attention to the rw flags at all.

In any case, it seems that essentially nothing prioritizes what happens
once the number of requests exceeds the congestion thresholds.  I'd
happily bet a beverage* that Postgres's slow requests are spending an
excessive amount of time waiting to get into the queue in the first
place.

* Since I'm back home now, any actual beverage transaction will be
rather delayed.
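
To put rough numbers on those congestion thresholds -- a worked example
assuming the ~3.14-era block layer defaults, where nr_requests is 128
and sync and async requests are counted separately per direction: a
direction is flagged congested once it has roughly
nr_requests - nr_requests/8 + 1 = 113 requests allocated, the queue is
marked full at 128, and __get_request() refuses allocation outright at
3 * nr_requests / 2 = 192.  A task that finds its direction full goes
to sleep on the per-direction request_list waitqueue and retries as
requests complete; nothing in that path distinguishes a
latency-sensitive reader from a writeback thread, which fits the guess
above that the slow requests are mostly waiting to get into the queue
at all.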
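
As for the ten-line proof-of-concept asked about above, here is a
minimal sketch of the idea, written against the ~3.14 request
allocation path.  WRITEBACK_FRACTION and writeback_over_limit() are
invented names for this sketch (not existing kernel symbols), the
divisor is a guess, and it is untested -- and, per the point above, a
large fsync is still SYNC, so this would not help the
fsync-behind-writeback case:

#include <linux/blkdev.h>

/*
 * Hypothetical PoC: cap async (i.e. mostly writeback) requests at a
 * fraction of nr_requests so the io scheduler always has queue slots
 * left for sync I/O.  WRITEBACK_FRACTION and writeback_over_limit()
 * are made up for this sketch.
 */
#define WRITEBACK_FRACTION	2	/* async may use at most 1/2 of the queue */

static bool writeback_over_limit(struct request_list *rl,
				 struct request_queue *q, bool is_sync)
{
	/* Leave the sync direction completely alone. */
	if (is_sync)
		return false;

	/* rl->count[BLK_RW_ASYNC] is the number of async requests allocated. */
	return rl->count[BLK_RW_ASYNC] >= q->nr_requests / WRITEBACK_FRACTION;
}

/*
 * The intended hook point is __get_request() in block/blk-core.c,
 * before the existing congestion/full checks:
 *
 *	if (writeback_over_limit(rl, q, is_sync))
 *		return NULL;
 *
 * Returning NULL makes get_request() sleep on rl->wait[is_sync] and
 * retry, exactly as it does when the queue is full -- though waking
 * the sleeper promptly would likely need the freed_request() path to
 * check this limit too.
 */

Whether half the queue is the right fraction is exactly the open
question above about queue length versus throughput; exposing the
divisor as a sysfs knob next to nr_requests would make that experiment
easier.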