Luke Lonergan wrote:
> For plans that qualify with the above conditions, the executor will issue
> blocking calls to lseek(), which will translate to a single disk actuator
> moving to the needed location in seek_time, approximately 8ms.

I doubt it's actually the lseeks that block; it's the reads/writes
after the lseeks.

> If we implement AIO and allow for multiple pending I/Os used to prefetch
> groups of qualifying tuples, basically a form of random readahead, we can
> improve the throughput for any given query by taking advantage of multiple
> disk actuators.

Rather than jumping to AIO, which is a huge change, I think we could get
much of the benefit by using posix_fadvise(WILLNEED) in strategic places
to tell the OS what pages we're going to need in the near future. If the
OS has implemented that properly, it should schedule I/Os for the
requested pages ahead of time. That would require very little change to
PostgreSQL code, and could simply be #ifdef'd away on platforms that
don't support posix_fadvise.

> Note that
> the same approach would also work to speed sequential access by overlapping
> compute and I/O.

Yes, though the OS should already be doing read-ahead for us. How
efficient it is is another question. posix_fadvise(SEQUENTIAL) could be
used to give a hint on that as well.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
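
For illustration only, here is a rough sketch of what the posix_fadvise(WILLNEED) idea might look like in C. It is not PostgreSQL code: the helper name prefetch_blocks, its parameters, and the choice to return a count of accepted hints are all assumptions made for this example. The only real API used is posix_fadvise() with POSIX_FADV_WILLNEED, and the whole call is compiled out on platforms that don't define that constant.

```c
/* Ask for the POSIX.1-2001 interfaces so <fcntl.h> declares posix_fadvise(). */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif

#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>

/*
 * Hypothetical helper (an assumption for this sketch, not an existing
 * PostgreSQL function): tell the OS that the listed blocks of fd will be
 * read soon, so it can start the I/Os before we actually call read().
 *
 * Returns the number of blocks for which the hint was accepted. On
 * platforms without POSIX_FADV_WILLNEED it does nothing and returns 0.
 */
size_t
prefetch_blocks(int fd, const unsigned long *blocks, size_t nblocks,
                size_t block_size)
{
    size_t hinted = 0;

#if defined(POSIX_FADV_WILLNEED)
    for (size_t i = 0; i < nblocks; i++)
    {
        off_t offset = (off_t) blocks[i] * (off_t) block_size;

        /*
         * posix_fadvise() returns 0 on success or an error number on
         * failure; it does not set errno. The hint is purely advisory,
         * so a failure is simply not counted.
         */
        if (posix_fadvise(fd, offset, (off_t) block_size,
                          POSIX_FADV_WILLNEED) == 0)
            hinted++;
    }
#else
    /* No posix_fadvise here: the hint is #ifdef'd away. */
    (void) fd;
    (void) blocks;
    (void) nblocks;
    (void) block_size;
#endif

    return hinted;
}
```

A caller that already knows which blocks it will visit next (for example, from a list of qualifying tuple locations) would pass those block numbers before starting to read them. Whether this actually overlaps the seeks across multiple disks depends on how the OS implements the hint. The same pattern with POSIX_FADV_SEQUENTIAL, applied once to the whole file, would be the sequential-scan counterpart.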