On Wed, Oct 6, 2010 at 5:31 PM, Ivan Voras <ivoras@xxxxxxxxxxx> wrote:
> On 10/04/10 20:49, Josh Berkus wrote:
>
>>> The other major bottleneck they ran into was a kernel one: reading from
>>> the heap file requires a couple lseek operations, and Linux acquires a
>>> mutex on the inode to do that. The proper place to fix this is
>>> certainly in the kernel but it may be possible to work around in
>>> Postgres.
>>
>> Or we could complain to Kernel.org. They've been fairly responsive in
>> the past. Too bad this didn't get posted earlier; I just got back from
>> LinuxCon.
>>
>> So you know someone who can speak technically to this issue? I can put
>> them in touch with the Linux geeks in charge of that part of the kernel
>> code.
>
> Hmmm... lseek? As in "lseek() then read() or write()" idiom? It AFAIK
> cannot be fixed since you're modifying the global "stream position"
> variable and something has got to lock that.
>
> OTOH, pread() / pwrite() don't have to do that.

While lseek is very "cheap", it is like any other system call in that
when you multiply "cheap" by "a jillion" you end up with "notable" or
even "lots". I've personally seen notable performance improvements from
switching to pread/pwrite instead of lseek+{read,write}. For platforms
that don't implement pread or pwrite, wrapper calls are trivial to
produce. One less system call is, in this case, 50% fewer.

--
Jon

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
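
For anyone following along, here is a minimal sketch of the difference described in the reply above. It is written in Python only for brevity (Postgres itself is C); `os.lseek`, `os.read` and `os.pread` are thin wrappers over the system calls of the same names on Unix-like platforms. The helper names (`read_at_seek`, `read_at_pread`, `pread_fallback`, `demo`) are made up for illustration and are not taken from Postgres or any library. The fallback is just one assumed way a wrapper might look on a platform without pread; unlike the real call it is not atomic.

```python
import os
import tempfile


def read_at_seek(fd, length, offset):
    """Two system calls per read: lseek() moves the descriptor's shared
    file position, then read() consumes from it."""
    os.lseek(fd, offset, os.SEEK_SET)
    return os.read(fd, length)


def read_at_pread(fd, length, offset):
    """One system call per read: pread() takes the offset as an argument
    and leaves the descriptor's file position untouched."""
    return os.pread(fd, length, offset)


def pread_fallback(fd, length, offset):
    """Hypothetical wrapper for a platform lacking pread(): emulate it with
    lseek() + read() and put the position back afterwards. This is NOT
    atomic, so threads sharing fd would need their own lock."""
    saved = os.lseek(fd, 0, os.SEEK_CUR)
    try:
        os.lseek(fd, offset, os.SEEK_SET)
        return os.read(fd, length)
    finally:
        os.lseek(fd, saved, os.SEEK_SET)


def demo():
    """Read the same 4 bytes three ways and report where the file
    position ends up after the first two."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"0123456789abcdef")

        os.lseek(fd, 0, os.SEEK_SET)
        a = read_at_seek(fd, 4, 10)
        pos_after_seek = os.lseek(fd, 0, os.SEEK_CUR)

        os.lseek(fd, 0, os.SEEK_SET)
        b = read_at_pread(fd, 4, 10)
        pos_after_pread = os.lseek(fd, 0, os.SEEK_CUR)

        c = pread_fallback(fd, 4, 10)
        return a, b, c, pos_after_seek, pos_after_pread
    finally:
        os.close(fd)
        os.unlink(path)
```

All three helpers return the same bytes. The difference is that `read_at_seek` costs two system calls per read and moves the shared position, while `read_at_pread` costs one and leaves the position alone, which is where the "50% fewer" figure comes from.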