Re: [HACKERS] Re: Faster CREATE DATABASE by delaying fsync (was 8.4.1 ubuntu karmic slow createdb)

On Tuesday 02 February 2010 18:36:12 Robert Haas wrote:
> On Fri, Jan 29, 2010 at 1:56 PM, Greg Stark <gsstark@xxxxxxx> wrote:
> > On Tue, Jan 19, 2010 at 3:25 PM, Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
> >> That function *seriously* needs documentation, in particular the fact
> >> that it's a no-op on machines without the right kernel call.  The name
> >> you've chosen is very bad for those semantics.  I'd pick something
> >> else myself.  Maybe "pg_start_data_flush" or something like that?
> > 
> > I would like to make one token argument in favour of the name I
> > picked. If it doesn't convince I'll change it since we can always
> > revisit the API down the road.
> > 
> > I envision having two function calls, pg_fsync_start() and
> > pg_fsync_finish(). The latter will wait until the data synced in the
> > first call is actually synced. The fall-back if there's no
> > implementation of this would be for fsync_start() to be a noop (or
> > something unreliable like posix_fadvise) and fsync_finish() to just be
> > a regular fsync.
> > 
> > I think we can accomplish this with sync_file_range() but I need to
> > read up on how it actually works a bit more. In this case it doesn't
> > make a difference since when we call fsync_finish() it's going to be
> > for the entire file and nothing else will have been writing to these
> > files. But for wal writing and checkpointing it might have very
> > different performance characteristics.
> > 
> > The big objection to this is that then we don't really have an api for
> > FADV_DONT_NEED which is more about cache policy than about syncing to
> > disk. So for example a sequential scan might want to indicate that it
> > isn't planning on reading the buffers it's churning through but
> > doesn't want to force them to be written sooner than otherwise and is
> > never going to call fsync_finish().
> 
> I took a look at this patch today and I agree with Tom that
> pg_fsync_start() is a very confusing name.  I don't know what the
> right name is, but this doesn't fsync so I don't think it should have
> fsync in the name.  Maybe something like pg_advise_abandon() or
> pg_abandon_cache().  The current name is really wishful thinking:
> you're hoping that it will make the kernel start the fsync, but it
> might not.  I think pg_start_data_flush() is similarly optimistic.

What about pg_fsync_prepare()? That says why we're doing it without
promising that an fsync actually happens.
I really dislike having "cache" in the name, because the primary aim is not to
discard the cache...
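
To make the semantics under discussion concrete, here is a minimal sketch of
the two-call pattern. The names pg_fsync_prepare() and pg_fsync_finish() are
the ones floated in this thread; the bodies are an assumption for
illustration, not the patch under review. The hint uses posix_fadvise()
where it is available and is a no-op elsewhere; sync_file_range() on Linux
would be another candidate.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical sketch: ask the kernel to start writing back the dirty
 * pages of fd.  This is advisory only.  It does not wait, it guarantees
 * nothing about durability, and it compiles to a no-op on platforms
 * without posix_fadvise().
 */
void
pg_fsync_prepare(int fd)
{
#if defined(POSIX_FADV_DONTNEED)
    /* offset 0, len 0 covers the whole file; any error is ignored */
    (void) posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
#else
    (void) fd;                  /* no usable hint on this platform */
#endif
}

/*
 * Hypothetical sketch: the only call that provides a durability
 * guarantee.  It is a plain fsync() whether or not the hint above did
 * anything.  Returns 0 on success, -1 on error with errno set.
 */
int
pg_fsync_finish(int fd)
{
    return fsync(fd);
}
```

The intended use would be to call pg_fsync_prepare() on each file right
after writing it, and pg_fsync_finish() on all of them in a second pass, so
the kernel has a chance to overlap the writeback.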

Andres
