Re: Postgres as In-Memory Database?

Edson Richter <edsonrichter@xxxxxxxxxxx> · Wed, 20 Nov 2013 01:41:48 -0200



    On 20/11/2013 01:30, Jeff Janes
      wrote:

    
      On Tuesday, November 19, 2013, Edson Richter wrote:

      
          On 19/11/2013 22:29, Jeff Janes wrote:

          
                On Sun, Nov 17, 2013 at 4:46
                  PM, Edson Richter <edsonrichter@xxxxxxxxxxx>
                  wrote:

                   
                      Yes, those are the optimizations I was talking
                        about: have the database server store the
                        transaction log on high-speed solid-state disks
                        and consider the transaction done, while a
                        background thread updates the data on slower
                        disks...

                      
                      There is no reason to wait for fsync on slow disks
                      to guarantee consistency... If the database server
                      crashes, it just needs to "redo" the logged
                      transactions from the fast disk into the slower
                      data storage, and the database server is ready to
                      go (I think this has been the Sybase/MS SQL
                      strategy for years).

                    
                  Using a nonvolatile write cache for pg_xlog is
                    certainly possible and often done with PostgreSQL.
                     It is not important that the nonvolatile write
                    cache is fronting for SSD; fronting for HDD is fine,
                    as the write cache turns the xlog into pure
                    sequential writes and HDD should not have a problem
                    keeping up.
                  

                  Cheers,
                  

                  Jeff
                
              
          Hmm... I agree about the technology (SSD vs. HDD, etc.), but
          maybe I misunderstood: I have read that to keep data always
          safe I must use fsync, and as a result every transaction must
          wait for its data to be written to disk before returning
          success.

        
      A transaction must wait for the *xlog* to be fsynced to "disk",
        but non-volatile write cache counts as disk.  It does not need
        to wait for the ordinary data files to be fsynced.  Checkpoints
        do need to wait for the ordinary data files to be fsynced, but
        the checkpoint process is a background process and it can wait
        for that without impeding user processes.
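
The commit path described above can be sketched as a toy model in Python. This is only an illustration under stated assumptions, not PostgreSQL's implementation: the class `ToyDB`, its methods, and the `xlog`/`data` file names are hypothetical, and keys and values are plain strings without "=" or newlines.

```python
import os
import tempfile


class ToyDB:
    """Toy write-ahead-log model (hypothetical names, not PostgreSQL internals)."""

    def __init__(self, directory):
        self.wal_path = os.path.join(directory, "xlog")
        self.data_path = os.path.join(directory, "data")
        # On startup, rebuild state: checkpointed data first, then redo the log.
        self.buffers = self._recover()
        self.wal = open(self.wal_path, "a")

    def _recover(self):
        pages = {}
        for path in (self.data_path, self.wal_path):
            if os.path.exists(path):
                with open(path) as f:
                    for line in f:
                        key, value = line.rstrip("\n").split("=", 1)
                        pages[key] = value
        return pages

    def commit(self, key, value):
        # The committing process waits only for the sequential log write.
        self.wal.write(f"{key}={value}\n")
        self.wal.flush()
        os.fsync(self.wal.fileno())
        # The "data page" is changed in memory only; no data-file fsync here.
        self.buffers[key] = str(value)

    def checkpoint(self):
        # Background work: persist the data file, then the old log is not needed.
        tmp = self.data_path + ".tmp"
        with open(tmp, "w") as f:
            for key, value in self.buffers.items():
                f.write(f"{key}={value}\n")
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self.data_path)
        self.wal.close()
        self.wal = open(self.wal_path, "w")  # reopening in "w" mode empties the log


if __name__ == "__main__":
    directory = tempfile.mkdtemp()
    db = ToyDB(directory)
    db.commit("a", "1")            # durable once the log is fsynced
    restarted = ToyDB(directory)   # simulated crash before any checkpoint
    print(restarted.buffers)
```

In this sketch, `commit` returns as soon as the log is fsynced, and a restart before any checkpoint still sees the committed row because `_recover` replays the log.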
      

      If the checkpointer falls far enough behind, then things do
        start to fall apart, but I think that this is true of any
        system. So you can't just get a BBU for the xlog and ignore
        all other IO entirely--eventually the other data does need to
        reach disk, and if it gets dirtied faster than it gets cleaned
        for a prolonged period then things will freeze up.
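
The "dirtied faster than cleaned" condition is simple arithmetic, and a minimal Python sketch makes it concrete. The function `ticks_until_stall` and all the numbers are hypothetical; it does not model PostgreSQL's actual checkpointer or background writer, only a fixed pool of buffers with constant dirtying and cleaning rates.

```python
def ticks_until_stall(pool_pages, dirtied_per_tick, cleaned_per_tick,
                      max_ticks=10_000):
    """Return the tick at which every buffer is dirty, or None if that
    does not happen within max_ticks (toy model with constant rates)."""
    dirty = 0
    for tick in range(1, max_ticks + 1):
        # Net change per tick; the dirty count cannot go below zero.
        dirty = max(0, dirty + dirtied_per_tick - cleaned_per_tick)
        if dirty >= pool_pages:
            return tick
    return None


if __name__ == "__main__":
    # Cleaning keeps up: the pool never fills.
    print(ticks_until_stall(1000, 80, 100))
    # Dirtying outpaces cleaning by 20 pages per tick.
    print(ticks_until_stall(1000, 120, 100))
```

With cleaning ahead of dirtying, the backlog never builds; with a sustained net surplus of dirty pages, the pool fills after a finite time no matter how fast the xlog device is.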
      

         With the approach
          I've described you still have fsync (and the data is 100%
          safe), but a transaction is considered successful once it is
          written to the transaction log, which is purely sequential
          (and even uses pre-allocated space, with no need to ask the OS
          for new files or new space), and there is no need to wait for
          the slow operations that write data to the data pages.

          
          Am I wrong?

        
      No user-facing process needs to wait for the data pages to
        fsync, unless things have really gotten fouled up.
      

      Cheers,
      

      Jeff
    
    OK, I still have one question (I'm learning a lot, thanks!):

    
    What happens, then, if data has been committed (so it is in the
    xlog) but is not in the data pages yet, and it no longer fits in the
    memory buffers: how would PostgreSQL query that data without having
    to wait for a checkpoint to happen and the data to become available
    in the data pages?

    
    Regards,

    
    Edson