
Re: Cluster seems broken after pg_basebackup


 



On 02/09/2015 08:34 AM, Guillaume Drolet wrote:

CCing list so the information stays in the thread.


2015-02-06 18:44 GMT-05:00 Adrian Klaver <adrian.klaver@xxxxxxxxxxx>:

    On 02/06/2015 09:17 AM, Guillaume Drolet wrote:

        Dear Adrian,

        Thanks for helping me. Sorry for the lack of details. I had told
        myself not to forget to include them, but I hit the send button
        too fast. You know how it is...

        I added more info in your reply below.


             First some questions:

             1) What Postgres version?


        9.3


             2) What OS?


        Windows 7


             3) Where were you backing up from and to?


        Backing up from my only cluster (PGDATA) on disk E, to a backup
        directory on another disk (F:) using this command:

        pg_basebackup -D "F:\\db_base_backup" -Fp -Xs  -R -P
        --label="basebackup20150205" --username=postgres

        What's weird is that I did some successful tests last week on the
        same system (backing up, archiving, recovering) using the same
        procedure. The only difference was the cluster, which was much
        smaller for testing purposes but located in the same place (i.e.
        E:\data), with PostgreSQL installed in C:\Programs\...


             4) Which cluster does not start, the master or the child you
             created with pg_basebackup?



        The master. I haven't tried the child yet. But I saw that the
        message about role "208375PT$" is in logs from before the backup
        too.


        This is the local domain of my machine. I log onto my machine with a
        local admin account, using the domain name 208375PT (I didn't set
        up this part of my machine; the IT guys here at work did). The
        thing is, I don't understand why it shows up in the log file.


    Not sure.

    What are you using for an authentication method for database login?


 At this moment, for my tests I use md5 for user 'postgres' and trust for
 user 'all'.





                 And after that, I went back to the log file and there's new
                 information added:

                 2015-02-06 07:51:05 EST LOG:  server process (PID 184) was
                 terminated by exception 0x80000004
                 2015-02-06 07:51:05 EST DETAIL:  Failed process was running:
                 SELECT version();
                 2015-02-06 07:51:05 EST HINT:  See C include file
                 "ntstatus.h" for a description of the hexadecimal value.


             Well according to here:

             https://msdn.microsoft.com/en-us/library/cc704588.aspx

             0x80000004
             STATUS_SINGLE_STEP


             {EXCEPTION} Single Step A single step or trace operation has
             just been completed.

             A developer is going to have to explain what that means.
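
For readers puzzling over the hexadecimal value: an NTSTATUS code is a 32-bit value whose top two bits give the severity, followed by a customer bit, a reserved bit, a 12-bit facility and a 16-bit code. The Python sketch below is not part of the original exchange; the function names `decode_ntstatus` and `exception_codes` and the small lookup table are illustrative assumptions. It pulls exception values out of log text like the lines quoted above and splits them into those fields. It only knows the name of the one status discussed in this thread.

```python
import re

# Only the status seen in this thread is listed; anything else reports "unknown".
KNOWN_STATUS = {0x80000004: "STATUS_SINGLE_STEP"}

# The two high bits of an NTSTATUS value encode its severity.
SEVERITY = {0: "success", 1: "informational", 2: "warning", 3: "error"}


def decode_ntstatus(value):
    """Split a 32-bit NTSTATUS value into its documented bit fields."""
    return {
        "severity": SEVERITY[(value >> 30) & 0x3],   # bits 31-30
        "customer": bool((value >> 29) & 0x1),       # bit 29
        "facility": (value >> 16) & 0xFFF,           # bits 27-16
        "code": value & 0xFFFF,                      # bits 15-0
        "name": KNOWN_STATUS.get(value, "unknown"),
    }


def exception_codes(log_text):
    """Return every 'exception 0x........' value found in the log text."""
    return [int(m, 16) for m in re.findall(r"exception (0x[0-9A-Fa-f]{8})", log_text)]


if __name__ == "__main__":
    line = ("2015-02-06 07:51:05 EST LOG:  processus serveur (PID 184) "
            "a été arrêté par l'exception 0x80000004")
    for code in exception_codes(line):
        print(hex(code), decode_ntstatus(code))
```

Decoded this way, 0x80000004 has severity "warning", facility 0 and code 4, which matches the STATUS_SINGLE_STEP entry in the table linked above. That says what the value is, not why a backend raised it.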




             My suspicion is you copied at least partly over a running
             server.


        How would that be possible? Using the pg_basebackup command I wrote
        above, it is clear that I wrote the backup on disk F and not E.


    I was just speculating, I would not put too much stock in it.



        While writing this post, I started my backup using:

        pg_ctl start -D "F:\db_basebackup"

        Similar stuff happened with pgAdmin and the log (the message about
        the symbolic link is related to my post from yesterday; I don't
        know whether it could be involved in the current problem):

        2015-02-06 12:13:58 EST LOG:  database system was interrupted; last
        known up at 2015-02-05 14:30:34 EST
        2015-02-06 12:13:58 EST LOG:  creating missing WAL directory
        "pg_xlog/archive_status"
        2015-02-06 12:13:58 EST LOG:  redo starts at 24B/28000090
        2015-02-06 12:13:58 EST LOG:  could not remove symbolic link
        "pg_tblspc/940585": No such file or directory
        2015-02-06 12:13:58 EST CONTEXT:  xlog redo drop tablespace: 940585
        2015-02-06 12:13:58 EST LOG:  consistent recovery state reached at
        24B/290000B8
        2015-02-06 12:13:58 EST LOG:  redo done at 24B/290000B8
        2015-02-06 12:13:58 EST LOG:  last completed transaction was at log
        time 2015-02-05 09:06:04.892-05
        2015-02-06 12:13:59 EST LOG:  database system is ready to accept
        connections
        2015-02-06 12:13:59 EST LOG:  autovacuum launcher started
        2015-02-06 12:14:42 EST LOG:  server process (PID 1784) was
        terminated by exception 0x80000004
        2015-02-06 12:14:42 EST DETAIL:  Failed process was running:
        SELECT version();
        2015-02-06 12:14:42 EST HINT:  See C include file "ntstatus.h" for
        a description of the hexadecimal value.
        2015-02-06 12:14:42 EST LOG:  terminating any other active server
        processes
        2015-02-06 12:14:42 EST WARNING:  terminating connection because of
        crash of another server process
        2015-02-06 12:14:42 EST DETAIL:  The postmaster has commanded this
        server process to roll back the current transaction and exit,
        because another server process exited abnormally and possibly
        corrupted shared memory.
        2015-02-06 12:14:42 EST HINT:  In a moment you should be able to
        reconnect to the database and repeat your command.
        2015-02-06 12:14:42 EST LOG:  all server processes terminated;
        reinitializing


        Any ideas where to go from here?


    In both cases the database got to the point below, which would seem
    to indicate everything was alright.

    2015-02-06 07:11:38 EST LOG: redo is not required
    2015-02-06 07:11:38 EST LOG: database system is ready to accept
    connections

    Also from what I can see the server crashed at this point:

    2015-02-06 12:13:59 EST LOG: autovacuum launcher started
    2015-02-06 12:14:42 EST LOG: server process (PID 1784) was terminated
    by exception 0x80000004


    Now 0x80000004 is supposed to mean:

    STATUS_SINGLE_STEP


    {EXCEPTION} Single Step A single step or trace operation has just
    been completed.

    Some digging indicates this is the result of a debugger command. I
    have no idea how that would be invoked in Postgres running production
    code. This leads to my default question when I see unexplained
    behavior on a Windows machine: do you have anti-virus software
    running against the drives?



 Yes, I do, and I'm not allowed to turn it off (I don't have such
 privileges). But the same anti-virus software is running on my other
 machine (same setup) and I've never had such problems there. Even on
 this machine that's giving me problems, I spent the last two weeks
 running tests with point-in-time recovery and everything went fine.




        Thanks a lot again.


                 Thanks a lot for helping! Guillaume



             --
             Adrian Klaver
             adrian.klaver@xxxxxxxxxxx




    --
    Adrian Klaver
    adrian.klaver@xxxxxxxxxxx




--
Adrian Klaver
adrian.klaver@xxxxxxxxxxx


--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general



