Re: pglogical vs. built-in logical replication in pg-10

Achilleas Mantzios <achill@xxxxxxxxxxxxxxxxxxxxx> · Thu, 22 Jun 2017 18:10:40 +0300



    On 22/06/2017 17:46, Andreas Joseph
      Krogh wrote:

    
      På torsdag 22. juni 2017 kl. 15:25:20, skrev Achilleas
        Mantzios <achill@xxxxxxxxxxxxxxxxxxxxx>:
      
        On 22/06/2017 13:38, Andreas Joseph
          Krogh wrote:
        
          På torsdag 22. juni 2017 kl. 11:43:02, skrev Achilleas
            Mantzios <achill@xxxxxxxxxxxxxxxxxxxxx>:
          
            On 22/06/2017 11:21, Andreas
              Joseph Krogh wrote:
            
              Hi.
               
              1. Why should one prefer built-in logical replication
                in pg-10 to pglogical, does it do anything pglogical
                doesn't?
              It seems pglogical is more feature-rich...
              2. As I understand built-in logical replication in
                pg-10 doesn't support large-objects, which we use a lot.
                Does pglogical replicate large objects? I cannot find
                any notes about large-objects under "Limitations and
                Restrictions": https://www.2ndquadrant.com/en/resources/pglogical/pglogical-docs/
            
            You may do a simple test, create a table with a largeobject
            and try to read the logical stream, if it cannot represent
            the lo_import, lo_open, lowrite, lo_close (and I 'd bet they
            can't be encoded) then neither pglogical (being based on the
            same logical decoding technology) will support them

             
          The point of email-lists like this is that one may share
            knowledge so one doesn't have to test everything one self,
            and can build on knowledge from others. I'm looking for an
            answer from someone who's not betting, but knows.
        
        I gave you enough knowledge already. Here's some more :

        - go and install 10

        - create a table containing one col with type oid (large object)
        and one bytea

        - follow the simple setup here : https://www.postgresql.org/docs/10/static/logicaldecoding-example.html

        - insert a row

        - Do again : SELECT * FROM
        pg_logical_slot_get_changes('regression_slot', NULL, NULL);

        
        Do you see any of your oid image data in the output? Do you see
        any of the bytea ? (the answer here in 9.5 is  "no"/"yes").

        If in 10.0 is still the case, then you should think about moving
        to bytea.
       
      Hm, it turns out it's not quite that simple...
       
      Test-case:
       
      create table drus(id bigint primary key, lo
            oid, data bytea);
      SELECT * FROM
            pg_create_logical_replication_slot('my_slot',
            'test_decoding');

        INSERT INTO drus (id, lo, data) values(1,
            lo_import('/tmp/faktura_200007.pdf'), decode('AAAEEE',
            'hex'));
      

        select * from drus; 

          ┌────┬─────────┬──────────┐
          

          │ id │   lo    │   data   │

          ├────┼─────────┼──────────┤

          │  1 │ 2873269 │ \xaaaeee │

          └────┴─────────┴──────────┘

        
        SELECT * FROM pg_logical_slot_get_changes('my_slot',
            NULL, NULL); 

┌────────────┬──────┬────────────────────────────────────────────────────────────────────────────────┐

          │    lsn     │ xid  │
                                               data
                                               │

├────────────┼──────┼────────────────────────────────────────────────────────────────────────────────┤

          │ B/E585B858 │ 9391 │ BEGIN 9391
                                                                              │

          │ B/E586BE78 │ 9391 │ table public.drus: INSERT: id[bigint]:1
          lo[oid]:2873269 data[bytea]:'\xaaaeee' │

          │ B/E586BF80 │ 9391 │ COMMIT 9391
                                                                             │

└────────────┴──────┴────────────────────────────────────────────────────────────────────────────────┘

          (3 rows)

        
      So far so good, the oid-value (2873269) is apparently in the change-set,
        but...
       
    
    If the data itself of the LO are not there then this is not so good.

    
      Set up publication:
      CREATE PUBLICATION bolle FOR ALL TABLES; 

          CREATE PUBLICATION

        
        === ON REPLICA ===
       
      # create table on replica:
      create
          table drus(id bigint primary key, lo oid, data bytea);
       
      # create subscription:
      CREATE SUBSCRIPTION mysub CONNECTION
            'host=localhost port=5433 user=andreak dbname=fisk'
            PUBLICATION bolle;

        NOTICE:  created replication slot "mysub" on
            publisher 

          CREATE SUBSCRIPTION

          2017-06-22 16:38:34.740
            CEST [18718] LOG:  logical replication apply worker for
            subscription "mysub" has started 

          2017-06-22 16:38:34.747 CEST [18720] LOG:  logical replication
          table synchronization worker for subscription "mysub", table
          "drus" has started

          2017-06-22 16:38:35.746 CEST [18720] LOG:  logical replication
          table synchronization worker for subscription "mysub", table
          "drus" has finished

        
        Looks good:
      select * from drus; 

          ┌────┬─────────┬──────────┐

          │ id │   lo    │   data   │

          ├────┼─────────┼──────────┤

          │  1 │ 2873269 │ \xaaaeee │

          └────┴─────────┴──────────┘

          (1 row)

        
      ...until :
       
      SELECT lo_export(drus.lo, '/tmp/faktura.pdf')
            from drus where id = 1; 

          2017-06-22 16:40:04.967 CEST [18657] ERROR:  large object
          2873269 does not exist

          2017-06-22 16:40:04.967 CEST [18657] STATEMENT:  SELECT
          lo_export(drus.lo, '/tmp/faktura.pdf') from drus where id = 1;

          ERROR:  large object 2873269 does not exist

        
        So, the OID-value is replicated but pg_largeobject is empty:
       
      select * from pg_largeobject; 

          ┌──────┬────────┬──────┐

          │ loid │ pageno │ data │

          ├──────┼────────┼──────┤

          └──────┴────────┴──────┘

          (0 rows)

        
      Once again having pg_largeobject as a system-catalog prevents
        LOs from working smoothly. Neither replication nor having LOs on
        a different tablespace (by moving pg_largeobject) works.
       
    
    I think logical decoding was designed for supporting DML SQL
    commands (i.e. a finite set of commands) and not specific functions
    (lo_*) which by nature can be arbitrary, infinite and version
    specific. 

    
      I wish PG in some future version will address these quirks so
        one can operate on LOs more smoothly.
    
    What's so better in LO's VS bytea? You do a lot updates on the
    binary data, changing only parts of it?

    
        --
          Andreas
                Joseph Krogh
          CTO
              / Partner - Visena AS
          Mobile:
              +47 909 56 963
          andreas@xxxxxxxxxx
          www.visena.com
          
        
    -- 
Achilleas Mantzios
IT DEV Lead
IT DEPT
Dynacom Tankers Mgmt