Re: Maintenance question / DB size anomaly...

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Okay,


 select * from _my_cluster.sl_log_1 where ctid = '(1,1)';
 select * from _my_cluster.sl_log_1 where ctid = '(1,2)';
 select * from _my_cluster.sl_log_1 where ctid = '(1,3)';
 select * from _my_cluster.sl_log_1 where ctid = '(1,4)';

all returns zero rows.  When I do a dump of that file, I get:

Block    1 ********************************************************
<Header> -----
Block Offset: 0x00002000         Offsets: Lower     408 (0x0198)
Block: Size 8192  Version    2            Upper    7680 (0x1e00)
LSN:  logid    955 recoff 0x0daed68c      Special  8192 (0x2000)
Items:   97                   Free Space: 7272
Length (including item array): 412

<Data> ------
Item   1 -- Length:  121  Offset: 8068 (0x1f84)  Flags: USED
  XMIN: 1491480520  CMIN: 1  XMAX: 0  CMAX|XVAC: 0
  Block Id: 1  linp Index: 1   Attributes: 6   Size: 32
  infomask: 0x0912 (HASVARWIDTH|HASOID|XMIN_COMMITTED|XMAX_INVALID)

...the fact that they weren't in the table, but in the file (I did the filedump first, then the query), then redid the filedump, the results are the same, the rows are still in the file. I have no idea how frequently these files are getting written to, I assume frequently. I also looked at the last block listed in the file, 6445, and also looked for items 1-4, and also did not find them in the table using a similar select as above. That seems kinda strange, since there's right this second 11,000 items in that table, but I'll roll with it for awhile.

Intrigued, I wanted to see what a filedump looked like of a row that WAS in the table:

ctid | log_origin | log_xid | log_tableid | log_actionseq | log_cmdtype | (7,1) | 10 | 1491481037 | 8 | 473490934 | I | (memberid,answerid,taskinstanceid) values ('144854','148707','0')



Block    7 ********************************************************
<Header> -----
Block Offset: 0x0000e000         Offsets: Lower     424 (0x01a8)
Block: Size 8192  Version    2            Upper     508 (0x01fc)
LSN:  logid    955 recoff 0x0dc4bcc0      Special  8192 (0x2000)
Items:  101                   Free Space:   84
Length (including item array): 428

<Data> ------
Item   1 -- Length:  129  Offset: 8060 (0x1f7c)  Flags: USED
  XMIN: 1491481037  CMIN: 7  XMAX: 0  CMAX|XVAC: 0
  Block Id: 7  linp Index: 1   Attributes: 6   Size: 32
  infomask: 0x0912 (HASVARWIDTH|HASOID|XMIN_COMMITTED|XMAX_INVALID)


...the NextID was (taken about 5 mins after the previous filedump):
 Latest checkpoint's NextXID:          1491498183


I don't see any real differences in the file entry for a row that is in the table, and one that I don't see in the table. I hope I'm getting this right, its totally fascinating seeing how
all this works.

About your other questions:

1. I have pg_clog segments all the way back to the start of the database, all the way back to March 14th, 2006 (most likely when the database was first brought up on this machine). The numeric names start at 0000 and go to 058E. I checked the recent (within last 8 days) and saw no errors containing the word 'checkpoint'. In fact, very few errors at all.
The dang thing seems to be running pretty well, just a little slow.

mydb=# select datname, age(datfrozenxid) from pg_database;
  datname  |    age
-----------+------------
template1 | 1491520270
template0 | 1491520270
postgres  | 1491520270
mydb      | 1076194904


Oooooooo..... thats not good, is it? Thanks for taking an interest, Tom. I'm most likely going to promote one of my subscribers to be master, then nuke this database, but I have no problems keeping it around if you think I may have found some obscure bug that could help someone debug. Again, this DB gets vacuumed every day, and in the beginning, I think I remember doing a vacuum full every
day.

Thanks,

/kurt


On Jun 20, 2007, at 5:08 PM, Tom Lane wrote:

Kurt Overberg <kurt@xxxxxxxxxxxxxxxxx> writes:
Drat!  I'm wrong again.  I thought for sure there wouldn't be a
wraparound problem.

Well, I'm not sure what it is now.  You showed some invisible tuples
with XMINs of
   XMIN: 1489323584  CMIN: 1  XMAX: 0  CMAX|XVAC: 0
   XMIN: 1489323590  CMIN: 2  XMAX: 0  CMAX|XVAC: 0
   XMIN: 1489323592  CMIN: 1  XMAX: 0  CMAX|XVAC: 0
but the nextXID is
         1490547335
which is not that far ahead of those --- about 1.2 million transactions,
or less than a day's time according to the clog timestamps, which
suggest that you're burning several million XIDs a day. Perhaps you've
wrapped past them since your earlier check --- if you try the same
"select where ctid = " queries now, do they show rows?

The other thing that's strange here is that an 8.0 installation should
be pretty aggressive about recycling pg_clog segments, and yet you've
got a bunch there.  How far back do the files in pg_clog go --- what's
the numeric range of the filenames, and the date range of their mod
times? Have you checked the postmaster log to see if you're getting any
complaints about checkpoint failures or anything like that?  It would
also be useful to look at the output of
select datname, age(datfrozenxid) from pg_database;

			regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

               http://www.postgresql.org/docs/faq



[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux