Search Postgresql Archives

Re: Proper use of pg_xlog_location_diff()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





On 16/01/15 14:37, Jim Nasby wrote:
On 1/15/15 7:12 PM, Fabio Ugo Venchiarutti wrote:
Greetings


Our company is writing a small ad-hoc implementation of a load
balancer for Postgres (`version()` = PostgreSQL 9.2.9 on
x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red
Hat 4.4.7-4), 64-bit).

We're using both streaming and WAL shipping based replication.


Most mainstream solutions seem to implement load balancing with plain
round robin over a connection pool. Given that our cloud nodes are
diversely capable and subject to noisy neighborhood conditions, we
need to factor in instantaneous load profiles (We achieved this by
exporting some /sys and /proc paths through custom views and
everything works as expected).


We're now adding functionality to temporarily blacklist hot standby
clusters based on their WAL records lag and pg_xlog_location_diff()
seems to be the key tool for this, but we're perhaps misusing it.


The current draft implementation uses the following queries and
compares the output to determine how many bytes a given slave is lagging.
Is there any shortcoming to such approach?


--------------------------------
-- ON MASTER:
--------------------------------
SELECT
     pg_xlog_location_diff(pg_current_xlog_location(), '000/00000000')
;
--------------------------------

That's very nonsensical; it will always return the same thing as
pg_current_xlog_location.

--------------------------------
-- ON STANDBY:
--------------------------------
SELECT
     pg_xlog_location_diff(
         COALESCE(
             pg_last_xlog_receive_location(),

Note that that is the xlog location that has been *sync'd to disk*. That
could potentially lag significantly behind the master's LSN. I think
your safest bet would be getting pg_current_xlog_location from the
master and subtracting pg_last_xlog_replay_location() from it (but note
you could get a negative result).

BTW,
http://www.postgresql.org/docs/devel/static/warm-standby.html#STREAMING-REPLICATION
says to use pg_last_xlog_receive_location() instead of
pg_last_xlog_replay_location() because it tells you what's committed to
disk on a standby vs what's visible. But for what you're doing I think
you want pg_last_xlog_replay_location().

Also, I don't think you should coalesce. If you get a NULL for any of
this then something's almost certainly wrong (like a server is
misconfigured). If you were going to coalesce I'd say you should
coalesce to 2^63-1.


Thank you

>> --------------------------------
>> -- ON MASTER:
>> --------------------------------
>> SELECT
>>      pg_xlog_location_diff(pg_current_xlog_location(), '000/00000000')
>> ;
>> --------------------------------
>
> That's very nonsensical; it will always return the same thing as
> pg_current_xlog_location.

I know it seems odd but I'm just using the query itself to keep everything in decimals as my safety threshold is expressed in bytes.

The question this query asks is "how many WAL bytes does this cluster know about since the hypothetical record 000/00000000"?

Then I do the math.

The actual full form is the same for both master and standbys (already changed based on your input):

--------------------------------
SELECT
	pg_is_in_recovery() AS in_recovery,
	pg_xlog_location_diff(
		(CASE
			WHEN (pg_is_in_recovery()) THEN
				pg_last_xlog_replay_location()
			ELSE
				pg_current_xlog_location()
		END),
		'000/00000000'
	) AS total_wal_offset
--------------------------------


> says to use pg_last_xlog_receive_location() instead of
> pg_last_xlog_replay_location() because it tells you what's committed to
> disk on a standby vs what's visible. But for what you're doing I think
> you want pg_last_xlog_replay_location().

You basically framed what my question boils down to.

I'm trying to get my head around http://www.postgresql.org/docs/9.2/static/functions-admin.html#FUNCTIONS-RECOVERY-INFO-TABLE

Does it mean that pg_last_xlog_receive_location() returns the last WAL record that has been successfully "staged for replay" by the stream replication whereas pg_last_xlog_replay_location() returns the last successful WAL replay regardless of it coming from streaming or archive shipping? As in 2 different stages the first of which is basically irrelevant to record visibility?

Many thanks


Fabio



--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general



[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux