On Thu, Feb 28, 2019 at 6:24 AM Chris Travers <chris.travers@xxxxxxxxx> wrote:
>
> On Thu, Feb 28, 2019 at 1:09 PM Pavel Stehule <pavel.stehule@xxxxxxxxx> wrote:
>>
>> Hi
>>
>> On Thu, Feb 28, 2019 at 12:47 PM Thomas Güttler <guettliml@xxxxxxxxxxxxxxxxxx> wrote:
>>>
>>> Hi experts,
>>>
>>> where would you suggest someone to **not** use PostgreSQL?
>
> Hard question. There are a lot of general places where PostgreSQL is not by itself the best fit, and where a careful weighing of pros and cons would need to be made before deciding to use it.
>
> Having used PostgreSQL in place of ElasticSearch for PB-scale deployments, as a high-throughput queue system, and for near-real-time OLAP workloads, I am fairly aware of how hard it can be pushed.
>
> So the answers here are not "don't use PostgreSQL here" but "think about it first and consider alternatives."
>
>>> Why would you do this?
>
> Replacing with "What would you consider to be the tradeoffs?"
>
>>> What alternative would you suggest instead?
>
> So a few possibilities:
>
> 1. a) TB-scale full text search systems.
>    b) PostgreSQL's full text search is quite capable, but not so powerful that it can completely replace Lucene-based systems, so you have to weigh complexity against functionality if you are tying it in with other data that is already in PostgreSQL. Note further that my experience with at least ElasticSearch is that it is easier to scale something built on multiple PostgreSQL instances into the PB range than it is to scale ElasticSearch into the PB range.
>    c) Solr or ElasticSearch

In my company we had to swap out Solr for Postgres. The main challenge was that Solr's limited query language could not deal with the complex authorization use cases that came in after the original project was deployed. Our only solution was to heavily denormalize the documents, which meant that when simple changes happened on the OLTP side we had to push large amounts of data into Solr. In addition to being slow, Solr (or, I guess, the underlying Lucene) started to develop weird failure modes: unpredictable swings in memory and/or disk usage, and exhaustion of underlying system resources (especially file descriptors). The whole thing felt unstable; we had tested heavily against the original requirements, but the tech did not evolve with the project.

The solution was to junk the whole thing and replace it with an API-compatible version of Solr implemented inside the database. To index a document we build a special search string with upper-case keys and lower-case values in a tab-delimited text string; pg_trgm/GIN does the rest of the lifting (a rough sketch of the idea is below). It can't compete with Solr on best-case behavior, but it gives much better worst-case behavior, and, since we don't have to denormalize, the system fits within memory, making scaling a snap.

The moral of the story here is "just use Postgres". This is not zealotry; if I were a Microsoft-inclined person, I might be advising use of SQL Server. If you are not totally and completely aware of the limits of the system, you are probably operating within them. The database is evolving rapidly, and insanely powerful servers (supercomputers, even, from the perspective of ten years ago) can be had on the cheap with a mouse click.

There are of course a few use cases where Postgres is not the optimal tech: highly unstructured data, super-high-transaction-rate master-master loss-tolerant data archiving, warm log storage, etc. These problems show up quite rarely in the world of data, which is generally directed towards the systems support of business applications.
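For concreteness, here is a minimal sketch of the pg_trgm/GIN arrangement described above. The table, columns, and keys (docs, payload, search, TITLE/AUTHOR) are made up for illustration; the real search-string builder and query shapes are more involved:

CREATE EXTENSION IF NOT EXISTS pg_trgm;

CREATE TABLE docs (
  id      bigserial PRIMARY KEY,
  payload jsonb,
  search  text   -- upper-case keys, lower-case values, tab-delimited
);

-- build the search string, e.g. 'TITLE<tab>postgres at scale<tab>AUTHOR<tab>merlin'
UPDATE docs
   SET search = concat_ws(E'\t',
                          'TITLE',  lower(payload->>'title'),
                          'AUTHOR', lower(payload->>'author'));

-- trigram index; makes LIKE/ILIKE over the search string cheap
CREATE INDEX docs_search_trgm_idx ON docs USING gin (search gin_trgm_ops);

-- find documents whose AUTHOR field contains 'merlin'
SELECT id
  FROM docs
 WHERE search LIKE E'%AUTHOR\tmerlin%';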
If you haven't mastered the database first, you probably shouldn't be building out complex systems on non-database technology, since you don't know what you don't know (for example, the immense value that transactions bring to the table).

merlin
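(A trivial, made-up illustration of that transactional point, using a hypothetical accounts table: either both updates below are applied or neither is, with no half-applied state ever visible to concurrent readers.)

BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
-- an error anywhere above aborts the whole transaction; nothing is half-applied
COMMIT;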