@Justin
@Merlin @ Jeff,
Thanks so much for your time and insights, they improved our understanding of the underpinnings of PostgreSQL and allowed us to deal the issues we were facing.
Using parallel query on our PG 9.6 improved a lot the query performance - it turns out that a lot of our real world queries could benefit of parallel query, we saw about 4x improvements after turning it on, and now we see much higher storage IOPS thanks to the multiple workers.
On our tests effective_io_concurrency did not show such a large effect as the link you sent, I'll have a new look at it, maybe we are doing something wrong or the fact that the SSDs are on the SAN and not local affects the results.
On the process we also learned that changing the default Linux I/O scheduler from CFQ to Deadline worked wonders for our Dell SC2020 SAN Storage setup, we used to see latency peaks of 6,000 milliseconds on busy periods (yes, 6 seconds), we now see 80 milliseconds, an almost 100 fold improvement.
Best regards,
Haroldo Kerry
On Wed, Jan 9, 2019 at 5:14 PM Merlin Moncure <mmoncure@xxxxxxxxx> wrote:
On Thu, Dec 27, 2018 at 7:29 PM Justin Pryzby <pryzby@xxxxxxxxxxxxx> wrote:
>
> On Thu, Dec 27, 2018 at 08:20:23PM -0500, Jeff Janes wrote:
> > Also, you would want to use the newest version of PostgreSQL, as 9.6
> > doesn't have parallel query, which is much more generally applicable than
> > effective_io_concurrency is.
effective_io_concurrency only applies to certain queries. When it
does apply it can work wonders. See:
https://www.postgresql.org/message-id/CAHyXU0yiVvfQAnR9cyH=HWh1WbLRsioe=mzRJTHwtr=2azsTdQ@xxxxxxxxxxxxxx
for an example of how it can benefit.
parallel query is not going to help single threaded pg_bench results.
you are going to be entirely latency bound (network from bebench to
postgres, then postgres to storage). On my dell crapbox I was getting
2200tps so you have some point of slowness relative to me, probably
not the disk itself.
Geetting faster performance is an age-old problem; you need to
aggregate specific requests into more general ones, move the
controlling logic into the database itself, or use various other
strategies. Lowering latency is a hardware problem and can force
trade-offs (like, don't use a SAN) and has specific boundaries that
are not easy to bust through.
merlin
Haroldo Kerry CTO/COO Rua do Rócio, 220, 7° andar, conjunto 72 São Paulo – SP / CEP 04552-000 |