Re: 600 million rows of data. Bad hardware or need partitioning?

On Tue, May 05, 2020 at 08:31:29PM -0400, Arya F wrote:
> On Mon, May 4, 2020 at 5:21 AM Justin Pryzby <pryzby@xxxxxxxxxxxxx> wrote:
> 
> > I mentioned in February and March that you should plan to set shared_buffers
> > to fit the indexes currently being updated.
> 
> The following command gives me
> 
> select pg_size_pretty(pg_indexes_size('test_table'));
>  pg_size_pretty
> ----------------
>  5216 MB
> 
> So right now the indexes on that table take about 5.2 GB. If a machine
> has 512 GB of RAM and SSDs, is it safe to assume that the update which
> currently takes 1.5 minutes could finish in less than 5 seconds with
> 600 million rows of data, without partitioning?

I am not prepared to guarantee server performance.

But, to my knowledge, you haven't configured shared_buffers at all, and I
think it might be the single most important setting for loading speed
(with indexes).

A couple of months ago, you said your server had 4GB RAM, which isn't much;
but if shared_buffers is ~100MB, I think that deserves attention first.

If you get good performance with a million rows and 32MB buffers, then you
could reasonably hope to get good performance (at least initially) with
100 million rows and 320MB buffers.  Scale that up to whatever you expect
your index size to be.  Be conservative, since you may need to add indexes
later, and you can expect them to become bloated over time, so you may want
to run a periodic reindex job (sketches below).
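
As an illustrative sketch (the 320MB figure is just the scaled example
above, not a recommendation), on 9.4 or later the setting could be
changed with ALTER SYSTEM:

    ALTER SYSTEM SET shared_buffers = '320MB';
    -- shared_buffers only takes effect after a server restart, e.g.:
    --   pg_ctl restart -D /path/to/datadir

And for the reindex job, v12 and later can rebuild an index without
blocking writes (the index name here is hypothetical):

    REINDEX INDEX CONCURRENTLY test_table_some_idx;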

shared_buffers is frequently set to ~25% of RAM, and if you need to efficiently
use indexes larger than what that supports, then you should add RAM, or
implement partitioning.
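
For completeness, here is a minimal declarative-partitioning sketch.  It
assumes v11 or later (so that an index created on the parent cascades to
the partitions), and the column layout is invented for illustration since
the real schema isn't shown in this thread:

    CREATE TABLE test_table (
        id      bigint NOT NULL,
        payload text
    ) PARTITION BY RANGE (id);

    CREATE TABLE test_table_p0 PARTITION OF test_table
        FOR VALUES FROM (0) TO (100000000);
    CREATE TABLE test_table_p1 PARTITION OF test_table
        FOR VALUES FROM (100000000) TO (200000000);

    -- Creating the index on the parent creates it on each partition,
    -- so each partition's index can stay small enough to fit in
    -- shared_buffers even as the total row count grows.
    CREATE INDEX ON test_table (id);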

-- 
Justin




