Search Postgresql Archives

Re: strange table disk sizes

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Sep 1, 2011 at 6:38 AM, Rik Bellens <rik.bellens@xxxxxxxxxxxxxx> wrote:
> Op 01-09-11 14:22, Scott Marlowe schreef:
>>
>> On Thu, Sep 1, 2011 at 6:05 AM, Rik Bellens<rik.bellens@xxxxxxxxxxxxxx>
>>  wrote:
>>>
>>> Op 01-09-11 13:31, Scott Marlowe schreef:
>>>>
>>>> On Thu, Sep 1, 2011 at 4:32 AM, Rik Bellens<rik.bellens@xxxxxxxxxxxxxx>
>>>>  wrote:
>>>>>
>>>>> Hello,
>>>>>
>>>>> I have two large tables in a database, one containing original data and
>>>>> the
>>>>> other one derived from the first table. The first table contains
>>>>> several
>>>>> columns and indexes, while the second table has less columns and only
>>>>> one
>>>>> index. Both tables have the same number of rows. Nevertheless, the
>>>>> second
>>>>> table is much larger in disk size than the first one. How can this be
>>>>> explained?
>>>>
>>>> This is most likely due to table bloat.  In PostgreSQL when you update
>>>> or delete a row, a dead version gets left behind.  Vacuum eventually
>>>> comes along and reclaims the empty space to be reused.  If you delete
>>>> / update a LOT of rows at once, then you'll have a lot of dead rows
>>>> which can only be reused after vacuuming when you do more updates or
>>>> deletes later on.
>>>>
>>>> A few salient questions. What version of PostgreSQL are you running?
>>>> Is autovacuum running? Do you do a LOT of bulk deletes / updates?  If
>>>> you do a lot of bulk deletes on this table, and you delete everything,
>>>> can you switch to using the truncate command instead?
>>>
>>> I use version 8.3. I see the 'autovacuum launcher process' and
>>> 'autovacuum
>>> worker process' in the process list, so I suppose autovacuum is running.
>>>
>>> Rows in the measurement table are added once and never deleted or
>>> updated.
>>> Adding a row to this table triggers a function that adds a row to the
>>> stats_count table. Normally rows are added chronologically. So rows in
>>> the
>>> stats_count table are normally not updated either. If however, for some
>>> reason, a measurement is added from an older time, all rows of that
>>> device
>>> which come after this time, are updated, but I don't think this will
>>> happen
>>> very often.
>>>
>>> The table 'stats_count' was created in a later stage, so the first 45M
>>> rows
>>> were added at once and chronologically. However, because the function to
>>> initialize this table took a long time and the client application crashed
>>> a
>>> few times, I had to restart this function several times. Can it be that
>>> there is some trash left from running this function several times without
>>> finishing it? Would it be a solution to run 'VACUUM FULL' to reclaim some
>>> disk space?
>>
>> Yeah, could be.  Take a look at this page:
>> http://wiki.postgresql.org/wiki/Show_database_bloat and see if the
>> query there sheds some light on your situ.
>
> thanks for this answer
>
> if i run the query, I get 12433752064 wasted bytes on stats_count_pkey, so I
> suppose that is the reason

Sounds like it.  Take a look here:
http://wiki.postgresql.org/wiki/Index_Maintenance

-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general



[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux