On 01/04/2017 09:59 AM, Israel Brewster wrote:
Short version:
Do indexes impact the speed of an UPDATE, even when the indexed
columns aren't changing?
Details:
I have a table containing geographical data (Latitude, longitude, and
elevation) with 406,833,705 records. The Latitude and Longitude
columns are indexed. In order to better utilize the data, I've been
looking into PostGIS, and decided I wanted to add a "Location" column
with PostGIS type "GEOGRAPHY(point)". I then tried to populate it from
the existing latitude/longitude data using the following query:
UPDATE data SET
location=ST_GeographyFromText('SRID=4326;POINT('||lng::text||'
'||lat::text||')');
I expected this update to take quite a while, since it has 406 million
rows to update, but at this point it's been over 19 hours since I
started the query, and it still hasn't completed.
I'm wondering if the presence of the indexes could be slowing things
down even though the indexed columns aren't being updated? Would I be
better off canceling the update query, dropping the indexes, and
trying again? Or is more likely that the update query is "almost"
done, and it would be better to just let it run it's course? Or is
there an even better option, such as perhaps exporting the data,
adding the additional column in a text editor, and re-importing the
data with a COPY command?
Thanks for any feedback/advice you can offer!
I would use a "create table redo as select *,
location=ST_GeographyFromText('SRID=4326;POINT('||lng::text||'
'||lat::text||')') from original;" then index that and drop original.
Or just "create table location as select
location=ST_GeographyFromText('SRID=4326;POINT('||lng::text||'
'||lat::text||')');" along with what ever id you have for the original
tuple (if it's not just lat+lon) and join or view as necessary after
indexing.
--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general