On 21/12/12 02:34, Richard Neill wrote:
Reindexing only takes a couple of seconds, and restores correctness.
Interestingly, the partial index (after reindexing) is only 16kB in
size; whereas the table is 1.1 GB, and the normal single-column indexes
are about 250MB in size.
In terms of what's physically happening in reality,
- tbl_tracker keeps a record of all books that move through the system
over a period of one month (at a rate of about 20/second, or 1
million/day), after which they are deleted.
- the partial index, tbl_tracker_performance_1_idx tracks only those
books which are currently "in flight" - books remain in flight for
about 200 seconds as they go round the machine.
(While in flight, these have exit_state = NULL)
- the partial index is used to overcome a design defect(*) in the
sorter machine, namely that it doesn't number each book uniquely,
but wraps the parcel_id_code every few hours. Worse, some books can
remain on the sorter for several days (if they jam), so the numbering
isn't a clean "wraparound", but more like a fragmented (and
occasionally lossy) filesystem.
- What I'm trying to do is trace the history of the books
through the system and assign each one a proper unique id.
So, if I see a book with "parcel_id_code = 37",
is it a new book (after pid wrap), or is it the same book I saw 1
minute ago, that hasn't exited the sorter?
So... is there some way to, for example, set a trigger that will reindex
every time the index exceeds 1000 rows?
Richard
(*)Readers of The Daily WTF might appreciate another curious anomaly:
this machine originally had an RS-232 port; it now uses ethernet, but
TxD and RxD use different TCP sockets on different network ports!
--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance