
a simple-minded question about updating


 



I work with Postgres and wonder whether for my purposes there is a good-enough reason to update one of these days.

 

I’m an editor working with some 60,000 Early Modern texts, many of them in need of some editorial attention. The texts are XML-encoded documents. Each word is wrapped in a <w> element with attributes for various linguistic metadata. Typically a given type of error occurs several or many times, and at the margins the instances need individual attention. I use Python scripts to extract tokens from the main corpus (sometimes dozens, sometimes thousands or millions), turn them into keyword-in-context lines, and import them into Postgres. I basically use Postgres as a giant spreadsheet. Its excellent string-handling routines make it relatively easy to perform search and sort operations that identify tokens in need of correction. Once the corrections are made in Postgres, typically as batch updates, I move them as a data frame into Python, and from Python I move them back into the texts.
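
A minimal sketch of the extraction step described above, for illustration only. It assumes un-namespaced <w> elements carrying hypothetical `id` and `lemma` attributes; the real corpus will have its own attribute names and probably an XML namespace. The function names `kwic_rows` and `to_csv` and the sample text are invented for this example. It builds keyword-in-context rows and serializes them as CSV, which could then be loaded into a Postgres table with COPY.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample; attribute names are assumptions, not the corpus schema.
SAMPLE = """<text>
  <w id="w1" lemma="the">the</w>
  <w id="w2" lemma="king">kynge</w>
  <w id="w3" lemma="love">louede</w>
  <w id="w4" lemma="his">hys</w>
  <w id="w5" lemma="people">peple</w>
</text>"""

FIELDS = ["id", "lemma", "left", "keyword", "right"]


def kwic_rows(xml_text, width=2):
    """Return one keyword-in-context row per <w> element.

    `width` is the number of neighbouring tokens kept on each side.
    """
    root = ET.fromstring(xml_text)
    words = list(root.iter("w"))
    tokens = [(w.text or "") for w in words]
    rows = []
    for i, w in enumerate(words):
        rows.append({
            "id": w.get("id"),
            "lemma": w.get("lemma"),
            "left": " ".join(tokens[max(0, i - width):i]),
            "keyword": tokens[i],
            "right": " ".join(tokens[i + 1:i + 1 + width]),
        })
    return rows


def to_csv(rows):
    """Serialize rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    print(to_csv(kwic_rows(SAMPLE)))
```

Keeping the `id` of each <w> element in every row is what would let corrected values be matched back to the right token when they are written into the texts again.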

 

I do this on a recent Mac with 64 GB of memory and a 6-core i7 processor. I use Data Studio as an editing interface.

 

Unless a more recent version of Postgres has additional string-handling routines, or indexing routines that speed up working with tables with rows in the low millions, or other features that are likely to speed up operations, I don’t see any reason to update.

 

I could imagine a table that has up to 40 million rows.  That would be pretty sluggish on my current equipment, which handles up to 10 million rows quite comfortably.

 

Am I right in thinking that given my tasks and equipment it would be a waste of time to update? Or is there something I’m missing?

 

Martin Mueller

Professor emeritus of English and Classics

Northwestern University

