Re: Migrating a FoxPro system and would like input on the best way to achieve optimal performance

David G Johnston <david.g.johnston@xxxxxxxxx> · Mon, 9 Feb 2015 11:17:52 -0700 (MST)

TonyS wrote
> Then each client has files within their own directory to keep the size of
> the tables manageable. Each client has 165 tables. These tables are all
> the same definition across the different groups.
> 
> I have considered partitioning tables, but if I am correct that would
> result in 330,000 files and I am not certain if that will cause an issue
> with degraded file system performance.

I suggest you not think about "files" when pondering about PostgreSQL.  That
said, 330,000 tables within a single database, or even cluster, is likely to
be problematic.

> Is it possible to create a tablespace for each group and then create
> partitioned tables for the groups within the group's tablespace to limit
> the number of files in a directory? 

Same point about ignoring "files" and "directories".  Tablespaces let you
place different kinds of data onto different filesystems; using them for
"directory management" is not particularly helpful.

Note that I presume you are planning on leaving the database backend on
Windows...my experience is more with Linux but your core issue is data model
which is largely O/S agnostic.

> I plan on utilizing the built-in streaming replication, so I assume if I
> went the tablespace route I would need to create directories for all
> future groups from the outset since creating them individually with code
> on the backup systems would be difficult.

Which is another reason why tablespaces should not implement logical
attributes of the system.

> Another option would be placing an extra field in each table identifying
> the group it belongs to and combining all of the separate tables of the
> same definition into one table. This would result in some tables having
> 300 million entries currently and that would climb over the next 18
> months.

This is the canonical solution to multi-tenancy.  Physical partitioning then
occurs on a hash of whatever key you are using; you do not have one tenant
per table.

> The final option I can see is creating a schema for each of the different
> clients. I am not certain if this is a better option than partitioned
> tables. I haven't been able to determine if schema objects are stored in a
> sub directory or if they are in the same directory as all of the other
> tables. If they are in the same directory then the same issue arises as
> the partitioned tables.

Depending on whether clients are able to get access to the data directly you
can also consider having a separate database for each client.  I would then
recommend using either dblink or postgres_fdw to connect to the single
shared database - or just replicate the shared schema and data subset into
each individual client database.

> Of course, I am certain there are a number of other possibilities that I
> am overlooking. I am just trying to determine the best way to move this
> over and get things into a more modern system.

Without understanding how your application works and makes use of the
existing data it is difficult to suggest alternatives.  Specifically around
data visibility and the mechanics behind how the application access
different clients' data.

I would personally choose only between having different databases for each
client or using a "client_id" column in conjunction with a multi-tenant
database.  Those are the two logical models; everything else (e.g.
partitioning) are physical implementation details.

David J.

--
View this message in context: http://postgresql.nabble.com/Migrating-a-FoxPro-system-and-would-like-input-on-the-best-way-to-achieve-optimal-performance-tp5837211p5837241.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance