On 11/02/15 13:52, Mathieu Basille wrote:
Dear PostgreSQL users,
I am posting here a question that I initially asked on the PostGIS
list [1], where I was advised to try here too (I will keep both lists
updated about the developments on this issue).
I am currently planning to set up a PostgreSQL + PostGIS instance for
my lab. Turns out I believe this would be useful for the whole center,
so that I'm now considering setting up the server for everyone—if
interest is shared of course. At the moment, I am however struggling
with what would be required in terms of hardware, and of course, the
cost will depend on that—at the end of the day, it's really a matter
of money well spent. I have then a series of questions/remarks, and I
would welcome any feedback from people with existing experience on
setting up a multi-user PostGIS server. I'm insisting on the PostGIS
aspect, since the most heavy requests will be GIS requests
(intersections, spatial queries, etc.). However, people with similar
PostgreSQL setup may have very relevant comments about their own
configuration.
* My own experience about servers is rather limited: I used PostGIS
quite a bit, but only on a desktop, with only 2 users. The desktop was
quite good (quad-core Xeon, 12 Go RAM, 500 GB hd), running Debian, and
we never had any performance issue (although some queries were rather
long, but still acceptable).
* The use case I'm envisioning would be (at least in the foreseeable
future):
- About 10 faculty users (which means potentially a little bit more
students using it); I would have hard time considering more than 4
concurrent users;
- Data would primarily involve a lot (hundreds/thousands) of high
resolution (spatial and temporal) raster and vector maps, possibly
over large areas (Florida / USA / continental), as well as potentially
millions of GPS records (animals individually monitored);
- Queries will primarily involve retrieving points/maps over given
areas/time, as well as intersecting points over environmental layers
[from what I understand, a lot of I/O, with many intermediary tables
involved]; other use cases will involve working with steps, i.e. the
straight line segment connecting two successive locations, and
intersecting them with environmental layers;
* I couldn't find comprehensive or detailed guidelines on-line about
hardware, but from what I could see, it seems that memory wouldn't be
the main issue, but the number of cores would be (one core per
database connection if I'm not mistaken). At the same time, we want to
make sure that the experience is smooth for everyone... I was advised
on the PostGIS list to give a look at pgpool (however, UNIX only).
* Is there a difference in terms of possibilities, performance and
usability between a Linux-based and a MS-based server (from the user
perspective)? My center is unfortunately MS-centered, and existing
equipment runs with MS systems... It would thus be easier for them to
set up a MS-based server. Does it change anything for the user? (I
insist on the user perspective, since I and others will not admin the
system, but only use it)
* Does anyone have worked with a server running the DB engine, while
the DB itself was stored on another box/server? That would likely be
the case here since we already have a dedicated box for file storage.
Along these lines, does the system of the file storage box matter
(Linux vs. MS)?
* We may also use the server as a workstation to streamline PostGIS
processing with further R analyses/modeling (or even use R from within
the database using PL/R). Again, does anyone have experience doing it?
Is a single workstation the recommended way to work with such
workflow? Or would it be better (but more costly) to have one server
dedicated to PostGIS and another one, with different specs, dedicated
to analyses (R)?
I realize my questions and comments may be a confusing, likely because
of the lack of experience about these issues on my side. I really
welcome any feedback of people working with PostgreSQL servers (+
PostGIS ideally!) in a small unit, or any similar setting that could
be informative!
In advance, thank you very much!
Sincerely,
Mathieu Basille.
[1] Start of the thread here:
http://lists.osgeo.org/pipermail/postgis-users/2015-February/040120.html
When I looked at tuning options for PostgreSQL, I found that there were
limitations mentioned for Microsoft O/S's.
I get the general impression from my reading from multiple sources over
the years, that if you are serious about performance on a server, then
you should prefer Linux to Microsoft. Note that most servers run Linux,
and that over 95% of the top 500 super computers runs Linux - which is
rather telling about the perception of Microsoft's performance.
Cheers,
Gavin
--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general