On Wed, Jul 20, 2011 at 11:57 AM, Jose Ildefonso Camargo Tolosa
<ildefonso.camargo@xxxxxxxxx> wrote:
> On Tue, Jul 19, 2011 at 3:57 PM, Robert Haas <robertmhaas@xxxxxxxxx> wrote:
>> On Sun, Jun 19, 2011 at 10:19 PM, Jose Ildefonso Camargo Tolosa
>> <ildefonso.camargo@xxxxxxxxx> wrote:
>>> So, the question is, if I were to store 8TB worth of data into the
>>> large objects system, it would actually make the pg_largeobject
>>> table slow, unless it was automatically partitioned.
>>
>> I think it's a bit of an oversimplification to say that large,
>> unpartitioned tables are automatically going to be slow. Suppose you
>> had 100 tables that were each 80GB instead of one table that is 8TB.
>> The index lookups would be a bit faster on the smaller tables, but it
>> would take you some non-zero amount of time to figure out which index
>> to read in the first place. It's not clear that you are really
>> gaining all that much.
>
> Certainly... but it is still very blurry to me *when* it is better to
> partition than not.

I think figuring that out is as much an art as it is a science. It's
better to partition when most of your queries are going to touch only a
single partition; when you are likely to want to remove partitions in
their entirety; or when VACUUM starts to have trouble keeping up. But
the reality is that in some cases you probably have to try it both ways
and see which one works better.

>> Many of the advantages of partitioning have to do with maintenance
>> tasks. For example, if you gather data on a daily basis, it's faster
>> to drop the partition that contains Thursday's data than it is to do
>> a DELETE that finds the rows and deletes them one at a time. And
>> VACUUM can be a problem on very large tables as well, because only
>> one VACUUM can run on a table at any given time. If the interval at
>> which the table needs to be vacuumed is shorter than the time it
>> takes for VACUUM to complete, then you've got a problem.
>
> And... the pg_largeobject table doesn't get vacuumed? I mean, isn't
> that table just like any other table?

Yes, it is. So, I agree: putting 8TB of data in there is probably going
to hurt.

>> But I think that if we want to optimize pg_largeobject, we'd probably
>> gain a lot more by switching to a different storage format than we
>> could ever gain by partitioning the table. For example, we might
>> decide that any object larger than 16MB should be stored in its own
>> file. Even somewhat smaller objects would likely benefit from being
>> stored in larger chunks - say, a bunch of 64kB chunks, with any
>> overage stored in the 2kB chunks we use now. While this might be an
>> interesting project, it's probably not going to be anyone's top
>> priority, because it would be a lot of work for the amount of benefit
>> you'd get. There's an easy workaround: store the files in the
>> filesystem, and a path to those files in the database.
>
> Ok, one reason for storing a file *in* the DB is to be able to do PITR
> of a wrongly deleted file (or an overwritten one, and that kind of
> thing); at the filesystem level you would need a versioning filesystem
> (and I don't yet know of any that is stable in the Linux world).
>
> Also, with streaming replication, at the same time you stream your
> data, your files are streamed to a secondary server as well (yes, at
> the FS level you could use DRBD or similar).

Well, those are good arguments for putting the functionality in the
database and making it all play nicely with write-ahead logging. But
nobody's felt motivated to write the code yet, so...
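
To make the partition-maintenance point above concrete, here is a
minimal sketch using inheritance-based partitioning; the table and
partition names are hypothetical, and this is an illustration rather
than a tested recipe:

    -- Parent table; each child holds one day of data (hypothetical
    -- schema).
    CREATE TABLE measurement (
        logdate  date NOT NULL,
        payload  text
    );

    -- A child partition for a single day, attached via inheritance.
    -- The CHECK constraint lets constraint_exclusion skip it when a
    -- query targets other days.
    CREATE TABLE measurement_2011_07_21 (
        CHECK (logdate = DATE '2011-07-21')
    ) INHERITS (measurement);

    -- Removing that day's data is a cheap catalog operation:
    DROP TABLE measurement_2011_07_21;

    -- ...versus the row-at-a-time alternative on an unpartitioned
    -- table, which also leaves dead tuples behind for VACUUM to
    -- reclaim:
    -- DELETE FROM measurement WHERE logdate = DATE '2011-07-21';

The DROP stays fast regardless of how large the partition has grown,
which is the whole point.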
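
And an equally minimal sketch of the store-the-path workaround (again,
hypothetical names): the bytes live on the filesystem, and only the
metadata lives in the database.

    -- Metadata and location in the database; file contents on disk.
    CREATE TABLE stored_file (
        id         serial PRIMARY KEY,
        file_path  text NOT NULL,          -- filesystem location
        added_at   timestamptz NOT NULL DEFAULT now()
    );

    INSERT INTO stored_file (file_path)
    VALUES ('/srv/blobs/2011/07/report.pdf');

The trade-off raised above still applies, of course: only the path is
covered by WAL, PITR, and streaming replication, so the bytes
themselves need their own backup and replication story (rsync, DRBD, or
similar).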
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company