Greetings, * Vincenzo Romano (vincenzo.romano@xxxxxxxxxxx) wrote: > 2017-12-19 17:32 GMT+01:00 Stephen Frost <sfrost@xxxxxxxxxxx>: > > * Vincenzo Romano (vincenzo.romano@xxxxxxxxxxx) wrote: > >> 2017-12-19 16:48 GMT+01:00 Stephen Frost <sfrost@xxxxxxxxxxx>: > >> > * Vincenzo Romano (vincenzo.romano@xxxxxxxxxxx) wrote: > >> >> Sorry, my bad: I confused V10 with v11. > >> >> But accordingly to a discussion with Bruce Momjan, table partitionin V10 is > >> >> little more than syntactic sugar around old-fashioned table partitioning. > >> > > >> > Well, it's a bit more than that since there's tuple-routing, but you're > >> > right that the partition elimination is the same as it was in earlier > >> > versions and based on constraint exclusion. That said, as noted in the > >> > email you replied to, reasonable numbers of partitions aren't too bad > >> > even with the planning cost; it's when you have many thousands of > >> > partitions that you get into cases where planning time for queries is > >> > really bad. > >> > >> When you have to handle a 100TB table, the number of partitions > >> shouldn't be 10 or 12 > >> as seen in most examples and tests. > > > > Probably not, but it would depend on the data and what you're doing with > > it. While there are some general rules-of-thumb, there's no one right > > answer when it comes to the size of individual partitions. > > I think partitioning is done in order to make smaller tables or to > balance table size and table number. > You currently have to keep in mind a limit to the number of sub-tables. > This makes partitioning less effective. Sure, but that doesn't make PG unsuitable for such use-cases, just that you have to realize these costs and address them. > >> This is the same type of issues you hit with partial indexes (this is > >> why I mentioned them earlier). > >> Sub-table (and partial index) selection algorithm should be > >> logarithmic or sub-linear. > > > > Sure, and work is being done to improve PG in that exact area, but that > > doesn't mean it can't handle workloads like this today, but you have to > > be aware of the costs associated with today's partitions. > > > >> As long as it'll be linear, you'll hear about "reasonable number of partitions". > >> One thousand partitions for a 100TB table would make "manageable" > >> 100GB sub-tables. > > > > Yup, and 100G tables are certainly large and a bit awkward but they can > > be managed. > > Of course. But I would bet they are still considered as "very large tables". They're not small. :) > >> I could be easily wrong, but this is an are where PG needs improvements. > > > > Of course, and work is being done to improve it. What's important is > > knowing that there's a cost to having more partitions when querying > > through the parent when you get to a point where you have thousands of > > partitions. That cost may be perfectly fine in some use-cases and in > > others it might not be, and instead you'd likely have to build logic > > into the application layer to address it. That's not ideal, which is > > why there's work being done to improve PG, but it's not necessairly a > > big deal either. Dynamic SQL is another approach. > > > >> One could maybe think about multi-level partitioning, though. > > > > Not sure exactly what you're referring to here, but a simple multi-level > > partitioning setup with PG wouldn't actually change the cost for > > partition elimination today. > > It'd be a tree of tables, pointing to logarithmic selection. > This is why we love tree data structures. Sure, but that doesn't matter when it comes to how constraint exclusion works with PostgreSQL today- all partitions under a given parent are considered when doing partition elimination. > >> > Also as noted on this thread, PG could handle this data volume, but to > >> > be efficient there would be work to be done in normalization, > >> > aggregation, and analyzing the system to ensure you're storing and > >> > querying on the data efficiently. > >> > >> Normalization will grow the number of tables (and later joins) and > >> you'll will very likely end up with at least a table with a > >> "gazillion" rows. > > > > Natuarlly, but that "gazillion" rows table would be much smaller for > > having the data normalized- if you don't normalize it then the gazillion > > row table is a huge amount of duplicated data, making the entire system > > much larger than necessary. > > Partitioning is done for other reasons than de-duplicating data. Yes, I wasn't suggesting to use partitioning to de-duplicate data, I was suggesting to use normalization for that. > The number of rows to be analyzed would be still the same and the > indexes over those columns would still be rather large. While true, the indexes would be smaller than they would be without the data being normalized. > >> I fear normalization, provided it's really needed, would provide little help. > > > > I seriously doubt that's the case. Normalization might reduce that > > 100TB down to 10's of TB instead, or perhaps even smaller. > > A 10x duplication factor seems very bad to me. I'm not sure why, it's not actually that uncommon. > If that's the case then normalization would provide for some > improvement while eating resources to be done. It's unclear what you're referring to here regarding 'eating resources'. Clearly there's some CPU time that's consumed when doing a join, but that's generally far cheaper than the time required to pull data off of disk because it's not in memory. > Storage nowadays isn't a big issue. Querying is. Sure, and if all you had to pay was the storage cost then no one would care- but you also have to pay the memory cost and that's a much bigger deal. If you have to go out to disk constantly then you've basically already lost when it comes to query time. > >> With tables that big I usually do "software partitioning". > >> I make the software aware of the partition schema so it can direct the > >> queries to the needed (sub-)tables. > > > > Yes, implementing partition elimination at the application level is > > certainly an approach that can work with PG. Even then, ideally, you'd > > normalize the data so that the footprint is much smaller, making more > > efficient use of the storage and memory available. > > Normalizing can save resources that you later need to join tables back. You were commenting above on how normalization eats resources and now it's saving them. :) In my experience, the join cost is nothing compared to the cost to pull data off of disk because you couldn't fit it all in memory. > It's a matter of trade-offs and it strictly depends on the queries to > be run in my opinion. I agree that it's a matter of trade-offs and certainly depends on the queries being run, though none of that information is available here. > Effective table partitioning, instead, seems to me to have bigger importance. They're both important, imv. > My bottom line here is: table partitioning is nice, but not ready yet > for the big data where it makes the bigger sense. I agree that there's more work to be done there, certainly, but it's also getting better and there are ways to work around the existing limitations. Thanks! Stephen
Attachment:
signature.asc
Description: Digital signature