Shaheed Haque <shaheedhaque@xxxxxxxxx> writes:
> ======================
> foo =# explain analyse SELECT snapshot ->'company'->'legal_name' FROM
> paiyroll_payrun WHERE snapshot ->'employee' ? '2209';
>                             QUERY PLAN
> -----------------------------------------------------------------------------------------------------------------
>  Seq Scan on paiyroll_payrun  (cost=0.00..29.13 rows=9 width=32)
>                               (actual time=50.185..2520.983 rows=104 loops=1)
>    Filter: ((snapshot -> 'employee'::text) ? '2209'::text)
>    Rows Removed by Filter: 835
>  Planning Time: 0.075 ms
>  Execution Time: 2521.004 ms
> (5 rows)
> ======================
> So, over 2.5 seconds to sequentially scan ~850 rows. Am I right to
> presume the INDEX is not used because of the number of rows? Is there
> a way to verify that?

You could do "set enable_seqscan = off" and see if the EXPLAIN results
change.  My guess is that you'll find that the indexscan alternative is
costed at a bit more than 29.13 units and thus the planner thinks
seqscan is cheaper.

> And how can I understand the dreadful amount of
> time (of course, this is just on my dev machine, but still...)?

In the seqscan case, the -> operator is going to retrieve the whole
JSONB value from each row, which of course is pretty darn expensive if
it's a few megabytes.  Unfortunately the planner doesn't account for
detoasting costs when making such estimates, so it doesn't realize that
the seqscan case is going to be expensive.  (Fixing that has been on
the to-do list for a long time, but we seldom see cases where it
matters this much, so it hasn't gotten done.)

The problem would likely go away by itself if your table had more than
a few hundred rows, but if you don't anticipate that happening then you
need some sort of band-aid.  I don't recommend turning enable_seqscan
off as a production fix; it'd likely have negative effects on other
queries.  Personally I'd experiment with reducing random_page_cost a
bit to see if I could encourage use of the index that way.
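For instance, a quick session-local experiment (a sketch against the
table from the quoted plan; both settings are scoped to your session,
and RESET undoes them):

======================
-- Force the planner away from the seqscan so EXPLAIN shows
-- what the indexscan alternative would cost:
SET enable_seqscan = off;
EXPLAIN ANALYSE SELECT snapshot ->'company'->'legal_name'
  FROM paiyroll_payrun WHERE snapshot ->'employee' ? '2209';
RESET enable_seqscan;

-- Or nudge the cost model instead, which is closer to a
-- production-safe fix if it helps:
SET random_page_cost = 1.1;
EXPLAIN ANALYSE SELECT snapshot ->'company'->'legal_name'
  FROM paiyroll_payrun WHERE snapshot ->'employee' ? '2209';
RESET random_page_cost;
======================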
The default value of 4.0 is tuned for spinning-rust storage and is not
too appropriate for a lot of modern hardware, so there's probably room
to fix it that way without detuning your setup for other queries.

You should probably also rethink whether you really want to store your
data in this format, because anything at all that you do with that big
JSONB column is going to be expensive.  (Another thing that's been on
the to-do list for a while is enabling partial retrieval of large JSONB
values, but AFAIK that hasn't happened yet either.)

			regards, tom lane
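[Editor's note: one common way to act on that last suggestion is to
materialize the hot fields outside the big JSONB document.  This is a
sketch the thread itself doesn't show; the column and index names are
illustrative, and STORED generated columns require PostgreSQL 12+.]

======================
-- Keep the frequently-read scalar out of the multi-megabyte JSONB,
-- so reading it never detoasts the whole document:
ALTER TABLE paiyroll_payrun
  ADD COLUMN legal_name text
  GENERATED ALWAYS AS (snapshot -> 'company' ->> 'legal_name') STORED;

-- An expression GIN index on the sub-document; the default jsonb_ops
-- opclass supports the ? operator used in the original query:
CREATE INDEX paiyroll_payrun_employee_idx
  ON paiyroll_payrun USING gin ((snapshot -> 'employee'));
======================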