Poor performance on seq scan

Laszlo Nagy <gandalf@xxxxxxxxxxxxxxxxxx> · Tue, 12 Sep 2006 11:59:08 +0200

 Hello,

I have a big table called products. Table size: 1123MB. Toast table 
size: 32MB. Indexes size: 380MB.
I try to do a query like this:

select id,name from products where name like '%Mug%';

Yes, I know that tsearch2 is better for this, but please read on. The 
above query gives this plan:

Seq Scan on product  (cost=0.00..153489.52 rows=31390 width=40)
 Filter: (name ~~ '%Mug%'::text)

When I use this with explain analyze:

"Seq Scan on product  (cost=0.00..153489.52 rows=31390 width=40) (actual 
time=878.873..38300.588 rows=72567 loops=1)"
"  Filter: (name ~~ '%Mug%'::text)"
"Total runtime: 38339.026 ms"

Meanwhile, "iostat 5" gives something like this:

tin tout  KB/t tps  MB/s   KB/t tps  MB/s  us ni sy in id
  1   14 128.00   1  0.10  128.00   1  0.10   5  0 94  1  0
  0   12 123.98 104 12.56  123.74 104 12.56   8  0 90  2  0
  0   12 125.66 128 15.75  125.26 128 15.68  10  0 85  6  0
  0   12 124.66 129 15.67  124.39 129 15.64  12  0 85  3  0
  0   12 117.13 121 13.87  117.95 121 13.96  12  0 84  5  0
  0   12 104.84 118 12.05  105.84 118 12.19  10  0 87  2  0

130 transfers per second with 12-15MB/sec transfer speed. (FreeBSD 6.1 
with two STATA150 drives in gmirror RAID1)

I made another test. I create a file with the identifiers and names of 
the products:

psql#\o products.txt
psql#select id,name from product;

Then I can search using grep:

grep "Mug" products.txt | cut -f1 -d\|

There is a huge difference. This command runs within 0.5 seconds. That 
is, at least 76 times faster than the seq scan. It is the same if I 
vacuum, backup and restore the database. I thought that the table is 
stored in one file, and the seq scan will be actually faster than 
grepping the file. Can you please tell me what am I doing wrong? I'm not 
sure if I can increase the performance of a seq scan by adjusting the 
values in postgresql.conf. I do not like the idea of exporting the 
product table periodically into a txt file, and search with grep. :-)

Another question: I have a btree index on product(name). It contains all 
product names and the identifiers of the products. Wouldn't it be easier 
to seq scan the index instead of seq scan the table? The index is only 
66MB, the table is 1123MB.

I'm new to this list and also I just recently started to tune postgresql 
so please forgive me if this is a dumb question.

Regards,

  Laszlo