Search Postgresql Archives

Re: bytea encode performance issues

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



The LIKE operator is likely the problem, but it is a critical part of an
email application. Searches are done by, "Show me all emails containing
the following word."

I've tried using TSearch2's full text index. It made the query 50%
faster, taking 5 minutes. This is still not even close to the less then
a minute in MySQL.

> 
> That LIKE operator is probably your problem. An unbounded LIKE like that
> (with a wildcard on both sides) means no index can be used, hence you
> get a sequential scan.
> 
> There are apparently some possibilities with the new GIN indexes (or
> maybe even using GIST), but I haven't had an opportunity to try those
> yet. There were some messages about just that on this list recently.
> 

I don't think a functional index (or anything other then a FTI) would
accomplish anything, being that I am doing unbounded Likes.

> If you create an index, make sure you create a _functional_ index over
> ENCODE(messageblk, 'escape').
> 

Email is binary when it contains attachments. I actually planned on
using an ASCII encoding, but the dbmail people specifically said not to.
I don't know if they were speaking from experience, or because ASCII
sounds bad.

> Since when is e-mail binary data? I don't quite see why you'd use a
> bytea field instead of text. If your problem is character encoding, then
> just don't store that ("encode" the DB using SQLASCII).
> 

As I mentioned, this is the system that came with dbmail. It runs on
both PostGresql and MySQL, so they may have done some compatibility
things. There are 4 statuses possible, 0,1,2,3 if you use the database
through the software then a 99 could never appear there.

> 
> 
> I notice some other oddities in that query/design. Why is is_header a
> smallint instead of a boolean? I'm assuming this is for compatibility
> with an other database?
> 
> Why use status IN (0, 1) instead of more descriptive keys? Is it even
> constrained this way, or could arbitrary numbers end up as status (say
> 99) and if so, what happens to those messages?
> 
> Alban Hertroys
> 
> -- 
> If you can't see the forest for the trees,
> cut the trees and you'll see there is no forest.
> 
> 
> !DSPAM:824,48958f30243481673380013!
> 
> 


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux