On Tue, Sep 18, 2012 at 1:13 AM, Herouth Maoz <herouth@xxxxxxxxxxxxx> wrote: > I think you hit the nail right on the head when you asked: > > > >> I wonder if they have different encoding/collations. > > [headdesk]Of course. One of the requirements of the upgrade was to change > the database encoding to unicode, because previously it was in an 8-bit > encoding and we couldn't handle international text, which has become an > absolute necessity. So when I restored the database, I took care to create > it in unicode first: > So, I must, at this point, draw the conclusion that string comparison is a > much, much heavier task in utf-8 than it is in an 8-bit encoding - or that > the collation is the problem. > > Running a different test, which involves string comparison, shows a bigger > discrepancy: > > select count( foo ), foo from ( select cast(random() as varchar(14)) > > cast(random() as varchar(14)) as foo > from generate_series (1,100000000)) asdf > group by foo; > > PC: Time: 308152.090 ms > Server: Time: 499631.553 ms I think the one below will show an even larger discrepancy. You are doing 2 casts for each comparison, so I think the casts overhead will dilute out the comparison. select count(distinct foo) from ( select cast(random() as varchar(14)) as foo from generate_series (1,100000000)) asdf; > Finally, I created a test table, as you asked: > > >> create table foo as select msisdn,sme_reference from >> sms.billing__archive limit 1000000; > > Then I created an index on the msisdn and sme_reference columns together. > 99% of the data in the msisdn field consist of 11-digit phone numbers. > Result: > > PC: 5792.641 ms > Server: 23740.470 ms > > Huge discrepancy there. try: create index ON foo (msisdn COLLATE "C", sme_reference) ; This can only be done on 9.1 server, as that feature is new to that release. It should be much faster to create than the index with default collation. (or change the collation of msisdn column definition, rather than just in the index). This assumes you just need the index for equality, not for some precise locale-specific ordering (which for phone numbers seems like a safe bet). Cheers, Jeff -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general