Re: Regex performance issue

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hello..

Thanks for the tip , i think i have got the ideea..

I'm too tired too , and i will try it tommorow.


Anyway , anybody has a clue why this regex is that CPU intensive ? I did not saw the light on my drives blinking , and also vmstat doesn't yeld any blocks in or out...
And how can it be optimized ?

Is there a way to trace the system calls ?
strace doesn't give me anything else but some lseeks and reads...


PS: Tried it with a 8.2 snaphsot and the result is the same..

Regards
   Alex
----- Original Message ----- From: "Heikki Linnakangas" <heikki@xxxxxxxxxxxxxxxx>
To: "Alexandru Coseru" <alexandru.coseru@xxxxxxxxxxxxxxx>
Cc: <pgsql-performance@xxxxxxxxxxxxxx>
Sent: Sunday, December 03, 2006 12:35 AM
Subject: Re: [PERFORM] Regex performance issue


Alexandru Coseru wrote:
Hello..

I cannot use the first advice , because i'm not aware of the prefix length in the database...
This is why i'm ordering after length(prefix)..


On the 2nd one , i'm not sure that i can follow you..

Ok, let me try again :)

asterisk=> select * from destlist LIMIT 10;
id | id_ent | dir |   prefix   |   country   |      network       | tip
----+--------+-----+------------+-------------+--------------------+-----
 1 |     -1 |   0 | (^0093)    | AFGHANISTAN | AFGHANISTAN        |   6
 2 |     -1 |   0 | (^00937)   | AFGHANISTAN | AFGHANISTAN Mobile |   5
 3 |     -1 |   0 | (^00355)   | ALBANIA     | ALBANIA            |   6
 4 |     -1 |   0 | (^0035538) | ALBANIA     | ALBANIA Mobile     |   5
 5 |     -1 |   0 | (^0035568) | ALBANIA     | ALBANIA Mobile     |   5
 6 |     -1 |   0 | (^0035569) | ALBANIA     | ALBANIA Mobile     |   5
 7 |     -1 |   0 | (^00213)   | ALGERIA     | ALGERIA            |   6
 8 |     -1 |   0 | (^0021361) | ALGERIA     | ALGERIA Mobile     |   5
 9 |     -1 |   0 | (^0021362) | ALGERIA     | ALGERIA Mobile     |   5
10 |     -1 |   0 | (^0021363) | ALGERIA     | ALGERIA Mobile     |   5

Store the prefix in a character column, without the regex stuff. Like below. I've removed the columns that are not relevant, in fact it would make sense to store them in another table, and have just the id and prefix in this table.

id | prefix  | network
---+---------+--------------------
 1 | 0093    | AFGHANISTAN
 2 | 00937   | AFGHANISTAN Mobile
 3 | 00355   | ALBANIA
 4 | 0035538 | ALBANIA Mobile
 5 | 0035568 | ALBANIA Mobile
 6 | 0035569 | ALBANIA Mobile
 7 | 00213   | ALGERIA
 8 | 0021361 | ALGERIA Mobile
 9 | 0021362 | ALGERIA Mobile
10 | 0021363 | ALGERIA Mobile

Now, add the rows marked with start below:

 id | prefix  | network
----+---------+--------------------
  1 | 0093    | AFGHANISTAN
  2 | 00937   | AFGHANISTAN Mobile
* 1 | 00938   | AFGHANISTAN
  3 | 00355   | ALBANIA
  4 | 0035538 | ALBANIA Mobile
* 3 | 0035539 | ALBANIA
  5 | 0035568 | ALBANIA Mobile
  6 | 0035569 | ALBANIA Mobile
* 3 | 003557  | ALBANIA
  7 | 00213   | ALGERIA
  8 | 0021361 | ALGERIA Mobile
  9 | 0021362 | ALGERIA Mobile
 10 | 0021363 | ALGERIA Mobile
* 7 | 0021364 | ALGERIA

The added rows make it unnecessary to use regex for the searches. You can use just the >= operator like this: (using the example number you gave)

SELECT id FROM destlist WHERE '00213618833' >= prefix ORDER BY prefix LIMIT 1

Which would return id 7. As another example, a query of "00213654321" would match the last row, which again has id 7.

I'm too tired to figure out the exact algorithm for adding the rows, but I'm pretty sure it can be automated... The basic idea is that when there's a row with id A and prefix XXXX, and another row with id B and prefix XXXXY, we add another row with id A and prefix XXXX(Y+1).

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



--
No virus found in this incoming message.
Checked by AVG Free Edition.
Version: 7.1.409 / Virus Database: 268.15.4/563 - Release Date: 12/2/2006




[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux