Search Postgresql Archives

Re: [PERFORM] Partioning tsearch2 a table into chunks and accessing via views

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Since I am using tsearch2 on the table I think there is going to be a significant performance hit - e.g., I partition by batch (batches are not separated by date, they are essentially random subsets of a much larger data-set). I am querying this database using tsearch2, so I am assuming all tables are going to be queried each time since the text is not partition by any specific constraint - e.g., >R goes to table 1 and <=R goes to table 2.

Benjamin

On Aug 25, 2007, at 11:18 AM, Joshua D. Drake wrote:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Benjamin Arai wrote:
As stated in the previous email if I use partitioning then queries will
be executed sequentially - i.e., instead of log(n) it would be (#
partitions) * log(n).  Right?

The planner will consider every relevant partition during the execution.
Which may be a performance hit, it may not be. It depends on many
factors. In general however, partitioning when done correctly is a
performance benefit and a maintenance benefit.

Sincerely,

Joshua D. Drake



Benjamin

On Aug 25, 2007, at 9:18 AM, Joshua D. Drake wrote:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Benjamin Arai wrote:
This kind of disappointing, I was hoping there was more that could be
done.

There has to be another way to do incremental indexing without loosing
that much performance.

What makes you think you are loosing performance by using partitioning?

Joshua D. Drake


Benjamin

On Aug 24, 2007, at 6:29 PM, Joshua D. Drake wrote:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Brandon Shalton wrote:
Benjamin,



In order to avoid the re-indexing I was thinking of instead creating
a new
table each month (building its indexes and etc) and accessing them
all
through a view. This way I only have to index the new data each
month.


Take a look at bizgres.org (based on postgres).

They have a parent-child structure.

The way i use it, is I have about 30M records a day that are inserted
into the database.

Each day is a "child" table to the "parent".

so example:

the parent table is called  "logfile"

each day, is a child, with the structure like  "logfile_YYMMDD"

the "child" inherits the table structure of the parent, such that you
could query the child table name directly, or you run the query
against
the parent (ie. logfile table) and get all the data.

the indexes are done on a per table basis, so new data that comes
in, is
a lesser amount, and doesn't require re-indexing.


PostgreSQL can do all of this too.

Sincerely,

Joshua D. Drake



example:

select * from logfile_070825 where datafield = 'foo'

if i knew i wanted to specifically go into that child, or:

select * from logfile where datafield = 'foo'

and all child tables are searched and results merged. You can perform
any kind of sql query and field structures are you normally do.

the downside is that the queries are run sequentially.

so if you had 100 child tables, each table is queried via indexes,
then
results are merged.

but, this approach does allow me to dump alot of data in, without
having
the re-indexing issues you are facing.

at some point, you could roll up the days, in to weekly child tables,
then monthly tables, etc.

I believe Bizgres has a new version of their system that does parallel
queries which would certainly speed things up.

For your documents, you can do it by the day it was checked in, or
maybe
you have some other way of logically grouping, but the parent/ child
table structure really helped to solve my problem of adding in
millions
of records each day.

The closest thing in mysql is using merge tables, which is not really
practical when it comes time to do the joins to the tables.

-brandon

http://www.t3report.com - marketing intelligence for online marketing
and affiliate programs










---------------------------(end of
broadcast)---------------------------
TIP 6: explain analyze is your friend



- --

      === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564   24x7/Emergency: +1.800.492.2240
PostgreSQL solutions since 1997  http://www.commandprompt.com/
            UNIQUE NOT NULL
Donate to the PostgreSQL Project:
http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/




- ---------------------------(end of
broadcast)---------------------------
TIP 6: explain analyze is your friend



- --

      === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564   24x7/Emergency: +1.800.492.2240
PostgreSQL solutions since 1997  http://www.commandprompt.com/
            UNIQUE NOT NULL
Donate to the PostgreSQL Project: http://www.postgresql.org/about/ donate
PostgreSQL Replication: http://www.commandprompt.com/products/




- --

      === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564   24x7/Emergency: +1.800.492.2240
PostgreSQL solutions since 1997  http://www.commandprompt.com/
			UNIQUE NOT NULL
Donate to the PostgreSQL Project: http://www.postgresql.org/about/ donate
PostgreSQL Replication: http://www.commandprompt.com/products/

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFG0HJ5ATb/zqfZUUQRAuEdAJwNwsr/XCsr85tElSVbRVMUHME+PACglbJK
gj5cZgOtgEEjUPph0jpsOcw=
=u7Ox
-----END PGP SIGNATURE-----


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (Darwin)

iQIVAwUBRtCHvPyqRf6YpodNAQLDuBAAp+dg1MHZy+hjZ8Zk9OQTbeADgJqGPpi9
G7+y3iyaaqOF66TC52P7OaqO6nPhoNCNMCxwztnASyxpftGD5yJ4AZTSGcbAsWB9
cO5mE1mgbngZNPnRLypeJ81hyE6bniNNL7xXSq9LB8wRMczFZwGVZT66+lMIFjvv
0OrbAcSNUFqddky7EFm8gx6A2FNIzSdFB0dNbzpKwEOTnCHKvh+O99sAr/LB7mmL
Hj/wzeQKrWbDAB3+N9rczivZ03DvYAGbUY9qBfNj7Y9YL3iu/Q+Oy4bHtI6d/a7B
wepol2xe1sYEtQ+R3yMPXFte0483n8XIdXxa412ZSIEBfLxHzV6M7JTbPtgWwE+9
7xvyYbO7xQL9N/P8ZGg75eEqXtUrepGmJG0Y30qF5sNdMG0pWoz1bzDjSLNCnylq
JwsO8p1EHNPnPRqotwZZSfLUW16eREqLaOrSC84gIw5Q6zAMZe/k2ckzzHKPGB1c
sckaQROcgK4Lu9ywjRjBjNqclOMasf0MCrsDVMQE/wnh4GoDL/PAyEOqnlpvJ+cx
k4kmOrEz5GRZQehHUI7CdejFwZ32sAB+nV2r8zDW9FSxgoRoFvtE2hooJ9orv0IU
1F8TeBdifVP/Ef/lHAHs6IqEH45y72WqrWFZsIdU1PDe0MyfgMaOBwdwXNeZqky/
IF5SMKbl9yA=
=F9Oq
-----END PGP SIGNATURE-----

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux