Search Postgresql Archives

SELECT DISTINCT

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hello Guys,

During my work, I have seen a common practice of using DISTINCT . Some will argue that developer should know the effect of using it, but keep in mind not all developers are gurus in RDBMs. Normally, developers work in a narrow domain. Using DISTINCT might lead to a huge performance  degradation because of sort and filter or hashaggregate operations. I think also the rules in determining if the distinct is requiered or not  is moderate in complexity.

Example: Please see how much extra cost we have for 119 record

EXPLAIN ANALYZE SELECT  DISTINCT * FROM pg_aggregate;

"HashAggregate  (cost=3.98..5.17 rows=119 width=28) (actual time=0.525..0.743 rows=119 loops=1)"
"  ->  Seq Scan on pg_aggregate  (cost=0.00..2.19 rows=119 width=28) (actual time=0.011..0.195 rows=119 loops=1)"
"Total runtime: 1.008 ms"

I think any query that returns a unique column (primary key, unique) which is not duplicated in some way (join) can use this optimisation technique.

EXAMPLE:

TABLE A (a1 (uinque), a2, ... , an)

SELECT DISTINCT a1, subset of (a2...an)  FROM A;   -- will return always a distinct  result.

When it comes to joins and nested queries , I do not have clear idea how this  can be implemented. But I could do some search.

Regards

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux