
Re: newsfeed type query


 



On 29.4.2015 18:54, Jonathan Vanasco wrote:
> 
> On Apr 29, 2015, at 12:25 PM, Ladislav Lenart wrote:
> 
>> Could you please explain to me the error(s) in my reasoning?
> 
> Let me just flip your list in reverse... and add in some elements (marked with a *):
> 
> posting  ts  context
> p60      60  friend
> p55      55  friend*
> p54      54  friend*
> p50      50  group
> p49      49  group*
> p40      40  friend
> p30      30  group
> p20      20  friend
> p10      10  group
> p0        0  friend
> 
> With the 2 limited subqueries, the results would be:
> 	60F, 55F, 50G, 49G
> 
> But the "most recent" data is
> 	60F, 55F, 54F, 50G
> 
> So we end up showing 49 which is less relevant than 54.

I would expect the overall query to return only 60F and 55F as the most recent
data. No? You expect it to return 4 items when the LIMIT is only 2. Remember
that the overall query should also be ordered by ts and limited to 2.

I thought you wanted the most recent items across all contexts, not the 2 most
recent items from friends plus the 2 most recent items from groups...
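The two readings being compared here can be sketched in a few lines of Python. This is a simulation of the two query shapes, not the actual SQL; the rows are the sample postings from the table upthread (with the ts=49 row read as a single posting, p49):

```python
# Per-context LIMITs vs. one overall LIMIT, on the thread's sample data.
import heapq

# (posting, ts, context) rows from the example table.
postings = [
    ("p60", 60, "friend"), ("p55", 55, "friend"), ("p54", 54, "friend"),
    ("p50", 50, "group"), ("p49", 49, "group"), ("p40", 40, "friend"),
    ("p30", 30, "group"), ("p20", 20, "friend"), ("p10", 10, "group"),
    ("p0", 0, "friend"),
]

def per_context_top(rows, context, n):
    """Mimic a per-context subquery: WHERE context = ... ORDER BY ts DESC LIMIT n."""
    matching = [r for r in rows if r[2] == context]
    return heapq.nlargest(n, matching, key=lambda r: r[1])

# Reading 1: two limited subqueries, 2 rows per context (4 rows total).
two_per_context = (per_context_top(postings, "friend", 2)
                   + per_context_top(postings, "group", 2))

# Reading 2: the union is itself ordered by ts and limited to 2.
overall_top_2 = heapq.nlargest(2, postings, key=lambda r: r[1])

print([r[0] for r in two_per_context])  # ['p60', 'p55', 'p50', 'p49']
print([r[0] for r in overall_top_2])    # ['p60', 'p55']
```

The first reading surfaces p49 even though p54 is more recent; the second never returns more rows than the outer LIMIT.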


Ladislav Lenart


> In some situations this isn't much of an issue, but in others it is detrimental. 
> For example, one of my "feeds" contains a distribution of events by type that is very uneven.  While "friend" and "group" events might be relatively close in time to one another, "system" or other events may be months old -- and that older content gets pulled in with this style of query.
> 
> If you need to paginate the data and select the next 10 overall items, it gets even more complicated.
> 
> IIRC, the best mix of performance and "product" that I've found is to do something like this:
> 
> 	SELECT * FROM (
> 		(SELECT a, b, c, "timestamp" FROM table_a ORDER BY "timestamp" DESC LIMIT 10000)
> 		UNION ALL
> 		(SELECT a, b, c, "timestamp" FROM table_b ORDER BY "timestamp" DESC LIMIT 10000)
> 	) AS unioned
> 	ORDER BY unioned."timestamp" DESC LIMIT 100 OFFSET 0;
> 
> By creating an artificial limit on the inner queries, you can save Postgres a lot of intensive I/O work and memory usage (like, a lot).
> Then, merging a few lists and sorting 20k (or even 100k) items is really cheap.
> The downside is that you effectively limit the "relevancy" of the query to whatever the inner limit is (i.e. 10000 per source, not the combined total of 20000), but that number can be made arbitrarily high enough that it is irrelevant while still showing the right amount of content for people.
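The bounded-union pagination described above can be simulated outside the database. A minimal Python sketch, with an illustrative helper name (`feed_page`) and made-up limits, not anything from the thread:

```python
# Cap each source at an inner limit, merge the pre-sorted lists,
# then apply the page's LIMIT/OFFSET -- mirroring the SQL shape above.
import heapq

def feed_page(sources, inner_limit, page_size, offset):
    """sources: lists of (posting, ts) already sorted by ts descending,
    like the inner ORDER BY ... LIMIT subqueries.
    Returns one page of the merged feed."""
    capped = [src[:inner_limit] for src in sources]        # inner LIMITs
    merged = heapq.merge(*capped, key=lambda r: r[1], reverse=True)
    page = []
    for i, row in enumerate(merged):                       # outer ORDER BY
        if i >= offset + page_size:                        # stop past the page
            break
        if i >= offset:                                    # skip the OFFSET
            page.append(row)
    return page

friends = [("p60", 60), ("p55", 55), ("p54", 54), ("p40", 40)]
groups = [("p50", 50), ("p49", 49), ("p30", 30)]

print(feed_page([friends, groups], inner_limit=3, page_size=4, offset=0))
# [('p60', 60), ('p55', 55), ('p54', 54), ('p50', 50)]
```

Because each source is already sorted, the merge is linear in the page size; the inner cap is what keeps the worst case bounded, at the cost of ignoring anything older than the cap.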




-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general



