Paul, I'm sure I'm missing something but it seems like your approach will not work. It's because the LEFT OUTER JOIN is on the numeric day of the week. So if you had this query going over weeks or months of data wouldn't you have the same issue with the days that had no new users not being factored into the AVG? I ended up doing something like this, which seems to work pretty well.
WITH usersByDay AS (SELECT cDate, COUNT(*) AS totalFROM (SELECT generate_series({CALENDAR_INTERVAL.START}::DATE,{CALENDAR_INTERVAL.END}::DATE,interval '1 day')::DATE AS cDate) AS cLEFT OUTER JOIN users u ON u.created::DATE = c.cDateGROUP BY cDate),avgUsersByDOW AS (SELECT extract('dow' FROM cDate) AS nDay,to_char(cDate,'Dy') AS "Day",ROUND(AVG(total), 2) AS "New Users"FROM usersByDayGROUP BY 1, 2ORDER BY 1)SELECT "Day", "New Users" FROM avgUsersByDOW ORDER BY nDay
On Mon, Jul 6, 2015 at 11:30 AM, Paul Jungwirth <pj@xxxxxxxxxxxxxxxxxxxxxxxx> wrote:
Thanks Paul, I guess I'm not sure how a generate_series between 0 to 6
would solve this problem. Wouldn't I have to generate a series based on
the date range (by day) and then group by DOW _after_ that? Can you give
me an example of how I'd do it with a series based on 0 to 6?
Looks like David Johnston beat me to it! :-) But this is what I had in mind:
SELECT s.d AS dow,
COUNT(u.id) c
FROM generate_series(0, 6) s(d)
LEFT OUTER JOIN users u
ON EXTRACT(dow FROM created) = s.d
GROUP BY dow
ORDER BY dow
;
You can also get human-readable DOW names by creating a 7-row CTE table and joining to it based on the numeric dow.
Paul