Search Postgresql Archives

DISTINCT *and* ORDER BY in aggregate functions on expressions(!)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

I’ve got a… rather large query (see below), in which I join a complex
data structure (whose exact contents do not, at this point, matter)
together to get some auxiliary data to expose as JSON field.

In this query I can use, for example…

jsonb_build_object('user_permissions',
    jsonb_agg(DISTINCT ap.name ORDER BY ap.name))

… to get a distinct, sorted, list of “user permissions” from a table
ap which I joined to the user table which is the main subject of the
query. (For some reason I need to add DISTINCT because else duplica‐
tes are shown.)

Wrapping this as…

jsonb_build_object('user_permissions',
    COALESCE(
	jsonb_agg(DISTINCT ap.name ORDER BY ap.name)
    FILTER (WHERE ap.id IS NOT NULL)))

… gets me the JSON object’s value for the user_permissions set to
null if there’s nothing in the m:n intermediate table for the user
in question.

This works well. However, what I seem to be not allowed to do is
(without the extra COALESCE, to simplify):

jsonb_build_object('opening_times',
    jsonb_agg(DISTINCT jsonb_build_object(
	'weekday', cot.weekday,
	'from_hour', cot.from_hour,
	'to_hour', cot.to_hour)
    ORDER BY cot.weekday, cot.from_hour, cot.to_hour)
)

This is because, when I use DISTINCT (but only then‽), the ORDER BY
arguments must be… arguments to the function, or something. In the
above case, the sort key is ap.name which is also the argument to
the jsonb_agg function, so no problem there, but here, the jsonb_agg
argument is the return value of a function so… it has no name.

What I’m looking for is something like…

jsonb_build_object('opening_times',
    jsonb_agg(DISTINCT jsonb_build_object(
	'weekday', cot.weekday,
	'from_hour', cot.from_hour,
	'to_hour', cot.to_hour) AS jbo
    ORDER BY jbo->>'weekday', jbo->>'from_hour', jbo->>'to_hour')
)

… except I cannot define aliases in that place. Any other syntax
would also work.

The suggested solution for this is apparently to do…

CREATE OR REPLACE FUNCTION core_openingtime_jsonb_sort(JSONB)
    RETURNS JSONB AS $$
SELECT jsonb_agg(e ORDER BY e->>'weekday', e->>'from_hour', e->>'to_hour')
    FROM jsonb_array_elements($1) AS e
$$ LANGUAGE SQL IMMUTABLE;

… and then query as…

jsonb_build_object('opening_times',
    core_openingtime_jsonb_sort(jsonb_agg(DISTINCT jsonb_build_object(
	'weekday', cot.weekday,
	'from_hour', cot.from_hour,
	'to_hour', cot.to_hour)))
)

… which involves internally subquerying for each output row (i.e.
row of the user table) times amount of sub-JSONArrays that need
to be sorted like this, which is currently 3.

All other solutions I can find involve subqueries in the first
place; I am somewhat proud I even managed to write this with
JOINs and without any subqueries in the first place so I’m
hesitant to go that route.

Any advice here?

And, somewhat related: when outputting JSONB, the order of
JSONObject elements is indeterminate, which is… okay, but
forcing sorted (ASCIIbetically, i.e. by codepoint order)
keys would be very very welcome, for reproducibility of
the output. (I’m sure the reproducible-builds project would
also love if this could be changed, or at least added, in a
way it can be enabled for queries, as session parameter
perhaps?)

The query in its current incarnanation is as follows:

-- -----BEGIN SQL-----
CREATE OR REPLACE FUNCTION core_openingtime_jsonb_sort(JSONB)
    RETURNS JSONB AS $$
SELECT jsonb_agg(e ORDER BY e->>'weekday', e->>'from_hour', e->>'to_hour')
    FROM jsonb_array_elements($1) AS e
$$ LANGUAGE SQL IMMUTABLE;

CREATE OR REPLACE FUNCTION core_generalworkavailability_jsonb_sort(JSONB)
    RETURNS JSONB AS $$
SELECT jsonb_agg(e ORDER BY e->>'weekday', e->>'forenoon', e->>'afternoon', e->>'evening')
    FROM jsonb_array_elements($1) AS e
$$ LANGUAGE SQL IMMUTABLE;

COPY (SELECT cp.email, cp.first_name, cp.last_name, cp.street, cp.number,
	jsonb_build_object('groups', COALESCE(jsonb_agg(DISTINCT ag.name ORDER BY ag.name)
	    FILTER (WHERE ag.id IS NOT NULL))) ||
	jsonb_build_object('help_operations', COALESCE(jsonb_agg(DISTINCT cho.name ORDER BY cho.name)
	    FILTER (WHERE cho.id IS NOT NULL))) ||
	jsonb_build_object('emergency_opening_times', COALESCE(
	    core_openingtime_jsonb_sort(jsonb_agg(DISTINCT jsonb_build_object(
		'weekday', ceot.weekday,
		'from_hour', ceot.from_hour,
		'to_hour', ceot.to_hour))
	    FILTER (WHERE ceot.id IS NOT NULL)))) ||
	jsonb_build_object('opening_times', COALESCE(
	    core_openingtime_jsonb_sort(jsonb_agg(DISTINCT jsonb_build_object(
		'weekday', cot.weekday,
		'from_hour', cot.from_hour,
		'to_hour', cot.to_hour))
	    FILTER (WHERE cot.id IS NOT NULL)))) ||
	jsonb_build_object('possible_work_times', COALESCE(
	    core_generalworkavailability_jsonb_sort(jsonb_agg(DISTINCT jsonb_build_object(
		'weekday', cgwa.weekday,
		'forenoon', cgwa.forenoon,
		'afternoon', cgwa.afternoon,
		'evening', cgwa.evening))
	    FILTER (WHERE cgwa.id IS NOT NULL)))) ||
	jsonb_build_object('qualifications',
	 jsonb_build_object('administrative', COALESCE(jsonb_agg(DISTINCT cqa.name ORDER BY cqa.name)
	    FILTER (WHERE cqa.id IS NOT NULL))) ||
	 jsonb_build_object('health', COALESCE(jsonb_agg(DISTINCT cqh.name ORDER BY cqh.name)
	    FILTER (WHERE cqh.id IS NOT NULL))) ||
	 jsonb_build_object('language', COALESCE(jsonb_agg(DISTINCT cqlang.name ORDER BY cqlang.name)
	    FILTER (WHERE cqlang.id IS NOT NULL))) ||
	 jsonb_build_object('license', COALESCE(jsonb_agg(DISTINCT cqlic.name ORDER BY cqlic.name)
	    FILTER (WHERE cqlic.id IS NOT NULL))) ||
	 jsonb_build_object('technical', COALESCE(jsonb_agg(DISTINCT cqt.name ORDER BY cqt.name)
	    FILTER (WHERE cqt.id IS NOT NULL)))) ||
	jsonb_build_object('restrictions', COALESCE(jsonb_agg(DISTINCT cr.name ORDER BY cr.name)
	    FILTER (WHERE cr.id IS NOT NULL))) ||
	jsonb_build_object('user_permissions', COALESCE(jsonb_agg(DISTINCT ap.name ORDER BY ap.name)
	    FILTER (WHERE ap.id IS NOT NULL))) AS "other_data"
    FROM core_person cp
	LEFT JOIN core_person_emergency_opening_times cpeot ON cpeot.person_id=cp.id
	LEFT JOIN core_openingtime ceot ON ceot.id=cpeot.openingtime_id
	LEFT JOIN core_person_groups cpg ON cpg.person_id=cp.id
	LEFT JOIN auth_group ag ON ag.id=cpg.group_id
	LEFT JOIN core_person_help_operations cpho ON cpho.person_id=cp.id
	LEFT JOIN core_helpoperation cho ON cho.id=cpho.helpoperation_id
	LEFT JOIN core_person_opening_times cpot ON cpot.person_id=cp.id
	LEFT JOIN core_openingtime cot ON cot.id=cpot.openingtime_id
	LEFT JOIN core_person_possible_work_times cppwt ON cppwt.person_id=cp.id
	LEFT JOIN core_generalworkavailability cgwa ON cgwa.id=cppwt.generalworkavailability_id
	LEFT JOIN core_person_qualifications_administrative cpqa ON cpqa.person_id=cp.id
	LEFT JOIN core_qualificationadministrative cqa ON cqa.id=cpqa.qualificationadministrative_id
	LEFT JOIN core_person_qualifications_health cpqh ON cpqh.person_id=cp.id
	LEFT JOIN core_qualificationhealth cqh ON cqh.id=cpqh.qualificationhealth_id
	LEFT JOIN core_person_qualifications_language cpqlang ON cpqlang.person_id=cp.id
	LEFT JOIN core_qualificationlanguage cqlang ON cqlang.id=cpqlang.qualificationlanguage_id
	LEFT JOIN core_person_qualifications_license cpqlic ON cpqlic.person_id=cp.id
	LEFT JOIN core_qualificationlicense cqlic ON cqlic.id=cpqlic.qualificationlicense_id
	LEFT JOIN core_person_qualifications_technical cpqt ON cpqt.person_id=cp.id
	LEFT JOIN core_qualificationtechnical cqt ON cqt.id=cpqt.qualificationtechnical_id
	LEFT JOIN core_person_restrictions cpr ON cpr.person_id=cp.id
	LEFT JOIN core_restriction cr ON cr.id=cpr.restriction_id
	LEFT JOIN core_person_user_permissions cpup ON cpup.person_id=cp.id
	LEFT JOIN auth_permission ap ON ap.id=cpup.permission_id
    GROUP BY cp.email, cp.first_name, cp.last_name, cp.street, cp.number
    ORDER BY cp.email
) TO STDOUT WITH (FORMAT csv, HEADER, FORCE_QUOTE *, ENCODING 'UTF-8');

DROP FUNCTION core_openingtime_jsonb_sort(JSONB);
DROP FUNCTION core_generalworkavailability_jsonb_sort(JSONB);
-- -----END SQL-----

Thanks in advance,
//mirabilos
-- 
„Cool, /usr/share/doc/mksh/examples/uhr.gz ist ja ein Grund,
mksh auf jedem System zu installieren.“
	-- XTaran auf der OpenRheinRuhr, ganz begeistert
(EN: “[…]uhr.gz is a reason to install mksh on every system.”)






[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]

  Powered by Linux