I found a bug! The impact of it is that the releng primary arch 'compose' fedmsg messages were being considered invalid by our consuming services (ircbot, etc.) but the s390 compose messages were being let through.

First, here's the patch, then the explanation:

diff --git a/inventory/host_vars/branched-composer.phx2.fedoraproject.org b/inventory/host_vars/branched-composer.phx2.fedoraproject.org
index bfe9b94..a5d3514 100644
--- a/inventory/host_vars/branched-composer.phx2.fedoraproject.org
+++ b/inventory/host_vars/branched-composer.phx2.fedoraproject.org
@@ -7,12 +7,3 @@ volgroup: /dev/vg_bvirthost08
 kojipkgs_url: kojipkgs.fedoraproject.org
 kojihub_url: koji.fedoraproject.org/kojihub
 kojihub_scheme: https
-
-# These are consumed by a task in roles/fedmsg/base/main.yml
-fedmsg_certs:
-- service: shell
-  owner: root
-  group: root
-- service: bodhi
-  owner: root
-  group: masher
diff --git a/inventory/host_vars/rawhide-composer.phx2.fedoraproject.org b/inventory/host_vars/rawhide-composer.phx2.fedoraproject.org
index a0d17a6..9cb3409 100644
--- a/inventory/host_vars/rawhide-composer.phx2.fedoraproject.org
+++ b/inventory/host_vars/rawhide-composer.phx2.fedoraproject.org
@@ -6,12 +6,3 @@ volgroup: /dev/vg_bvirthost06
 kojipkgs_url: kojipkgs.fedoraproject.org
 kojihub_url: koji.fedoraproject.org/kojihub
 kojihub_scheme: https
-
-# These are consumed by a task in roles/fedmsg/base/main.yml
-fedmsg_certs:
-- service: shell
-  owner: root
-  group: root
-- service: bodhi
-  owner: root
-  group: masher

It is just *removing* lines from the host_vars files for rawhide-composer and branched-composer. But why?

First, those fedmsg_certs vars are already defined at the group_vars level here:

https://infrastructure.fedoraproject.org/cgit/ansible.git/tree/inventory/group_vars/composers#n29

That more fully defined and correct statement at the group level was being overridden by the less fully defined statements at the host level (see ansible var precedence rules).
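To illustrate the override, here is a minimal Python sketch of the precedence behaviour, not the real ansible code. It assumes the default behaviour, where a host-level variable replaces the group-level variable of the same name wholesale rather than being merged with it. The `can_send` key, the topic name, and both helper functions are hypothetical placeholders; the actual group-level definition is at the URL above.

```python
# Sketch only: models "host_vars beat group_vars, no merging".
# The 'can_send' key and the topic string are illustrative assumptions.

GROUP_VARS = {
    "fedmsg_certs": [
        {"service": "shell", "owner": "root", "group": "root"},
        {
            "service": "bodhi",
            "owner": "root",
            "group": "masher",
            "can_send": ["example.compose.topic"],  # placeholder topic
        },
    ],
}

# The stale host-level definition removed by the patch: same services,
# but no declaration of which topics the host may send.
HOST_VARS = {
    "fedmsg_certs": [
        {"service": "shell", "owner": "root", "group": "root"},
        {"service": "bodhi", "owner": "root", "group": "masher"},
    ],
}


def resolve_vars(group_vars, host_vars):
    """Return the effective vars: host-level keys replace group-level keys."""
    resolved = dict(group_vars)
    resolved.update(host_vars)  # whole value replaced, lists are not merged
    return resolved


def declared_topics(effective_vars):
    """Collect every topic the host declares it can send."""
    topics = set()
    for cert in effective_vars.get("fedmsg_certs", []):
        topics.update(cert.get("can_send", []))
    return sorted(topics)


before_patch = declared_topics(resolve_vars(GROUP_VARS, HOST_VARS))
after_patch = declared_topics(resolve_vars(GROUP_VARS, {}))
print("before patch:", before_patch)
print("after patch: ", after_patch)
```

With the host-level block present, the host ends up declaring no topics at all; with it removed, the group-level declaration applies.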
Everything appeared to be working normally while this was the case because, since no hosts declared that they could send those fedmsg topics, there was no explicit check for who could send them. It didn't matter until we added the s390 koji hub a week or so ago, which is allowed to broadcast those same topics. It had its fedmsg_certs correctly defined, and since it declared that it could send those topics (and no other hosts made the same declaration), the primary compose messages suddenly started being considered invalid (unauthorized).

By removing these old crufty definitions at the host level and letting the correct definition at the group level prevail, all those hosts should show up correctly in the fedmsg authz policy and things should start working again.

This will require a master playbook run on just the fedmsgdconfig tag to push out, making this a "high touch" change to do during freeze, but I'm quite certain it is correct.

Can I get two +1s?

-Ralph