Re: Last dbus upgrade issues

Tomasz Kłoczko <kloczko.tomasz@xxxxxxxxx> · Mon, 26 Nov 2018 23:49:18 +0100

On Mon, 26 Nov 2018 at 20:04, Owen Taylor <otaylor@xxxxxxxxxx> wrote:
> On Mon, Nov 26, 2018 at 9:59 AM Tomasz Kłoczko <kloczko.tomasz@xxxxxxxxx> wrote:
> > 1) if the dbus service crashes or is temporarily unavailable, IMO it
> > would be good to prepare all desktop app code not to crash but to
> > handle the dbus disconnection and maybe display a centered message that
> > it is not possible to connect to dbus. Crashing everything looks really *bad*.
>
> The design of D-Bus is that it is the central point of lifecycle
> management for your desktop - when the session bus goes away,
> everything on it should go away. Trying to make things survive bus
> restart compromises that, and makes it more likely you'll get stray
> processes after exit. Now with systemd user sessions, cgroups, etc, we
> have other mechanisms for session cleanup, but removing the older
> concept of exit-with-the-bus would require major rethinking.

Cleaning up the session is a perfect example of what can be done with
Solaris contracts, which are a kind of grouping attribute for a set of
processes that needs to be treated as a group. When a session needs to
be closed, all processes that are part of the contract can be killed by
contract ID (which works like a task ID or PID).
The lack of contracts is IMO another cause of current systemd complexity
(Solaris SMF, for example, kills all of a service's processes not by
looking up every PID but simply by killing all processes with the same ctid).
Is there any other purpose of dbus communication? (My question is
more about the original design than about what it is now.)
I'm asking because I do not fully understand why dbus was designed, and
to solve the current design issue, the original reason for designing
such a communication mechanism needs to be well understood.
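
To illustrate the "kill by group ID instead of collecting PIDs" idea
above, here is a minimal sketch. It is not the Solaris contract API: it
uses a plain POSIX process group as a rough stand-in for a ctid, and the
helper names spawn_group and kill_group are made up for this example.

```python
import os
import signal
import subprocess
import sys


def spawn_group(count):
    """Start `count` sleeping children that all share one process group."""
    cmd = [sys.executable, "-c", "import time; time.sleep(60)"]
    # The first child becomes the leader of a new process group.
    leader = subprocess.Popen(cmd, preexec_fn=lambda: os.setpgid(0, 0))
    pgid = leader.pid
    # The remaining children join the leader's group before they exec.
    others = [
        subprocess.Popen(cmd, preexec_fn=lambda: os.setpgid(0, pgid))
        for _ in range(count - 1)
    ]
    return pgid, [leader] + others


def kill_group(pgid, procs, sig=signal.SIGTERM):
    """Signal the whole group by its ID, then reap every member."""
    os.killpg(pgid, sig)  # one call, no per-PID bookkeeping
    return [p.wait() for p in procs]


if __name__ == "__main__":
    pgid, procs = spawn_group(3)
    print(kill_group(pgid, procs))
```

The caller only has to remember one group ID. A process group is weaker
than a contract, though, because a member can leave it by calling
setpgid() or setsid() itself.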

> Plus, when the connection to the bus is lost, all assumptions that the
> app has about the state of the system are invalid. (Normally, with
> D-Bus you reliably get messages, or you reliably get notification when
> your communication partner goes away.) So the app state needs to be
> reinitialized from scratch. There's significant code complexity there
> which will be exercised in only very rare circumstance.

It looks like contracts could again solve such scenarios, by sending a
normal signal (like HUP) to PID 1, masked so that it is received by all
processes with the same ctid. No new communication or protocols ..
just minute modifications of well-known signal handlers.
https://docs.oracle.com/cd/E18752_01/html/816-5174/contract-4.html#REFMAN4contract-4
This idea has already been in use on live systems for more than a decade.
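
On the receiving side, the "minute modification" of a signal handler
could look roughly like the sketch below. This is only an assumption
about how an app might react to HUP; the AppState class and its
request_reload/poll methods are hypothetical names, and nothing here is
specific to contracts or to dbus.

```python
import os
import signal


class AppState:
    """Toy application state that is rebuilt from scratch on SIGHUP."""

    def __init__(self):
        self.generation = 0
        self.reload_requested = False
        self.cache = {}

    def request_reload(self, signum, frame):
        # Keep the handler tiny: only record that a reload is wanted.
        self.reload_requested = True

    def poll(self):
        # Called from the main loop; do the real work outside the handler.
        if self.reload_requested:
            self.reload_requested = False
            self.cache.clear()    # drop every assumption about the system
            self.generation += 1  # count how often the state was rebuilt
        return self.generation


state = AppState()
signal.signal(signal.SIGHUP, state.request_reload)
```

Sending HUP to the process (for example with os.kill(os.getpid(),
signal.SIGHUP)) and then calling state.poll() from the main loop clears
the cached state and bumps the generation counter.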

kloczek
_______________________________________________
devel mailing list -- devel@xxxxxxxxxxxxxxxxxxxxxxx
List Archives: https://lists.fedoraproject.org/archives/list/devel@xxxxxxxxxxxxxxxxxxxxxxx