From: Daniel Mack <daniel@xxxxxxxxxx> kdbus is a system for low-latency, low-overhead, easy to use interprocess communication (IPC). The interface to all functions in this driver is implemented through ioctls on /dev nodes. This patch adds detailed documentation about the kernel level API design. Signed-off-by: Daniel Mack <daniel@xxxxxxxxxx> Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> --- Documentation/kdbus.txt | 1815 +++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 1815 insertions(+) create mode 100644 Documentation/kdbus.txt diff --git a/Documentation/kdbus.txt b/Documentation/kdbus.txt new file mode 100644 index 000000000000..ac1a18908976 --- /dev/null +++ b/Documentation/kdbus.txt @@ -0,0 +1,1815 @@ +D-Bus is a system for powerful, easy to use interprocess communication (IPC). + +The focus of this document is an overview of the low-level, native kernel D-Bus +transport called kdbus. Kdbus in the kernel acts similarly to a device driver; +all communication between processes takes place over special character device +nodes in /dev/kdbus/. + +For the general D-Bus protocol specification, the payload format, the +marshaling, and the communication semantics, please refer to: + http://dbus.freedesktop.org/doc/dbus-specification.html + +For a kdbus specific userspace library implementation please refer to: + http://cgit.freedesktop.org/systemd/systemd/tree/src/systemd/sd-bus.h + +Articles about D-Bus and kdbus: + http://lwn.net/Articles/580194/ + + +1. Terminology +=============================================================================== + + Domain: + A domain is a named object containing a number of buses. A system + container that contains its own init system and users usually also + runs in its own kdbus domain. The /dev/kdbus/domain/<container-name>/ + directory shows up inside the domain as /dev/kdbus/. Every domain offers + its own "control" device node to create new buses or new sub-domains.
+ Domains have no connection to each other and can neither see nor talk to + each other. See section 5 for more details. + + Bus: + A bus is a named object inside a domain. Clients exchange messages + over a bus. Multiple buses themselves have no connection to each other; + messages can only be exchanged on the same bus. The default entry point to + a bus, to which clients establish their connection, is the "bus" device node + /dev/kdbus/<bus name>/bus. + Common operating system setups create one "system bus" per system, and one + "user bus" for every logged-in user. Applications or services may create + their own private named buses. See section 5 for more details. + + Endpoint: + An endpoint provides the device node to talk to a bus. Opening an + endpoint creates a new connection to the bus to which the endpoint belongs. + Every bus has a default endpoint called "bus". + A bus can optionally offer additional endpoints with custom names to + provide restricted access to the same bus. Custom endpoints carry + additional policy which can be used to give sandboxed processes only + locked-down, limited, filtered access to the same bus. + See section 5 for more details. + + Connection: + A connection to a bus is created by opening an endpoint device node of + a bus and becoming an active client with the HELLO exchange. Every + connected client connection has a unique identifier on the bus and can + address messages to every other connection on the same bus by using + the peer's connection id as the destination. + See section 6 for more details. + + Pool: + Each connection allocates a piece of shmem-backed memory that is used + to receive messages and answers to ioctl commands from the kernel. It is + never used to send anything to the kernel. In order to access that memory, + userspace must mmap() it into its task. + See section 12 for more details.
+ + Well-known Name: + A connection can, in addition to its implicit unique connection id, request + the ownership of a textual well-known name. Well-known names are noted in + reverse-domain notation, such as com.example.service1. Connections offering + a service on a bus are usually reached by their well-known names. A + connection id relates to a well-known name much as an IP address relates to + a DNS name associated with that address. + + Message: + Connections can exchange messages with other connections by addressing + the peers with their connection id or well-known name. A message consists + of a message header with kernel-specific information on how to route the + message, and the message payload, which is a logical byte stream of + arbitrary size. Messages can carry additional file descriptors to be passed + from one connection to another. Every connection can specify which set of + metadata the kernel should attach to the message when it is delivered + to the receiving connection. Metadata contains information like: system + timestamps, uid, gid, tid, proc-starttime, well-known-names, process comm, + process exe, process argv, cgroup, capabilities, seclabel, audit session, + loginuid and the connection's human-readable name. + See sections 7 and 13 for more details. + + Item: + The API of kdbus implements a notion of items, submitted through and + returned by most ioctls, and stored inside data structures in the + connection's pool. See section 4 for more details. + + Broadcast and Match: + Broadcast messages are potentially sent to all connections of a bus. By + default, the connections will not actually receive any of the sent + broadcast messages; a broadcast message passes this filter only after + the connection has installed a match for specific message properties. + See section 10 for more details. + + Policy: + A policy is a set of rules that define which connections can see, talk to, + or register a well-known name on the bus.
A policy is attached to buses and + custom endpoints, and modified by policy holder connection or owners of + custom endpoints. See section 11 for more details. + + Access rules to allow who can see a name on the bus are only checked on + custom endpoints. Policies may be defined with names that end with '.*'. + When matching a well-known name against such a wildcard entry, the last + part of the name is ignored and checked against the wildcard name without + the trailing '.*'. See section 11 for more details. + + Privileged bus users: + A user connecting to the bus is considered privileged if it is either the + creator of the bus, or if it has the CAP_IPC_OWNER capability flag set. + + +2. Device Node Layout +=============================================================================== + +The kdbus interface is exposed through device nodes in /dev. + + /sys/bus/kdbus + `-- devices + |-- kdbus!0-system!bus -> ../../../devices/virtual/kdbus/kdbus!0-system!bus + |-- kdbus!2702-user!bus -> ../../../devices/virtual/kdbus/kdbus!2702-user!bus + |-- kdbus!2702-user!ep.app -> ../../../devices/virtual/kdbus/kdbus!2702-user!ep.app + `-- kdbus!control -> ../../../devices/kdbus!control + + /dev/kdbus + |-- control + |-- 0-system + | |-- bus + | `-- ep.apache + |-- 1000-user + | `-- bus + |-- 2702-user + | |-- bus + | `-- ep.app + `-- domain + |-- fedoracontainer + | |-- control + | |-- 0-system + | | `-- bus + | `-- 1000-user + | `-- bus + `-- mydebiancontainer + |-- control + `-- 0-system + `-- bus + +Note: + The device node subdirectory layout is arranged that a future version of + kdbus could be implemented as a file system with a separate instance mounted + for each domain. For any future changes, this always needs to be kept + in mind. Also the dependency on udev's userspace hookups or sysfs attribute + use should be limited to the absolute minimum for the same reason. + + +3. 
Data Structures and flags +=============================================================================== + +3.1 Data structures and interconnections +---------------------------------------- + + +-------------------------------------------------------------------------+ + | Domain (Init Domain) | + | /dev/kdbus/control | + | +---------------------------------------------------------------------+ | + | | Bus (System Bus) | | + | | /dev/kdbus/0-system/ | | + | | +-------------------------------+ +-------------------------------+ | | + | | | Endpoint | | Endpoint | | | + | | | /dev/kdbus/0-system/bus | | /dev/kdbus/0-system/ep.app | | | + | | +-------------------------------+ +-------------------------------+ | | + | | +--------------+ +--------------+ +--------------+ +--------------+ | | + | | | Connection | | Connection | | Connection | | Connection | | | + | | | :1.22 | | :1.25 | | :1.55 | | :1.81 | | | + | | +--------------+ +--------------+ +--------------+ +--------------+ | | + | +---------------------------------------------------------------------+ | + | | + | +---------------------------------------------------------------------+ | + | | Bus (User Bus for UID 2702) | | + | | /dev/kdbus/2702-user/ | | + | | +-------------------------------+ +-------------------------------+ | | + | | | Endpoint | | Endpoint | | | + | | | /dev/kdbus/2702-user/bus | | /dev/kdbus/2702-user/ep.app | | | + | | +-------------------------------+ +-------------------------------+ | | + | | +--------------+ +--------------+ +--------------+ +--------------+ | | + | | | Connection | | Connection | | Connection | | Connection | | | + | | | :1.22 | | :1.25 | | :1.55 | | :1.81 | | | + | | +--------------+ +--------------+ +-------------------------------+ | | + | +---------------------------------------------------------------------+ | + | | + | +---------------------------------------------------------------------+ | + | | Domain (Container; inside it, fedoracontainer/ becomes 
/dev/kdbus/) | | + | | /dev/kdbus/domain/fedoracontainer/control | | + | | +-----------------------------------------------------------------+ | | + | | | Bus (System Bus of "fedoracontainer") | | | + | | | /dev/kdbus/domain/fedoracontainer/0-system/ | | | + | | | +-----------------------------+ | | | + | | | | Endpoint | | | | + | | | | /dev/.../0-system/bus | | | | + | | | +-----------------------------+ | | | + | | | +-------------+ +-------------+ | | | + | | | | Connection | | Connection | | | | + | | | | :1.22 | | :1.25 | | | | + | | | +-------------+ +-------------+ | | | + | | +-----------------------------------------------------------------+ | | + | | | | + | | +-----------------------------------------------------------------+ | | + | | | Bus (User Bus for UID 2702 of "fedoracontainer") | | | + | | | /dev/kdbus/domain/fedoracontainer/2702-user/ | | | + | | | +-----------------------------+ | | | + | | | | Endpoint | | | | + | | | | /dev/.../2702-user/bus | | | | + | | | +-----------------------------+ | | | + | | | +-------------+ +-------------+ | | | + | | | | Connection | | Connection | | | | + | | | | :1.22 | | :1.25 | | | | + | | | +-------------+ +-------------+ | | | + | | +-----------------------------------------------------------------+ | | + | +---------------------------------------------------------------------+ | + +-------------------------------------------------------------------------+ + +The above description uses the D-Bus notation of unique connection names that +adds a ":1." prefix to the connection's unique ID. kdbus itself doesn't +use that notation, neither internally nor externally. However, libraries and +other userspace code that aims for compatibility with D-Bus might. + +3.2 Flags +--------- + +All ioctls used in the communication with the driver contain two 64-bit fields, +'flags' and 'kernel_flags'.
In 'flags', the behavior of the command can be +tweaked, whereas in 'kernel_flags', the kernel driver writes back the mask of +supported bits upon each call, and sets the KDBUS_FLAGS_KERNEL bit. This is a +way to probe possible kernel features and make code forward and backward +compatible. + +All bits that are not recognized by the kernel in 'flags' are rejected, and the +ioctl fails with -EINVAL. + + +4. Items +=============================================================================== + +To flexibly augment transport structures used by kdbus, data blobs of type +struct kdbus_item are used. An item has a fixed-sized header that only stores +the type of the item and the overall size. The total size is variable: in some +cases it is defined by the item type, in other cases the item can be of +arbitrary length (for instance, a string). + +In the external kernel API, items are used for many ioctls to transport +optional information from userspace to kernelspace. They are also used for +information stored in a connection's pool, such as messages, name lists or +requested connection information. + +Wherever items are used as part of the kdbus kernel API, they are embedded in +structs that have an overall size of their own, so there can be many of them. + +The kernel expects all items to be aligned to 8-byte boundaries. + +A simple iterator in userspace would iterate over the items until it has +reached the embedding structure's overall size. An example implementation +of such an iterator can be found in tools/testing/selftests/kdbus/kdbus-util.h. + + +5. Creation of new domains, buses and endpoints +=============================================================================== + +The initial kdbus domain is unconditionally created by the kernel module. A +domain contains a "control" device node which allows the creation of a new bus +or domain. New domains do not have any buses created by default.
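The item iteration described in section 4 can be sketched as follows. This is a minimal illustration under stated assumptions, not the kdbus UAPI and not the selftest iterator: `struct demo_item`, `DEMO_ALIGN8` and `demo_count_items()` are hypothetical names, and the item header is assumed to consist of two 64-bit fields (size, then type) followed directly by the payload.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct kdbus_item: fixed header, then payload. */
struct demo_item {
	uint64_t size;   /* header plus payload, without trailing padding */
	uint64_t type;
	uint8_t data[];
};

/* Round up to the next multiple of 8; items start on 8-byte boundaries. */
#define DEMO_ALIGN8(v) (((uint64_t)(v) + 7) & ~(uint64_t)7)

/*
 * Walk the items stored in buf[0..total) and count those of the given type.
 * Returns -1 if an item header is truncated or claims an impossible size.
 */
int demo_count_items(const void *buf, uint64_t total, uint64_t type)
{
	const uint8_t *p = buf;
	uint64_t off = 0;
	int n = 0;

	/* The walk relies on the buffer itself being 8-byte aligned. */
	assert(((uintptr_t)buf & 7) == 0);

	while (off < total) {
		const struct demo_item *it;

		if (total - off < sizeof(struct demo_item))
			return -1;
		it = (const struct demo_item *)(p + off);
		if (it->size < sizeof(struct demo_item) || it->size > total - off)
			return -1;
		if (it->type == type)
			n++;
		/* Skip the item and its padding to reach the next header. */
		off += DEMO_ALIGN8(it->size);
	}
	return n;
}
```

The size checks are there because 'total' and each item's size come from a buffer the caller may not fully trust; a walker that skips them could read past the embedding structure.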
+ + +5.1 Domains and buses +--------------------- + +Opening the control device node returns a file descriptor that accepts the +ioctls KDBUS_CMD_BUS_MAKE and KDBUS_CMD_DOMAIN_MAKE, which specify the name of +the new bus or domain to create. The control file descriptor needs to be kept +open for the entire lifetime of the created bus or domain; closing it will +immediately clean up the entire bus or domain and all its associated +resources and connections. Every control file descriptor can only be used once +to create a new bus or domain; from that point, it is not used for any +further communication until the final close(). + +Each bus will generate a random, 128-bit UUID upon creation. It will be +returned to the creators of connections through kdbus_cmd_hello.id128 and can +be used by userspace to uniquely identify buses, even across different machines +or containers. The UUID will have its variant bits set to 'DCE', and denote +version 4 (random). + +When a new domain is created, its structure in /dev/kdbus/domain/<name>/ is a +replication of what's initially created in /dev/kdbus. In fact, internally, +a dummy default domain is set up when the driver is loaded. This allows +userspace to bind-mount domain subtrees of /dev/kdbus into a container's +filesystem view, and hence achieve complete isolation from the host's domain +and those of other containers. + + +5.2 Endpoints +------------- + +Endpoints are entry points to a bus. By default, each bus has a default +endpoint called 'bus'. The bus owner has the ability to create custom +endpoints with specific names, permissions, and policy databases (see below). + +To create a custom endpoint, use the KDBUS_CMD_ENDPOINT_MAKE ioctl with struct +kdbus_cmd_make. Custom endpoints always have a policy db that, by default, +does not allow anything. Everything that users of this new endpoint should be +able to do has to be explicitly specified through KDBUS_ITEM_NAME and +KDBUS_ITEM_POLICY_ACCESS items.
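Section 5.1 states that the bus UUID carries the 'DCE' variant and version 4. A userspace consumer could sanity-check the 16 bytes returned in kdbus_cmd_hello.id128 roughly as sketched below. The helper name `demo_is_random_dce_uuid()` is hypothetical, and the sketch assumes the bytes follow the RFC 4122 layout (version in the high nibble of byte 6, variant in the top two bits of byte 8), which this document does not spell out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if id128 looks like a random (version 4) UUID
 * with the DCE variant, assuming the RFC 4122 byte layout.
 */
bool demo_is_random_dce_uuid(const uint8_t id128[16])
{
	bool version4, dce;

	assert(id128 != NULL);

	version4 = (id128[6] & 0xf0) == 0x40;  /* high nibble of byte 6 is 4 */
	dce      = (id128[8] & 0xc0) == 0x80;  /* top two bits of byte 8 are 10 */

	return version4 && dce;
}
```

Such a check only verifies the format bits; it says nothing about which bus the UUID belongs to.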
+ +5.3 Creating domains, buses and endpoints +----------------------------------------- + +KDBUS_CMD_BUS_MAKE, KDBUS_CMD_DOMAIN_MAKE and KDBUS_CMD_ENDPOINT_MAKE take a +struct kdbus_cmd_make argument. + +struct kdbus_cmd_make { + __u64 size; + The overall size of the struct, including its items. + + __u64 flags; + The flags for creation. + + KDBUS_MAKE_ACCESS_GROUP + Make the device node group-accessible + + KDBUS_MAKE_ACCESS_WORLD + Make the device node world-accessible + + __u64 kernel_flags; + Valid flags for this command, returned by the kernel upon each call. + + struct kdbus_item items[0]; + A list of items, only used for creating custom endpoints. Ignored for + buses and domains. +}; + + +6. Connections +=============================================================================== + + +6.1 Connection IDs and well-known connection names +-------------------------------------------------- + +Connections are identified by their connection id, internally implemented as a +uint64_t counter. The IDs of every newly created bus start at 1, and every new +connection will increment the counter by 1. The ids are not reused. + +In higher level tools, the user visible representation of a connection is +defined by the D-Bus protocol specification as ":1.<id>". + +Messages with a specific uint64_t destination id are directly delivered to +the connection with the corresponding id. Messages with the special destination +id KDBUS_DST_ID_BROADCAST are broadcast messages and are potentially delivered +to all known connections on the bus; clients interested in broadcast messages +need to subscribe to the specific messages they are interested in, though, +before any broadcast message reaches them. + +Messages synthesized and sent directly by the kernel will carry the special +source id KDBUS_SRC_ID_KERNEL (0).
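As a small illustration of the addressing rules in this section, the sketch below classifies a destination id and renders a connection id in the D-Bus ":1.<id>" notation. The `DEMO_*` constants mirror the values this document gives for KDBUS_DST_ID_NAME (0) and KDBUS_DST_ID_BROADCAST (~0ULL); the constant names, the enum and both functions are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Values as stated in this document; the real constants live in the kdbus header. */
#define DEMO_DST_ID_NAME       UINT64_C(0)
#define DEMO_DST_ID_BROADCAST  (~UINT64_C(0))

enum demo_dst_kind { DEMO_DST_NAME, DEMO_DST_BROADCAST, DEMO_DST_UNICAST };

/* Decide how a message with this destination id would be routed. */
enum demo_dst_kind demo_classify_dst(uint64_t dst_id)
{
	if (dst_id == DEMO_DST_ID_NAME)
		return DEMO_DST_NAME;        /* look up a well-known name */
	if (dst_id == DEMO_DST_ID_BROADCAST)
		return DEMO_DST_BROADCAST;   /* potentially every connection */
	return DEMO_DST_UNICAST;             /* one specific connection id */
}

/* Format ":1.<id>" into buf; returns 0 on success, -1 if buf is too small. */
int demo_unique_name(uint64_t id, char *buf, size_t len)
{
	int n;

	assert(buf != NULL);

	n = snprintf(buf, len, ":1.%" PRIu64, id);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

The ":1." prefix is purely a userspace convention for D-Bus compatibility; the kernel interface only ever deals with the numeric id.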
+ +In addition to the unique uint64_t connection id, established connections can +request the ownership of well-known names, under which they can be found and +addressed by other bus clients. A well-known name is associated with one and +only one connection at a time. See section 8 on name acquisition and the +name registry, and the validity of names. + +Messages can specify the special destination id 0 and carry a well-known name +in the message data. Such a message is delivered to the destination connection +which owns that well-known name. + + +-------------------------------------------------------------------------+ + | +---------------+ +---------------------------+ | + | | Connection | | Message | -----------------+ | + | | :1.22 | --> | src: 22 | | | + | | | | dst: 25 | | | + | | | | | | | + | | | | | | | + | | | +---------------------------+ | | + | | | | | + | | | <--------------------------------------+ | | + | +---------------+ | | | + | | | | + | +---------------+ +---------------------------+ | | | + | | Connection | | Message | -----+ | | + | | :1.25 | --> | src: 25 | | | + | | | | dst: 0xffffffffffffffff | -------------+ | | + | | | | (KDBUS_DST_ID_BROADCAST) | | | | + | | | | | ---------+ | | | + | | | +---------------------------+ | | | | + | | | | | | | + | | | <--------------------------------------------------+ | + | +---------------+ | | | + | | | | + | +---------------+ +---------------------------+ | | | + | | Connection | | Message | --+ | | | + | | :1.55 | --> | src: 55 | | | | | + | | | | dst: 0 / org.foo.bar | | | | | + | | | | | | | | | + | | | | | | | | | + | | | +---------------------------+ | | | | + | | | | | | | + | | | <------------------------------------------+ | | + | +---------------+ | | | + | | | | + | +---------------+ | | | + | | Connection | | | | + | | :1.81 | | | | + | | org.foo.bar | | | | + | | | | | | + | | | | | | + | | | <-----------------------------------+ | | + | | | | | + | | | 
<----------------------------------------------+ | + | +---------------+ | + +-------------------------------------------------------------------------+ + + +6.2 Creating connections +------------------------ + +A connection to a bus is created by opening an endpoint device node of +a bus and becoming an active client with the KDBUS_CMD_HELLO ioctl. Every +connected client connection has a unique identifier on the bus and can +address messages to every other connection on the same bus by using +the peer's connection id as the destination. + +The KDBUS_CMD_HELLO ioctl takes the following struct as argument. + +struct kdbus_cmd_hello { + __u64 size; + The overall size of the struct, including all attached items. + + __u64 conn_flags; + Flags to apply to this connection: + + KDBUS_HELLO_ACCEPT_FD + When this flag is set, the connection can be sent file descriptors + as message payload. If it's not set, any attempt of doing so will + result in -ECOMM on the sender's side. + + KDBUS_HELLO_ACTIVATOR + Make this connection an activator (see below). With this bit set, + an item of type KDBUS_ITEM_NAME has to be attached which describes + the well-known name this connection should be an activator for. + + KDBUS_HELLO_POLICY_HOLDER + Make this connection a policy holder (see below). With this bit set, + an item of type KDBUS_ITEM_NAME has to be attached which describes + the well-known name this connection should hold a policy for. + + KDBUS_HELLO_MONITOR + Make this connection an eaves-dropping connection that receives all + unicast messages sent on the bus. To also receive broadcast messages, + the connection has to upload appropriate matches as well. + This flag is only valid for privileged bus connections. + + __u64 attach_flags; + Request the attachment of metadata for each message received by this + connection. The metadata actually attached may actually augment the list + of requested items. See section 13 for more details. 
+ + __u64 bus_flags; + Upon successful completion of the ioctl, this member will contain the + flags of the bus it connected to. + + __u64 id; + Upon successful completion of the ioctl, this member will contain the + id of the new connection. + + __u64 pool_size; + The size of the communication pool, in bytes. The pool can be accessed + by calling mmap() on the file descriptor that was used to issue the + KDBUS_CMD_HELLO ioctl. + + struct kdbus_bloom_parameter bloom; + Bloom filter parameter (see below). + + __u8 id128[16]; + Upon successful completion of the ioctl, this member will contain the + 128-bit wide UUID of the connected bus. + + struct kdbus_item items[0]; + Variable list of items to add optional additional information. The + following items are currently expected/valid: + + KDBUS_ITEM_CONN_NAME + Contains a string that describes this connection's name, so it can be + identified later. + + KDBUS_ITEM_NAME + KDBUS_ITEM_POLICY_ACCESS + For activators and policy holders only, combinations of these two + items describe policy access entries (see section about policy db). + + KDBUS_ITEM_CREDS + KDBUS_ITEM_SECLABEL + Privileged bus users may submit these types in order to create + connections with faked credentials. The only real use case for this + is a proxy service which acts on behalf of some other tasks. For a + connection that runs in that mode, the message's metadata items will + be limited to what's specified here. See section 13 for more + information. + + Items of other types are silently ignored. +}; + + +6.3 Activator and policy holder connection +------------------------------------------ + +An activator connection is a placeholder for a well-known name. Messages sent +to such a connection can be used by userspace to start an implementor +connection, which will then get all the messages from the activator copied +over. An activator connection cannot be used to send any message.
+ +A policy holder connection only installs a policy for one or more names. +These policy entries are kept active as long as the connection is alive, and +are removed once it terminates. Such a policy connection type can be used to +deploy restrictions for names that are not yet active on the bus. A policy +holder connection cannot be used to send any message. + +The creation of activator, policy holder or monitor connections is an operation +restricted to privileged users on the bus (see section "Terminology"). + + +6.4 Retrieving information on a connection +------------------------------------------ + +The KDBUS_CMD_CONN_INFO ioctl can be used to retrieve credentials and +properties of the initial creator of a connection. This ioctl uses the +following struct: + +struct kdbus_cmd_info { + __u64 size; + The overall size of the struct, including the name with its 0-byte string + terminator. + + __u64 flags; + Specify which items should be attached to the answer. + The following flags can be used: + + KDBUS_ATTACH_NAMES + Add an item to the answer containing all the names the connection + currently owns. + + KDBUS_ATTACH_CONN_NAME + Add an item to the answer containing the connection's name. + + After the ioctl returns, this field will contain the current metadata + attach flags of the connection. + + __u64 kernel_flags; + Valid flags for this command, returned by the kernel upon each call. + + __u64 id; + The connection's numerical ID to retrieve information for. If set to + non-zero value, the 'name' field is ignored. + + __u64 offset; + When the ioctl returns, this value will yield the offset of the connection + information inside the caller's pool. + + struct kdbus_item items[0]; + The optional item list, containing the well-known name to look up as + a KDBUS_ITEM_NAME. Only required if the 'id' field is set to 0. + All other items are currently ignored. +}; + +After the ioctl returns, the following struct will be stored in the caller's +pool at 'offset'. 
+ +struct kdbus_info { + __u64 size; + The overall size of the struct, including all its items. + + __u64 id; + The connection's unique ID. + + __u64 flags; + The connection's flags as specified when it was created. + + __u64 kernel_flags; + Valid flags for this command, returned by the kernel upon each call. + + struct kdbus_item items[0]; + Depending on the 'flags' field in struct kdbus_cmd_info, items of + types KDBUS_ITEM_NAME and KDBUS_ITEM_CONN_NAME are followed here. +}; + +Once the caller is finished with parsing the return buffer, it needs to call +KDBUS_CMD_FREE for the offset. + + +6.5 Getting information about a connection's bus creator +-------------------------------------------------------- + +The KDBUS_CMD_BUS_CREATOR_INFO ioctl takes the same struct as +KDBUS_CMD_CONN_INFO but is used to retrieve information about the creator of +the bus the connection is attached to. The metadata returned by this call is +collected during the creation of the bus and is never altered afterwards, so +it provides pristine information on the task that created the bus, at the +moment when it did so. + +In response to this call, a slice in the connection's pool is allocated and +filled with an object of type struct kdbus_info, pointed to by the ioctl's +'offset' field. + +struct kdbus_info { + __u64 size; + The overall size of the struct, including all its items. + + __u64 id; + The bus' ID + + __u64 flags; + The bus' flags as specified when it was created. + + __u64 kernel_flags; + Valid flags for this command, returned by the kernel upon each call. + + struct kdbus_item items[0]; + Metadata information is stored in items here. +}; + +Once the caller is finished with parsing the return buffer, it needs to call +KDBUS_CMD_FREE for the offset. + + +6.6 Updating connection details +------------------------------- + +Some of a connection's details can be updated with the KDBUS_CMD_CONN_UPDATE +ioctl, using the file descriptor that was used to create the connection. 
+The update command uses the following struct. + +struct kdbus_cmd_update { + __u64 size; + The overall size of the struct, including all its items. + + struct kdbus_item items[0]; + Items to describe the connection details to be updated. The following item + types are supported: + + KDBUS_ITEM_ATTACH_FLAGS + Supply a new set of items to be attached to each message. + + KDBUS_ITEM_NAME + KDBUS_ITEM_POLICY_ACCESS + Policy holder connections may supply a new set of policy information + with these items. For other connection types, -EOPNOTSUPP is returned. +}; + + +6.7 Termination +--------------- + +A connection can be terminated by simply closing the file descriptor that was +used to start the connection. All pending incoming messages will be discarded, +and the memory in the pool will be freed. + +An alternative way of closing down a connection is calling the +KDBUS_CMD_BYEBYE ioctl on it, which will only succeed if the message queue +of the connection is empty at the time of closing. Otherwise, -EBUSY is +returned. + +When this ioctl returns successfully, the connection has been terminated and +won't accept any new messages from remote peers. This way, a connection can +be terminated race-free, without losing any messages. + + +7. Messages +=============================================================================== + +Messages consist of a fixed-size header followed directly by a list of +variable-sized data 'items'. The overall message size is specified in the +header of the message. The chain of data items can contain well-defined +message metadata fields, raw data, references to data, or file descriptors. + + +7.1 Sending messages +-------------------- + +Messages are passed to the kernel with the KDBUS_CMD_MSG_SEND ioctl. Depending +on the destination address of the message, the kernel delivers the message +to the specific destination connection or to all connections on the same bus. +Sending messages across buses is not possible.
Messages are always queued in +the memory pool of the destination connection (see below). + +The KDBUS_CMD_MSG_SEND ioctl uses struct kdbus_msg to describe the message to +be sent. + +struct kdbus_msg { + __u64 size; + The overall size of the struct, including the attached items. + + __u64 flags; + Flags for message delivery: + + KDBUS_MSG_FLAGS_EXPECT_REPLY + Expect a reply from the remote peer to this message. With this bit set, + the timeout_ns field must be set to a non-zero deadline by which the + receiving peer is expected to reply. If such a reply is not + received in time, the sender will be notified with a timeout message + (see below). The value must be an absolute value, in nanoseconds and + based on CLOCK_MONOTONIC. + + For a message to be accepted as reply, it must be a direct message to + the original sender (not a broadcast), and its kdbus_msg.cookie_reply + must match the previous message's kdbus_msg.cookie. + + Expected replies also temporarily open the policy of the sending + connection, so the other peer is allowed to respond within the given + time window. + + KDBUS_MSG_FLAGS_SYNC_REPLY + By default, all calls to kdbus are considered asynchronous, + non-blocking. However, as there are many use cases that need to wait + for a remote peer to answer a method call, there's a way to send a + message and wait for a reply in a synchronous fashion. This is what + the KDBUS_MSG_FLAGS_SYNC_REPLY flag controls. The KDBUS_CMD_MSG_SEND + ioctl will block until the reply has arrived, the timeout limit is + reached, the remote connection is shut down, or the call is interrupted + by a signal before any reply; see signal(7). + + The offset of the reply message in the sender's pool is stored in + 'offset_reply' when the ioctl has returned without error. Hence, + there is no need for another KDBUS_CMD_MSG_RECV ioctl or anything else + to receive the reply.
+ + KDBUS_MSG_FLAGS_NO_AUTO_START + By default, when a message is sent to an activator connection, the + activator is notified and will start an implementor. This flag inhibits + that behavior. With this bit set, and the remote being an activator, + -EADDRNOTAVAIL is returned from the ioctl. + + __u64 kernel_flags; + Valid flags for this command, returned by the kernel upon each call of + KDBUS_CMD_MSG_SEND. + + __s64 priority; + The priority of this message. Receiving messages (see below) may + optionally be constrained to messages of a minimal priority. This + allows for use cases where timing critical data is interleaved with + control data on the same connection. If unused, the priority should be + set to zero. + + __u64 dst_id; + The numeric ID of the destination connection, or KDBUS_DST_ID_BROADCAST + (~0ULL) to address every peer on the bus, or KDBUS_DST_ID_NAME (0) to look + it up dynamically from the bus' name registry. In the latter case, an item + of type KDBUS_ITEM_DST_NAME is mandatory. + + __u64 src_id; + Upon return of the ioctl, this member will contain the sending + connection's numerical ID. Should be 0 at send time. + + __u64 payload_type; + Type of the payload in the actual data records. Currently, only + KDBUS_PAYLOAD_DBUS is accepted as input value of this field. When + receiving messages that are generated by the kernel (notifications), + this field will yield KDBUS_PAYLOAD_KERNEL. + + __u64 cookie; + Cookie of this message, for later recognition. Also, when replying + to a message (see above), the cookie_reply field must match this value. + + __u64 timeout_ns; + If the message sent requires a reply from the remote peer (see above), + this field contains the timeout in absolute nanoseconds based on + CLOCK_MONOTONIC. + + __u64 cookie_reply; + If the message sent is a reply to another message, this field must + match the cookie of the formerly received message.
+ + __u64 offset_reply; + If the message successfully got a synchronous reply (see above), this + field will yield the offset of the reply message in the sender's pool. + This is what KDBUS_CMD_MSG_RECV usually does for asynchronous messages. + + struct kdbus_item items[0]; + A dynamically sized list of items to contain additional information. + The following items are expected/valid: + + KDBUS_ITEM_PAYLOAD_VEC + KDBUS_ITEM_PAYLOAD_MEMFD + KDBUS_ITEM_FDS + Actual data records containing the payload. See section "Passing of + Payload Data". + + KDBUS_ITEM_BLOOM_FILTER + Bloom filter for matches (see below). + + KDBUS_ITEM_DST_NAME + Well-known name to send this message to. Required if dst_id is set + to KDBUS_DST_ID_NAME. If a connection holding the given name can't + be found, -ESRCH is returned. + For messages to a unique name (ID), this item is optional. If present, + the kernel will make sure the name owner matches the given unique name. + This allows userspace to tie the message sending to the condition that a + name is currently owned by a certain unique name. +}; + +The message will be augmented by the requested metadata items when queued into +the receiver's pool. See also section "Metadata and namespaces". + + +7.2 Message layout +------------------ + +The layout of a message is shown below.
+ + +-------------------------------------------------------------------------+ + | Message | + | +---------------------------------------------------------------------+ | + | | Header | | + | | size: overall message size, including the data records | | + | | destination: connection id of the receiver | | + | | source: connection id of the sender (set by kernel) | | + | | payload_type: "DBusDBus" textual identifier stored as uint64_t | | + | +---------------------------------------------------------------------+ | + | +---------------------------------------------------------------------+ | + | | Data Record | | + | | size: overall record size (without padding) | | + | | type: type of data | | + | | data: reference to data (address or file descriptor) | | + | +---------------------------------------------------------------------+ | + | +---------------------------------------------------------------------+ | + | | padding bytes to the next 8 byte alignment | | + | +---------------------------------------------------------------------+ | + | +---------------------------------------------------------------------+ | + | | Data Record | | + | | size: overall record size (without padding) | | + | | ... | | + | +---------------------------------------------------------------------+ | + | +---------------------------------------------------------------------+ | + | | padding bytes to the next 8 byte alignment | | + | +---------------------------------------------------------------------+ | + | +---------------------------------------------------------------------+ | + | | Data Record | | + | | size: overall record size | | + | | ... 
| | + | +---------------------------------------------------------------------+ | + | +---------------------------------------------------------------------+ | + | | padding bytes to the next 8 byte alignment | | + | +---------------------------------------------------------------------+ | + +-------------------------------------------------------------------------+ + + +7.3 Passing of Payload Data +--------------------------- + +When connecting to the bus, receivers request a memory pool of a given size, +large enough to carry all backlog of data enqueued for the connection. The +pool is internally backed by a shared memory file which can be mmap()ed by +the receiver. + +KDBUS_ITEM_PAYLOAD_VEC: + Messages are directly copied by the sending process into the receiver's pool. + That way, two peers can exchange data by effectively doing a single copy from + one process to another; the kernel will not buffer the data anywhere else. + +KDBUS_ITEM_PAYLOAD_MEMFD: + Messages can reference memfd files which contain the data. + memfd files are tmpfs-backed files that allow sealing of the content of the + file, which prevents all writable access to the file content. + Only sealed memfd files are accepted as payload data, which enforces + reliable passing of data; the receiver can assume that neither the sender nor + anyone else can alter the content after the message is sent. + +Apart from the sender filling the content into memfd files, the data will +be passed as zero-copy from one process to another, read-only, shared between +the peers. + + +7.4 Receiving messages +---------------------- + +Messages are received by the client with the KDBUS_CMD_MSG_RECV ioctl. The +endpoint device node of the bus supports poll() to wake up the receiving +process when new messages are queued up to be received. + +With the KDBUS_CMD_MSG_RECV ioctl, a struct kdbus_cmd_recv is used. + +struct kdbus_cmd_recv { + __u64 flags; + Flags to control the receive command.
+ + KDBUS_RECV_PEEK + Just return the location of the next message. Do not install file + descriptors or anything else. This is usually used to determine the + sender of the next queued message. + + KDBUS_RECV_DROP + Drop the next message without doing anything else with it, and free the + pool slice. This is a short-cut for KDBUS_RECV_PEEK and KDBUS_CMD_FREE. + + KDBUS_RECV_USE_PRIORITY + Use the priority field (see below). + + __u64 kernel_flags; + Valid flags for this command, returned by the kernel upon each call. + + __s64 priority; + With KDBUS_RECV_USE_PRIORITY set in flags, receive the next message in + the queue with at least the given priority. If no such message is waiting + in the queue, -ENOMSG is returned. + + __u64 offset; + Upon return of the ioctl, this field contains the offset in the + receiver's memory pool. +}; + +Unless KDBUS_RECV_DROP was passed, and given that the ioctl succeeded, the +offset field contains the location of the new message inside the receiver's +pool. The message is stored as struct kdbus_msg at this offset, and can be +interpreted with the semantics described above. + +Also, if the connection allowed for file descriptors to be passed +(KDBUS_HELLO_ACCEPT_FD), and if the message contained any, they will be +installed into the receiving process after the KDBUS_CMD_MSG_RECV ioctl +returns. The receiving task is obliged to close all of them appropriately. + +The caller is obliged to call KDBUS_CMD_FREE with the returned offset when +the memory is no longer needed. + + +7.5 Canceling messages synchronously waiting for replies +-------------------------------------------------------- + +When a connection sends a message with KDBUS_MSG_FLAGS_SYNC_REPLY and +blocks while waiting for the reply, the KDBUS_CMD_MSG_CANCEL ioctl can be +used on the same file descriptor to cancel the message, based on its cookie. +If there are multiple messages with the same cookie that are all synchronously +waiting for a reply, all of them will be canceled.
Obviously, this is only +possible in multi-threaded applications. + + +8. Name registry +=============================================================================== + +Each bus instantiates a name registry to resolve well-known names into unique +connection IDs for message delivery. The registry will be queried when a +message is sent with kdbus_msg.dst_id set to KDBUS_DST_ID_NAME, or when a +registry dump is requested. + +All of the below is subject to policy rules for SEE and OWN permissions. + + +8.1 Name validity +----------------- + +A name has to comply with the following rules to be considered valid: + + - The name has two or more elements separated by a period ('.') character + - All elements must contain at least one character + - Each element must only contain the ASCII characters "[A-Z][a-z][0-9]_" + and must not begin with a digit + - The name must contain at least one '.' (period) character + (and thus at least two elements) + - The name must not begin with a '.' (period) character + - The name must not exceed KDBUS_NAME_MAX_LEN (255) + + +8.2 Acquiring a name +-------------------- + +To acquire a name, a client uses the KDBUS_CMD_NAME_ACQUIRE ioctl with the +following data structure. + +struct kdbus_cmd_name { + __u64 size; + The overall size of this struct, including the name with its 0-byte string + terminator. + + __u64 flags; + Flags to control details in the name acquisition. + + KDBUS_NAME_REPLACE_EXISTING + Acquiring a name that is already present usually fails, unless this flag + is set in the call, and KDBUS_NAME_ALLOW_REPLACEMENT (see below) was + set when the current owner of the name acquired it, or if the current + owner is an activator connection (see below). + + KDBUS_NAME_ALLOW_REPLACEMENT + Allow other connections to take over this name. When this happens, the + former owner of the name will be notified of the name loss.
+ + KDBUS_NAME_QUEUE (acquire) + A name that is already acquired by a connection, and which wasn't + requested with the KDBUS_NAME_ALLOW_REPLACEMENT flag set cannot be + acquired again. However, a connection can put itself in a queue of + connections waiting for the name to be released. Once that happens, the + first connection in that queue becomes the new owner and is notified + accordingly. + + __u64 kernel_flags; + Valid flags for this command, returned by the kernel upon each call. + + struct kdbus_item items[0]; + Items to submit the name. Currently, only one item of type KDBUS_ITEM_NAME + is expected and allowed, and the contained string must be a valid bus name. +}; + + +8.3 Releasing a name +-------------------- + +A connection may release a name explicitly with the KDBUS_CMD_NAME_RELEASE +ioctl. If the connection was an implementor of an activatable name, its +pending messages are moved back to the activator. If there are any connections +queued up as waiters for the name, the oldest one of them will become the new +owner. The same happens implicitly for all names once a connection terminates. + +The KDBUS_CMD_NAME_RELEASE ioctl uses the same data structure as the +acquisition call, but with slightly different field usage. + +struct kdbus_cmd_name { + __u64 size; + The overall size of this struct, including the name with its 0-byte string + terminator. + + __u64 flags; + + struct kdbus_item items[0]; + Items to submit the name. Currently, only one item of type KDBUS_ITEM_NAME + is expected and allowed, and the contained string must be a valid bus name. +}; + + +8.4 Dumping the name registry +----------------------------- + +A connection may request a complete or filtered dump of currently active bus +names with the KDBUS_CMD_NAME_LIST ioctl, which takes a struct +kdbus_cmd_name_list as argument. + +struct kdbus_cmd_name_list { + __u64 flags; + Any combination of flags to specify which names should be dumped.
+ + KDBUS_NAME_LIST_UNIQUE + List the unique (numeric) IDs of the connection, whether it owns a name + or not. + + KDBUS_NAME_LIST_NAMES + List well-known names stored in the database which are actively owned by + a real connection (not an activator). + + KDBUS_NAME_LIST_ACTIVATORS + List names that are owned by an activator. + + KDBUS_NAME_LIST_QUEUED + List connections that are not yet owning a name but are waiting for it + to become available. + + __u64 offset; + When the ioctl returns successfully, the offset to the name registry dump + inside the connection's pool will be stored in this field. +}; + +The returned list of names is stored in a struct kdbus_name_list that in turn +contains a dynamic number of struct kdbus_name_info that carry the actual +information. The fields inside that struct kdbus_name_info are described next. + +struct kdbus_name_info { + __u64 size; + The overall size of this struct, including the name with its 0-byte string + terminator. + + __u64 flags; + The current flags for this name. Can be any combination of + + KDBUS_NAME_ALLOW_REPLACEMENT + + KDBUS_NAME_IN_QUEUE (list) + When retrieving a list of currently acquired names in the registry, this + flag indicates whether the connection actually owns the name or is + currently waiting for it to become available. + + KDBUS_NAME_ACTIVATOR (list) + An activator connection owns a name as a placeholder for an implementor, + which is started on demand as soon as the first message arrives. There's + some more information on this topic below. In contrast to + KDBUS_NAME_REPLACE_EXISTING, when a name is taken over from an activator + connection, all the messages that have been queued in the activator + connection will be moved over to the new owner. The activator connection + will still be tracked for the name and will take control again if the + implementor connection terminates.
+ This flag cannot be used when acquiring a name, but is implicitly set + through KDBUS_CMD_HELLO with KDBUS_HELLO_ACTIVATOR set in + kdbus_cmd_hello.conn_flags. + + __u64 owner_id; + The owning connection's unique ID. + + __u64 conn_flags; + The flags of the owning connection. + + struct kdbus_item items[0]; + Items containing the actual name. Currently, only one item of type + KDBUS_ITEM_NAME will be attached. +}; + +The returned buffer must be freed with the KDBUS_CMD_FREE ioctl when the user +is finished with it. + + +9. Notifications +=============================================================================== + +The kernel will notify its users of the following events. + + * When connection A is terminated while connection B is waiting for a reply + from it, connection B is notified with a message with an item of type + KDBUS_ITEM_REPLY_DEAD. + + * When connection A does not receive a reply from connection B within the + specified timeout window, connection A will receive a message with an item + of type KDBUS_ITEM_REPLY_TIMEOUT. + + * When a connection is created on or removed from a bus, messages with an + item of type KDBUS_ITEM_ID_ADD or KDBUS_ITEM_ID_REMOVE, respectively, are + sent to all bus members that match these messages through their match + database. + + * When a connection owns or loses a name, or a name is moved from one + connection to another, messages with an item of type KDBUS_ITEM_NAME_ADD, + KDBUS_ITEM_NAME_REMOVE or KDBUS_ITEM_NAME_CHANGE are sent to all bus + members that match these messages through their match database. + +A kernel notification is a regular kdbus message with the following details. + + * kdbus_msg.src_id == KDBUS_SRC_ID_KERNEL + * kdbus_msg.dst_id == KDBUS_DST_ID_BROADCAST + * kdbus_msg.payload_type == KDBUS_PAYLOAD_KERNEL + * Has exactly one of the aforementioned items attached + + +10.
Message Matching, Bloom filters +=============================================================================== + +10.1 Matches for broadcast messages from other connections +---------------------------------------------------------- + +A message addressed at the connection ID KDBUS_DST_ID_BROADCAST (~0ULL) is a +broadcast message, delivered to all connected peers which installed a rule to +match certain properties of the message. Without any rules installed in the +connection, no broadcast message or kernel-side notifications will be delivered +to the connection. Broadcast messages are subject to policy rules and TALK +access checks. + +See section 11 for details on policies, and section 11.5 for more +details on implicit policies. + +Matches for messages from other connections (not kernel notifications) are +implemented as bloom filters. The sender adds certain properties of the message +as elements to a bloom filter bit field, and sends that along with the +broadcast message. + +The connection adds the message properties it is interested in as elements to a +bloom mask bit field, and uploads the mask to the match rules of the +connection. + +The kernel will match the broadcast message's bloom filter against the +connection's bloom mask (simply by &-ing them), and decide whether the message +should be delivered to the connection. + +The kernel has no notion of any specific properties of the message; all it +sees are the bit fields of the bloom filter and mask to match against. The +use of bloom filters allows simple and efficient matching, without exposing +any message properties or internals to the kernel side. Clients need to deal +with the fact that they might receive broadcasts which they did not subscribe +to, as the bloom filter might allow false-positives to pass the filter. + +To allow the future extension of the set of elements in the bloom filter, the +filter specifies a "generation" number.
A later generation must always contain +all elements of the set of the previous generation, but can add new elements +to the set. The match rules mask can carry an array with all previous +generations of masks individually stored. When the filter and mask are matched +by the kernel, the mask with the closest matching "generation" is selected +as the index into the mask array. + + +10.2 Matches for kernel notifications +------------------------------------ + +To receive kernel generated notifications (see section 9), a connection must +install special match rules that are different from the bloom filter matches +described in the section above. They can be filtered by a sender connection's +ID, by one of the names the sender connection owns at the time of sending the +message, or by type of the notification (id/name add/remove/change). + +10.3 Adding a match +------------------- + +To add a match, the KDBUS_CMD_MATCH_ADD ioctl is used, which takes the struct +described below. + +Note that each of the items attached to this command will internally create +one match 'rule', and the collection of them, which is submitted as one block +via the ioctl, is called a 'match'. To allow a message to pass, all rules of a +match have to be satisfied. Hence, adding more items to the command will only +narrow the possibility of a match to effectively let the message pass, and will +cause the connection's user space process to wake up less often. + +Multiple matches can be installed per connection. As long as one of them has a +set of rules which allows the message to pass, this one will be decisive. + +struct kdbus_cmd_match { + __u64 size; + The overall size of the struct, including its items. + + __u64 cookie; + A cookie which identifies the match, so it can be referred to at removal + time. + + __u64 flags; + Flags to control the behavior of the ioctl. + + KDBUS_MATCH_REPLACE: + Remove all entries with the given cookie before installing the new one.
+ This allows for race-free replacement of matches. + + struct kdbus_item items[0]; + Items to define the actual rules of the matches. The following item types + are expected. Each item will cause one new match rule to be created. + + KDBUS_ITEM_BLOOM_MASK + An item that carries the bloom filter mask to match against in its + data field. The payload size must match the bloom filter size that + was specified when the bus was created. + See section 10.4 for more information. + + KDBUS_ITEM_NAME + Specify a name that a sending connection must own at the time of sending + a broadcast message in order to match this rule. + + KDBUS_ITEM_ID + Specify a sender connection's ID that will match this rule. + + KDBUS_ITEM_NAME_ADD + KDBUS_ITEM_NAME_REMOVE + KDBUS_ITEM_NAME_CHANGE + These items request delivery of broadcast messages that describe a name + acquisition, loss, or change. The details are stored in the item's + kdbus_notify_name_change member. All information specified must be + matched in order to make the message pass. Use KDBUS_MATCH_ID_ANY to + match against any unique connection ID. + + KDBUS_ITEM_ID_ADD + KDBUS_ITEM_ID_REMOVE + These items request delivery of broadcast messages that are generated + when a connection is created or terminated. struct kdbus_notify_id_change + is used to store the actual match information. This item can be used to + monitor one particular connection ID, or, when the id field is set to + KDBUS_MATCH_ID_ANY, all of them. + + Other item types are ignored. +}; + + +10.4 Bloom filters +------------------ + +Bloom filters allow checking whether a given word is present in a dictionary. +This allows a connection to set up a mask for information it is interested in; +it will then be delivered broadcast messages that have a matching filter. + +For general information on bloom filters, see + + https://en.wikipedia.org/wiki/Bloom_filter + +The size of the bloom filter is defined per bus when it is created, in +kdbus_bloom_parameter.size.
All bloom filters attached to broadcast messages +on the bus must match this size, and all bloom filter matches uploaded by +connections must also match the size, or a multiple thereof (see below). + +The calculation of the mask has to be done on the userspace side. The kernel +just checks the bitmasks to decide whether or not to let the message pass. All +bits in the mask must match the filter in a bit-wise AND operation, but the +mask may have more bits set than the filter. Consequently, false positive +matches are expected to happen, and userspace must deal with that fact. + +Masks are entities that are always passed to the kernel as part of a match +(with an item of type KDBUS_ITEM_BLOOM_MASK), and filters can be attached to +broadcast messages (with an item of type KDBUS_ITEM_BLOOM_FILTER). + +For a broadcast to match, all set bits in the filter have to be set in the +installed match mask as well. For example, consider a bus that has a bloom size +of 8 bytes, and the following mask/filter combinations: + + filter 0x0101010101010101 + mask 0x0101010101010101 + -> matches + + filter 0x0303030303030303 + mask 0x0101010101010101 + -> doesn't match + + filter 0x0101010101010101 + mask 0x0303030303030303 + -> matches + +Hence, in order to catch all messages, a mask filled with 0xff bytes can be +installed as a wildcard match rule. + +Uploaded matches may contain multiple masks, each of which has the size of the +bloom size defined by the bus. Each block of a mask is called a 'generation', +starting at index 0. + +At match time, when a broadcast message is about to be delivered, a bloom +mask generation is passed, which denotes which of the bloom masks the filter +should be matched against. This allows userspace to provide backward compatible +masks at upload time, while older clients can still match against older +versions of filters.
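
The comparison described in this section can be sketched in C as follows. This
is only an illustration of the rule as it is worded above (every bit set in
the filter must also be set in the mask); it is not taken from the kernel
sources. The function name bloom_mask_covers_filter() and its parameters are
made up for this example, and both bit fields are assumed to be arrays of
64-bit words covering the bloom size of the bus.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the kdbus API: returns true if every
 * bit that is set in 'filter' is also set in 'mask'. 'n_words' is the
 * bloom size of the bus divided by sizeof(uint64_t).
 */
static bool bloom_mask_covers_filter(const uint64_t *filter,
				     const uint64_t *mask,
				     size_t n_words)
{
	size_t i;

	assert(filter && mask);

	for (i = 0; i < n_words; i++)
		if (filter[i] & ~mask[i])
			return false;	/* a filter bit is missing in the mask */

	return true;
}
```

Applied to the three filter/mask pairs listed above with n_words set to 1, the
helper reports a match, no match and a match, in that order. A mask consisting
only of 0xff bytes covers any filter, which is the wildcard case.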
+ + +10.5 Removing a match +-------------------- + +Matches can be removed through the KDBUS_CMD_MATCH_REMOVE ioctl, which again +takes struct kdbus_cmd_match as argument, but its fields are used slightly +differently. + +struct kdbus_cmd_match { + __u64 size; + The overall size of the struct. As it has no items in this use case, the + value should yield 16. + + __u64 cookie; + The cookie of the match, as it was passed when the match was added. + All matches that have this cookie will be removed. + + __u64 flags; + Unused for this use case. + + __u64 kernel_flags; + Valid flags for this command, returned by the kernel upon each call. + + struct kdbus_item items[0]; + Unused for this use case. +}; + + +11. Policy +=============================================================================== + +A policy database restricts the possibilities of connections to own, see and +talk to well-known names. It can be associated with a bus (through a policy +holder connection) or a custom endpoint. + +See section 8.1 for more details on the validity of well-known names. + +Default endpoints of buses always have a policy database. The default +policy is to deny all operations except for operations that are covered by +implicit policies. Custom endpoints always have a policy, and by default, +a policy database is empty. Therefore, unless policy rules are added, all +operations will also be denied by default. + +See section 11.5 for more details on implicit policies. + +A set of policy rules is described by a name and multiple access rules, defined +by the following struct. + +struct kdbus_policy_access { + __u64 type; /* USER, GROUP, WORLD */ + One of the following. + + KDBUS_POLICY_ACCESS_USER + Grant access to a user with the uid stored in the 'id' field. + + KDBUS_POLICY_ACCESS_GROUP + Grant access to the group with the gid stored in the 'id' field. + + KDBUS_POLICY_ACCESS_WORLD + Grant access to everyone. The 'id' field is ignored.
+ + __u64 access; /* OWN, TALK, SEE */ + The access to grant. + + KDBUS_POLICY_SEE + Allow the name to be seen. + + KDBUS_POLICY_TALK + Allow the name to be talked to. + + KDBUS_POLICY_OWN + Allow the name to be owned. + + __u64 id; + For KDBUS_POLICY_ACCESS_USER, stores the uid. + For KDBUS_POLICY_ACCESS_GROUP, stores the gid. +}; + +Policies are set through KDBUS_CMD_HELLO (when creating a policy holder +connection), KDBUS_CMD_CONN_UPDATE (when updating a policy holder connection), +KDBUS_CMD_ENDPOINT_MAKE (creating a custom endpoint) or +KDBUS_CMD_ENDPOINT_UPDATE (updating a custom endpoint). In all cases, the name +and policy access information is stored in items of type KDBUS_ITEM_NAME and +KDBUS_ITEM_POLICY_ACCESS. For this transport, the following rules apply. + + * An item of type KDBUS_ITEM_NAME must be followed by at least one + KDBUS_ITEM_POLICY_ACCESS item + * An item of type KDBUS_ITEM_NAME can be followed by an arbitrary number of + KDBUS_ITEM_POLICY_ACCESS items + * An arbitrary number of groups of names and access levels can be passed + +uids and gids are internally always stored in the kernel's view of global ids, +and are translated back and forth on the ioctl level accordingly. + + +11.2 Wildcard names +------------------- + +Policy holder connections may upload names that contain the wildcard suffix +(".*"). That way, a policy can be uploaded that is effective for every +well-known name that extends the provided name by exactly one more level. + +For example, if an item of a set of uploaded policy rules contains the name +"foo.bar.*", both "foo.bar.baz" and "foo.bar.bazbaz" are valid, but +"foo.bar.baz.baz" is not. + +This allows connections to take control over multiple names that the policy +holder doesn't need to know about when uploading the policy. + +Such wildcard entries are not allowed for custom endpoints.
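
To make the "exactly one more level" rule concrete, here is a small userspace
sketch in C. It only restates the wildcard rule from this section; the helper
name policy_name_matches() is hypothetical and the kernel's actual lookup is
not shown in this document. The sketch assumes that a pattern without the ".*"
suffix has to match the name literally.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: check whether 'name' is covered by the policy
 * entry 'pattern'. A pattern ending in ".*" covers names that extend
 * the prefix by exactly one more element; any other pattern is compared
 * literally (an assumption of this sketch).
 */
static bool policy_name_matches(const char *pattern, const char *name)
{
	size_t plen, prefix_len;
	const char *rest;

	assert(pattern && name);

	plen = strlen(pattern);
	if (plen < 2 || strcmp(pattern + plen - 2, ".*") != 0)
		return strcmp(pattern, name) == 0;

	/* Keep the trailing '.', drop the '*'. */
	prefix_len = plen - 1;
	if (strncmp(pattern, name, prefix_len) != 0)
		return false;

	/* Exactly one further, non-empty element is allowed. */
	rest = name + prefix_len;
	return *rest != '\0' && strchr(rest, '.') == NULL;
}
```

With the pattern "foo.bar.*", this accepts "foo.bar.baz" and "foo.bar.bazbaz"
and rejects "foo.bar.baz.baz", matching the example above. It also rejects the
bare name "foo.bar", since that does not add a level.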
+ + +11.3 Policy example +------------------- + +For example, a set of policy rules may look like this: + + KDBUS_ITEM_NAME: str='org.foo.bar' + KDBUS_ITEM_POLICY_ACCESS: type=USER, access=OWN, id=1000 + KDBUS_ITEM_POLICY_ACCESS: type=USER, access=TALK, id=1001 + KDBUS_ITEM_POLICY_ACCESS: type=WORLD, access=SEE + KDBUS_ITEM_NAME: str='org.blah.baz' + KDBUS_ITEM_POLICY_ACCESS: type=USER, access=OWN, id=0 + KDBUS_ITEM_POLICY_ACCESS: type=WORLD, access=TALK + +That means that 'org.foo.bar' may only be owned by uid 1000, but every user on +the bus is allowed to see the name. However, only uid 1001 may actually send +a message to the connection and receive a reply from it. + +The second rule allows 'org.blah.baz' to be owned by uid 0 only, but every user +may talk to it. + + +11.4 TALK access and multiple well-known names per connection +------------------------------------------------------------- + +Note that TALK access is checked against all names of a connection. +For example, if a connection owns both 'org.foo.bar' and 'org.blah.baz', and +the policy database allows 'org.blah.baz' to be talked to by WORLD, then this +permission is also granted to 'org.foo.bar'. That might sound illogical, but +after all, we allow messages to be directed to either the ID or a well-known +name, and policy is applied to the connection, not the name. In other words, +the effective TALK policy for a connection is the most permissive of all names +the connection owns. + +If a policy database exists for a bus (because a policy holder created one on +demand) or for a custom endpoint (which always has one), each one is consulted +during name registry listing, name owning or message delivery. If either one +fails, the operation is failed with -EPERM. + +For best practices, connections that own names with a restricted TALK +access should not install matches. This avoids cases where the sent +message may pass the bloom filter due to false-positives and may also +satisfy the policy rules.
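
The "most permissive name wins" rule for TALK can be illustrated with a short
C sketch. Everything in it is hypothetical: the ex_* types are simplified
stand-ins rather than the kdbus UAPI structures, the policy database is a flat
array, and a rule is only counted if its access field is exactly TALK. Whether
a higher access level implies a lower one is not stated in this section, so
the sketch does not assume it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative stand-ins; these are not the kdbus UAPI definitions. */
enum ex_type { EX_USER, EX_GROUP, EX_WORLD };
enum ex_access { EX_SEE, EX_TALK, EX_OWN };

struct ex_rule {
	const char *name;	/* well-known name the rule applies to */
	enum ex_type type;
	enum ex_access access;
	uint64_t id;		/* uid or gid, unused for EX_WORLD */
};

/* Does a single rule grant TALK to the given uid/gid? */
static bool ex_rule_grants_talk(const struct ex_rule *r,
				uint64_t uid, uint64_t gid)
{
	if (r->access != EX_TALK)
		return false;

	switch (r->type) {
	case EX_USER:
		return r->id == uid;
	case EX_GROUP:
		return r->id == gid;
	case EX_WORLD:
		return true;
	}

	return false;
}

/*
 * TALK is granted if any rule for any name owned by the destination
 * connection grants it, i.e. the most permissive name decides.
 */
static bool ex_talk_allowed(const struct ex_rule *db, size_t n_rules,
			    const char *const *owned, size_t n_owned,
			    uint64_t uid, uint64_t gid)
{
	size_t i, j;

	assert(db && owned);

	for (i = 0; i < n_owned; i++)
		for (j = 0; j < n_rules; j++)
			if (strcmp(db[j].name, owned[i]) == 0 &&
			    ex_rule_grants_talk(&db[j], uid, gid))
				return true;

	return false;
}
```

Fed with the rules from section 11.3, a sender with uid 1002 is refused when
the destination owns only 'org.foo.bar', but is allowed as soon as the
destination also owns 'org.blah.baz'.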
+ +11.5 Implicit policies +---------------------- + +Depending on the type of the endpoint, a set of implicit rules might be +enforced. On default endpoints, the following set is enforced: + + * Privileged connections always override any installed policy. Those + connections could easily install their own policies, so there is no + reason to enforce installed policies. + * Connections can always talk to connections of the same user. This + includes broadcast messages. + * Connections that own names might send broadcast messages to other + connections that belong to a different user, but only if that + destination connection does not own any name. + +Custom endpoints have stricter policies. The following rules apply: + + * Policy rules are always enforced, even if the connection is a privileged + connection. + * Policy rules are always enforced for TALK access, even if both ends are + running under the same user. This includes broadcast messages. + * To restrict the set of names that can be seen, endpoint policies can + install "SEE" policies. + + +12. Pool +=============================================================================== + +A pool for data received from the kernel is installed for every connection of +the bus, and is sized according to kdbus_cmd_hello.pool_size. It is accessed +when one of the following ioctls is issued: + + * KDBUS_CMD_MSG_RECV, to receive a message + * KDBUS_CMD_NAME_LIST, to dump the name registry + * KDBUS_CMD_CONN_INFO, to retrieve information on a connection + +Internally, the pool is organized in slices, stored in an rb-tree. The offsets +returned by either one of the aforementioned ioctls describe offsets inside the +pool. In order to make the slice available for subsequent calls, KDBUS_CMD_FREE +has to be called on the offset. 
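
Since the ioctls above only return offsets, userspace has to translate them
into addresses inside its own mapping of the pool. A minimal sketch of that
step, assuming the pool has already been mapped into the task with mmap():
the helper name pool_slice() and its parameters are hypothetical, and the
bounds check is a defensive addition of this example rather than something
the text requires.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: translate an offset returned by one of the
 * ioctls above into a pointer inside the mapped pool. 'pool' is the
 * start of the mapping and 'pool_size' its length. Returns NULL if
 * 'len' bytes starting at the offset would not fit into the pool.
 */
static const void *pool_slice(const void *pool, size_t pool_size,
			      uint64_t offset, size_t len)
{
	assert(pool);

	if (offset > pool_size || len > pool_size - offset)
		return NULL;

	return (const uint8_t *)pool + offset;
}
```

Once the caller is done with the data behind the returned pointer, the slice
still has to be released with KDBUS_CMD_FREE on the same offset.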
+ +To access the memory, the caller is expected to mmap() it to its task, like +this: + + /* + * POOL_SIZE has to be a multiple of PAGE_SIZE, and it must match the + * value that was previously passed in the .pool_size field of struct + * kdbus_cmd_hello. + */ + + buf = mmap(NULL, POOL_SIZE, PROT_READ, MAP_PRIVATE, conn_fd, 0); + + +13. Metadata +=============================================================================== + +When a message is delivered to a receiver connection, it is augmented by +metadata items in accordance with the destination's current attach flags. The +information stored in those metadata items refers to the sender task at the +time of sending the message, so even if any detail of the sender task has +already changed upon message reception (or if the sender task does not exist +anymore), the information is still preserved and won't be modified until the +message is freed. + +Note that there are two exceptions to the above rules: + + a) Kernel generated messages don't have a source connection, so they won't be + augmented. + + b) If a connection was created with faked credentials (see section 6.2), + the only attached metadata items are the ones provided by the connection + itself. The destination's attach_flags won't be looked at in such cases. + +Also, there are two things to be considered by userspace programs regarding +those metadata items: + + a) Userspace must cope with the fact that it might get more metadata than + it requested. That happens, for example, when a broadcast message is + sent and receivers have different attach flags. Items that haven't been + requested should hence be silently ignored. + + b) Userspace might not always get all metadata items that it + requested. That is because some of those items are only added if a + corresponding kernel feature has been enabled. Also, the two exceptions + described above will lead to fewer items being attached than + requested.
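
Point a) above means that a receiver should walk all attached items and skip
the types it does not care about. The following C sketch shows one way to do
that. It is an assumption-laden illustration: struct ex_item_hdr is a
simplified stand-in modelled on the record layout in section 7.2 (a size that
excludes padding, a type, and padding to the next 8 byte boundary), not the
real struct kdbus_item, and ex_count_items() is a made-up helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative record header, modelled on the layout in section 7.2. */
struct ex_item_hdr {
	uint64_t size;	/* record size including this header, no padding */
	uint64_t type;
};

#define EX_ALIGN8(v) (((v) + 7) & ~(uint64_t)7)

/*
 * Count the records of type 'wanted' in 'buf' and silently skip all
 * others. Returns -1 if a record header is malformed.
 */
static int ex_count_items(const uint8_t *buf, size_t len, uint64_t wanted)
{
	size_t pos = 0;
	int found = 0;

	assert(buf);

	while (len - pos >= sizeof(struct ex_item_hdr)) {
		struct ex_item_hdr hdr;
		uint64_t next;

		memcpy(&hdr, buf + pos, sizeof(hdr));
		if (hdr.size < sizeof(hdr) || hdr.size > len - pos)
			return -1;

		if (hdr.type == wanted)
			found++;
		/* any other type was not requested or is unknown: ignore */

		next = EX_ALIGN8(hdr.size);
		if (next > len - pos)
			break;	/* last record carries no trailing padding */
		pos += next;
	}

	return found;
}
```

A receiver would typically call such a walker once per item type it asked for
and treat a count of zero as "not attached", which also covers point b).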
+ + +13.1 Known item types +--------------------- + +The following attach flags are currently supported. + + KDBUS_ATTACH_TIMESTAMP + Attaches an item of type KDBUS_ITEM_TIMESTAMP which contains both the + monotonic and the realtime timestamp, taken when the message was + processed on the kernel side. + + KDBUS_ATTACH_CREDS + Attaches an item of type KDBUS_ITEM_CREDS, containing credentials as + described in kdbus_creds: the uid, gid, pid, tid and starttime of the task. + + KDBUS_ATTACH_AUXGROUPS + Attaches an item of type KDBUS_ITEM_AUXGROUPS, containing a dynamic + number of auxiliary groups the sending task was a member of. + + KDBUS_ATTACH_NAMES + Attaches items of type KDBUS_ITEM_NAME, one for each name the sending + connection currently owns. The name is stored in kdbus_item.str for each + of them. + + KDBUS_ATTACH_COMM + Attaches items of type KDBUS_ITEM_PID_COMM and KDBUS_ITEM_TID_COMM, + both transporting the sending task's 'comm', for both the pid and the tid. + The strings are stored in kdbus_item.str. + + KDBUS_ATTACH_EXE + Attaches an item of type KDBUS_ITEM_EXE, containing the path to the + executable of the sending task, stored in kdbus_item.str. + + KDBUS_ATTACH_CMDLINE + Attaches an item of type KDBUS_ITEM_CMDLINE, containing the command line + arguments of the sending task, as an array of strings, stored in + kdbus_item.str. + + KDBUS_ATTACH_CGROUP + Attaches an item of type KDBUS_ITEM_CGROUP with the task's cgroup path. + + KDBUS_ATTACH_CAPS + Attaches an item of type KDBUS_ITEM_CAPS, carrying sets of capabilities + that should be accessed via kdbus_item.caps.caps. Also, userspace should + be written in a way that it takes kdbus_item.caps.last_cap into account, + and derives the number of sets and rows from the item size and the reported + number of valid capability bits. + + KDBUS_ATTACH_SECLABEL + Attaches an item of type KDBUS_ITEM_SECLABEL, which contains the SELinux + security label of the sending task. Access via kdbus_item.str.
+
+  KDBUS_ATTACH_AUDIT
+    Attaches an item of type KDBUS_ITEM_AUDIT, which contains the audit label
+    of the sending task. Access via kdbus_item.str.
+
+  KDBUS_ATTACH_CONN_NAME
+    Attaches an item of type KDBUS_ITEM_CONN_NAME that contains the
+    sending connection's current name in kdbus_item.str.
+
+
+13.2 Metadata and namespaces
+----------------------------
+Note that if the user or PID namespaces of a connection at the time of sending
+differ from those that were active when the connection was created
+(KDBUS_CMD_HELLO), data structures such as messages will not have any metadata
+attached, to prevent leaking security-relevant information.
+
+
+14. Error codes
+===============================================================================
+
+Below is a list of error codes that might be returned by the individual
+ioctl commands. The list focuses on the return values from kdbus code itself,
+and might not cover those of all kernel internal functions.
+
+For all ioctls:
+
+  -ENOMEM        The kernel memory is exhausted
+  -ENOTTY        Illegal ioctl command issued for the file descriptor
+  -ENOSYS        The requested functionality is not available
+
+For all ioctls that carry a struct as payload:
+
+  -EFAULT        The supplied data pointer was not 64-bit aligned, or was
+                 inaccessible from the kernel side.
+  -EINVAL        The size inside the supplied struct was smaller than expected
+  -EMSGSIZE      The size inside the supplied struct was bigger than expected
+  -ENAMETOOLONG  A supplied name is larger than the allowed maximum size
+
+For KDBUS_CMD_BUS_MAKE:
+
+  -EINVAL        The flags supplied in the kdbus_cmd_make struct are invalid or
+                 the supplied name does not start with the current uid and a '-'
+  -EEXIST        A bus of that name already exists
+  -ESHUTDOWN     The domain for the bus is already shut down
+  -EMFILE        The maximum number of buses for the current user is exhausted
+
+For KDBUS_CMD_DOMAIN_MAKE:
+
+  -EPERM         The calling user does not have CAP_IPC_OWNER set
+  -EINVAL        The flags supplied in the kdbus_cmd_make struct are invalid, or
+                 no name was supplied for a top-level domain
+  -EEXIST        A domain of that name already exists
+
+For KDBUS_CMD_ENDPOINT_MAKE:
+
+  -EPERM         The calling user is not privileged (see Terminology)
+  -EINVAL        The flags supplied in the kdbus_cmd_make struct are invalid
+  -EEXIST        An endpoint of that name already exists
+
+For KDBUS_CMD_HELLO:
+
+  -EFAULT        The supplied pool size was 0 or not a multiple of the page size
+  -EINVAL        The flags supplied in the kdbus_cmd_hello struct are invalid, or
+                 an illegal combination of KDBUS_HELLO_MONITOR,
+                 KDBUS_HELLO_ACTIVATOR and KDBUS_HELLO_POLICY_HOLDER was passed
+                 in the flags, or an invalid set of items was supplied
+  -EPERM         A KDBUS_ITEM_CREDS item was supplied, but the current user is
+                 not privileged
+  -ESHUTDOWN     The bus has already been shut down
+  -EMFILE        The maximum number of connections on the bus has been reached
+
+For KDBUS_CMD_BYEBYE:
+
+  -EALREADY      The connection has already been shut down
+  -EBUSY         There are still messages queued up in the connection's pool
+
+For KDBUS_CMD_MSG_SEND:
+
+  -EOPNOTSUPP    The connection is unconnected, or a fd was passed that is
+                 either a kdbus handle itself or a unix domain socket. Both are
+                 currently unsupported.
+  -EINVAL        The submitted payload type is KDBUS_PAYLOAD_KERNEL,
+                 KDBUS_MSG_FLAGS_EXPECT_REPLY was set without a timeout value,
+                 KDBUS_MSG_FLAGS_SYNC_REPLY was set without
+                 KDBUS_MSG_FLAGS_EXPECT_REPLY, an invalid item was supplied,
+                 src_id was non-zero and different from the current
+                 connection's ID, a supplied memfd had a size of 0, or a
+                 string was not properly nul-terminated
+  -ENOTUNIQ      KDBUS_MSG_FLAGS_EXPECT_REPLY was set, but the dst_id is set
+                 to KDBUS_DST_ID_BROADCAST
+  -E2BIG         Too many items
+  -EMSGSIZE      A payload vector was too big, and the current user is
+                 unprivileged.
+  -ENOTUNIQ      A fd or memfd payload was passed in a broadcast message, or
+                 a timeout was given for a broadcast message
+  -EEXIST        Multiple KDBUS_ITEM_FDS, KDBUS_ITEM_BLOOM_FILTER or
+                 KDBUS_ITEM_DST_NAME items were supplied
+  -EBADF         A memfd item contained an illegal fd
+  -EMEDIUMTYPE   A file descriptor that is not a kdbus memfd was passed as
+                 KDBUS_MSG_PAYLOAD_MEMFD and was refused.
+  -EMFILE        Too many file descriptors inside a KDBUS_ITEM_FDS
+  -EBADMSG       An item had illegal size, both a dst_id and a
+                 KDBUS_ITEM_DST_NAME were given, or both a name and a bloom
+                 filter were given
+  -ETXTBSY       A kdbus memfd file cannot be sealed or the seal removed,
+                 because it is shared with other processes or still mmap()ed
+  -ECOMM         A peer does not accept the file descriptors addressed to it
+  -EFAULT        The supplied bloom filter size was not 64-bit aligned
+  -EDOM          The supplied bloom filter size did not match the bloom filter
+                 size of the bus
+  -EDESTADDRREQ  dst_id was set to KDBUS_DST_ID_NAME, but no KDBUS_ITEM_DST_NAME
+                 was attached
+  -ESRCH         The name to look up was not found in the name registry
+  -EADDRNOTAVAIL KDBUS_MSG_FLAGS_NO_AUTO_START was given but the destination
+                 connection is an activator.
+  -ENXIO         The passed numeric destination connection ID couldn't be found,
+                 or is not connected
+  -ECONNRESET    The destination connection is no longer active
+  -ETIMEDOUT     Timeout while synchronously waiting for a reply
+  -EINTR         System call interrupted while synchronously waiting for a reply
+  -EPIPE         When sending a message, a synchronous reply from the receiving
+                 connection was expected but the connection died before
+                 answering
+  -ECANCELED     A synchronous message sending was cancelled
+  -ENOBUFS       Too many pending messages on the receiver side
+  -EREMCHG       Both a well-known name and a unique name (ID) were given, but
+                 the name is not currently owned by that connection.
+
+For KDBUS_CMD_MSG_RECV:
+
+  -EINVAL        Invalid flags or offset
+  -EAGAIN        No message found in the queue
+  -ENOMSG        No message of the requested priority found
+
+For KDBUS_CMD_MSG_CANCEL:
+
+  -EINVAL        Invalid flags
+  -ENOENT        Pending message with the supplied cookie not found
+
+For KDBUS_CMD_FREE:
+
+  -ENXIO         No pool slice found at given offset
+  -EINVAL        Invalid flags provided, or the offset is valid but the user is
+                 not allowed to free the slice. This happens, for example, if
+                 the offset was retrieved with KDBUS_RECV_PEEK.
+
+For KDBUS_CMD_NAME_ACQUIRE:
+
+  -EINVAL        Illegal command flags, illegal name provided, or an activator
+                 tried to acquire a second name
+  -EPERM         Policy prohibited name ownership
+  -EALREADY      Connection already owns that name
+  -EEXIST        The name already exists and cannot be taken over
+  -ECONNRESET    The connection was reset during the call
+
+For KDBUS_CMD_NAME_RELEASE:
+
+  -EINVAL        Invalid command flags, or invalid name provided
+  -ESRCH         Name was not found in the registry
+  -EADDRINUSE    Name is owned by a different connection and can't be released
+
+For KDBUS_CMD_NAME_LIST:
+
+  -EINVAL        Invalid flags
+  -ENOBUFS       No available memory in the connection's pool.
+
+For KDBUS_CMD_CONN_INFO:
+
+  -EINVAL        Invalid flags, or neither an ID nor a name was provided,
+                 or the name is invalid.
+  -ESRCH         Connection lookup by name failed
+  -ENXIO         No connection with the provided numeric connection ID found
+
+For KDBUS_CMD_CONN_UPDATE:
+
+  -EINVAL        Illegal flags or items
+  -EOPNOTSUPP    Operation not supported by connection.
+  -E2BIG         Too many policy items attached
+  -EINVAL        Wildcards submitted in policy entries, or illegal sequence
+                 of policy items
+
+For KDBUS_CMD_ENDPOINT_UPDATE:
+
+  -E2BIG         Too many policy items attached
+  -EINVAL        Invalid flags, or wildcards submitted in policy entries,
+                 or illegal sequence of policy items
+
+For KDBUS_CMD_MATCH_ADD:
+
+  -EINVAL        Illegal flags or items
+  -EDOM          Illegal bloom filter size
+  -EMFILE        Too many matches for this connection
+
+For KDBUS_CMD_MATCH_REMOVE:
+
+  -EINVAL        Illegal flags
+  -ENOENT        A match entry with the given cookie could not be found.
--
2.1.2
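
The error list in section 14 uses the kernel-side notation with negative
values. From userspace, a failing ioctl() returns -1 and reports the positive
code in errno. As a rough sketch of how a caller might sort the
KDBUS_CMD_MSG_RECV results listed there, the helper below groups them into a
few outcomes. The enum, the function name and the grouping itself are
hypothetical choices made for illustration; they are not part of the kdbus
API.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical outcome categories for a receive attempt. */
enum demo_recv_status {
	DEMO_RECV_OK,       /* a message was received */
	DEMO_RECV_EMPTY,    /* queue empty, or nothing at the requested priority */
	DEMO_RECV_BAD_ARGS, /* invalid flags or offset were passed */
	DEMO_RECV_OTHER     /* anything else, e.g. ENOMEM or EFAULT */
};

/*
 * ret is the ioctl() return value, err is the value of errno captured
 * immediately after the call. errno values are positive in userspace.
 */
static enum demo_recv_status demo_classify_recv(int ret, int err)
{
	if (ret >= 0)
		return DEMO_RECV_OK;

	/* Catch callers that pass the kernel-style negative value. */
	assert(err > 0);

	switch (err) {
	case EAGAIN: /* no message found in the queue */
	case ENOMSG: /* no message of the requested priority */
		return DEMO_RECV_EMPTY;
	case EINVAL:
		return DEMO_RECV_BAD_ARGS;
	default:
		return DEMO_RECV_OTHER;
	}
}
```

With such a helper, a polling loop could treat DEMO_RECV_EMPTY as "wait and
try again" and stop only on the remaining failure categories.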