Richard Guy Briggs <rgb@xxxxxxxxxx> writes:

> On 14/08/23, Eric W. Biederman wrote:
>> Richard Guy Briggs <rgb@xxxxxxxxxx> writes:
>>
>> > Generate and assign a serial number per namespace instance since boot.
>> >
>> > Use a serial number per namespace (unique across one boot of one kernel)
>> > instead of the inode number (which is claimed to have had the right to change
>> > reserved and is not necessarily unique if there is more than one proc fs) to
>> > uniquely identify it per kernel boot.
>>
>> This approach is just broken.
>>
>> For this to work with migration (aka criu) you need to implement a
>> namespace of namespaces. You haven't done this, and therefore
>> such an interface will break existing userspace.
>>
>> Inside of audit I can understand not caring about these issues,
>> but you go foward and expose these serial numbers in proc,
>> and generally make this infrastructure available to others.
>>
>> The deep issue with migration is that we move tasks from one machine
>> from another and on the destination machine we need to have all of the
>> same global identifiers for software to function properly.
>>
>> My weasel words around the proc inode numbers is to preserve to allow us
>> room to be able to restore those ids if it every becomes relevant for
>> migration.
>
> What do you do if the inode number is already in use on the target
> host?

Since the inode numbers are relative to a superblock or a pid namespace
the numbers that are in use can be restored on the target system by
creating them in the appropriate namespace.

The support does not exist in the kernel today for doing that because
no one has cared but as architected the support can be added if needed
to support migration.

>> That is the proc inode numbers (technically) live in a pid namespace,
>> (aka a mount of proc). So depending on the pid namespace you are in
>> or the mount of proc you look in the numbers could change.
>>
>> Qualifications like that must exist to have a prayer of ever supporting
>> process migration in the crazy corner cases where people start caring
>> about inode numbers.
>>
>> We currently don't and inode numbers for a namespace will never change
>> after a namespace is created. So I think you really are ok using the
>> proc inode numbers. I am happy declaring by fiat that the inode numbers
>> that audit uses are the numbers connected to the initial pid namespace.
>
> But once a namespace/container is migrated, it is a different audit that
> is looking at it (unless we create an audit manager or entity that
> functions at the level of a container manager), so audit should not care.

These numbers were exported to everyone as a general purpose facility
in proc. If audit is global and audit doesn't migrate, you are right,
it doesn't matter. However, if these numbers are used by anyone else
for anything else, it causes a problem.

Further, given that people run entire distributions in containers, we
may reach the point where we wish to run auditd in a container in the
future. I would hate to paint ourselves into a corner with a design
that could never allow audit to migrate. Supporting that case someday
seems a valid naive desire.

>> At a fairly basic level anything that is used to identify namespaces for
>> any general purpose use needs to have most if not all of the same
>> properties of the proc inode numbers. The most important of which is
>> being tied to some context/namespace so there is a ability if we ever
>> need it to migrate those numbers from one machine to another.
>
> Sooo... does it make any sense to have those inode or serial numbers be
> blank inside the namespace/container itself, but only visible to its
> manager outside the container (unless it is the initial namespace)?

Mostly I think it makes sense to use the inode numbers from the initial
pid namespace. They already exist. They already are unique.
(Which means I don't need to maintain more code and more special
cases). And they do what you need now.

I probably haven't followed closely enough, but I don't see what makes
inode numbers undesirable.

Eric
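
For readers following the thread, here is a minimal sketch (not part of
the original message) of reading the namespace identifiers under
discussion. It assumes a Linux system that exposes /proc/<pid>/ns/; the
helper names namespace_ids and same_namespace are hypothetical. The
sketch keeps the device number next to the inode number, reflecting the
point above that an inode number only identifies something relative to
the filesystem (superblock) it was read from.

```python
import os


def namespace_ids(pid="self"):
    """Return {namespace_name: (st_dev, st_ino)} for one process.

    Assumption: a Linux system exposing /proc/<pid>/ns/. If the
    directory is missing or unreadable, an empty dict is returned
    instead of raising.
    """
    ns_dir = os.path.join("/proc", str(pid), "ns")
    ids = {}
    try:
        names = sorted(os.listdir(ns_dir))
    except OSError:
        return ids
    for name in names:
        try:
            # stat() follows the symlink to the namespace's inode.
            st = os.stat(os.path.join(ns_dir, name))
        except OSError:
            continue
        ids[name] = (st.st_dev, st.st_ino)
    return ids


def same_namespace(a, b):
    """Compare (st_dev, st_ino) pairs rather than bare inode numbers."""
    return a == b


if __name__ == "__main__":
    for name, (dev, ino) in namespace_ids().items():
        print(f"{name}: dev={dev} ino={ino}")
```

Comparing the full (device, inode) pair is a deliberate choice in this
sketch: two equal inode numbers taken from different filesystems would
not be treated as the same namespace.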