On Sun, May 12, 2019 at 9:22 PM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
>
> On Sun, May 12, 2019 at 09:56:47AM -0500, Shawn Landden wrote:
> > I am trying to implement epochs for pids. For this I need to allow
> > radix tree operations to be specified COW (deletion does not need to
> > change). Radix trees look like they are under a lot of work by you,
> > so how can I best get this feature, and have some code I can work
> > with to write my feature?
>
> Hi Shawn,
>
> I'd love to help, but I don't quite understand what you want.
>
> Here's the conversion of the PID allocator from the IDR to the XArray:
>
> http://git.infradead.org/users/willy/linux-dax.git/commitdiff/223ad3ae5dfffdfc5642b1ce54df2c7836b57ef1
>
> What semantics do you want to change?

When allocating a pid, you pass an epoch number. If the pids being
allocated wrap, then the epoch is incremented, and a new radix tree is
created that is a COW copy of the last epoch. If the page that is found
for allocation is of an older epoch, it is copied and the allocation
only happens in the copy.

On freeing a pid, there is a single radix-tree bit for every
still-active epoch that is set to indicate that this slot has expired.
This will be used for the (new) waitpidv syscall, which can provide all
the functionality of wait4() and more, and allows processes to
synchronize their references to the current epoch.

The current versions of the pid syscalls will continue to operate with
the same existing racy semantics. New pid syscalls will be added that
take an epoch argument. A current pid epoch u32 is added to task_sched;
it is reset on fork(), when a new process and then a new pid are
allocated, and the epoch has a prctl setter and getter.

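To make the allocation side more concrete, here is a rough userspace
sketch, not kernel code. It assumes a one-level "tree" of bitmap pages
in place of a real radix tree, and every name in it (pid_page,
pid_epoch, epoch_fork, pid_alloc_at, pid_in_use) is hypothetical. It
only models the copy-on-write step on allocation; freeing and the
per-epoch expired bits are left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace model, not kernel code. All names are invented. */
#define NPAGES    4   /* leaf pages per epoch view */
#define PAGE_PIDS 64  /* pid slots per leaf page */

struct pid_page {
	uint32_t epoch; /* epoch allowed to modify this page in place */
	uint64_t used;  /* bit i set: slot i is allocated */
};

struct pid_epoch {
	uint32_t epoch;
	struct pid_page *page[NPAGES]; /* NULL: page not populated yet */
};

/* On wrap: the new epoch starts out sharing every page with the old one. */
void epoch_fork(struct pid_epoch *next, const struct pid_epoch *prev)
{
	next->epoch = prev->epoch + 1;
	memcpy(next->page, prev->page, sizeof(next->page));
}

/* Returns 1 if pid is allocated as seen from epoch e, else 0. */
int pid_in_use(const struct pid_epoch *e, unsigned int pid)
{
	const struct pid_page *p;

	assert(pid < NPAGES * PAGE_PIDS);
	p = e->page[pid / PAGE_PIDS];
	return p != NULL && ((p->used >> (pid % PAGE_PIDS)) & 1) != 0;
}

/*
 * Allocate one specific pid in epoch e. If the page holding it still
 * belongs to an older epoch (or does not exist), copy it first so the
 * older epoch's view is left untouched. Returns 0 on success, -1 if the
 * pid is taken or memory runs out. Pages are never freed in this sketch.
 */
int pid_alloc_at(struct pid_epoch *e, unsigned int pid)
{
	uint64_t mask = UINT64_C(1) << (pid % PAGE_PIDS);
	struct pid_page *p, *copy;

	if (pid_in_use(e, pid))
		return -1;
	p = e->page[pid / PAGE_PIDS];
	if (p == NULL || p->epoch != e->epoch) {
		copy = calloc(1, sizeof(*copy));
		if (copy == NULL)
			return -1;
		if (p != NULL)
			copy->used = p->used;
		copy->epoch = e->epoch;
		e->page[pid / PAGE_PIDS] = copy;
		p = copy;
	}
	p->used |= mask;
	return 0;
}
```

After epoch_fork() both epochs point at the same pages; the first
pid_alloc_at() in the new epoch that touches a shared page replaces the
pointer in the new epoch only, so lookups through the old epoch still
see the old contents.
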
If a syscall comes in and the epoch passed is not current AND has
passed the pid of the process (this is not a lock, because the current
and previous epochs are always available), then it might fail with
EEPOCH. The caller then has to call a new syscall,
waitpidv(pidv *pid_t, epoch, O_NONBLOCK), providing a list of pids it
has references to in a specific epoch, and it gets back a list of which
processes have exited.

The epoch of a process is always relative to its pid (not thread-id),
so the same epoch number can mean different things in different places.
The process can then invalidate its own internal pids and use prctl to
indicate it doesn't need the old epoch.

Processes also get a signal if they haven't updated and are 2 full
epochs behind. Being behind should also count against a process in
kernel memory accounting.

I am sure there is much more to consider....
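
For illustration only, the expired-bit bookkeeping and the waitpidv
lookup could be modelled in userspace roughly as below. This is a
sketch under several assumptions that are not part of the proposal: a
flat array stands in for the radix tree, at most 32 epochs are active
at once, epoch numbers do not wrap, and a retired epoch is reported
with a plain -1 rather than a real error code. All names (pid_model,
model_free_pid, model_waitpidv) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model, not kernel code. All names are invented. */
#define MAX_PIDS   128
#define MAX_EPOCHS 32  /* assumed cap on simultaneously active epochs */

struct pid_model {
	uint32_t oldest;   /* oldest epoch some process still references */
	uint32_t current;  /* epoch new pids are allocated in */
	/* bit (epoch % MAX_EPOCHS) set: slot expired as seen from that epoch */
	uint32_t expired[MAX_PIDS];
};

/* On free: mark the slot expired in every still-active epoch. */
void model_free_pid(struct pid_model *m, int pid)
{
	uint32_t e;

	assert(pid >= 0 && pid < MAX_PIDS);
	assert(m->current - m->oldest < MAX_EPOCHS);
	for (e = m->oldest; e <= m->current; e++)
		m->expired[pid] |= UINT32_C(1) << (e % MAX_EPOCHS);
}

/*
 * Non-blocking query: of the n pids the caller holds in the given epoch,
 * write the ones that have exited to exited_out (room for n entries).
 * Returns how many were written, or -1 if the epoch is not active.
 */
long model_waitpidv(const struct pid_model *m, const int *pids, size_t n,
		    uint32_t epoch, int *exited_out)
{
	uint32_t mask = UINT32_C(1) << (epoch % MAX_EPOCHS);
	long count = 0;
	size_t i;

	if (epoch < m->oldest || epoch > m->current)
		return -1;
	for (i = 0; i < n; i++) {
		if (pids[i] < 0 || pids[i] >= MAX_PIDS)
			continue; /* ignore out-of-range pids in this sketch */
		if (m->expired[pids[i]] & mask)
			exited_out[count++] = pids[i];
	}
	return count;
}
```

A caller that is behind would pass the pids it still holds together
with its stale epoch, drop whatever comes back in exited_out, and then
move to the current epoch.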