Re: 'simple' futex interface [Was: [PATCH v3 1/4] futex: Implement mechanism to wait on any of several futexes]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



* Peter Zijlstra:

> So how about we introduce new syscalls:
>
>   sys_futex_wait(void *uaddr, unsigned long val, unsigned long flags, ktime_t *timo);
>
>   struct futex_wait {
> 	void *uaddr;
> 	unsigned long val;
> 	unsigned long flags;
>   };
>   sys_futex_waitv(struct futex_wait *waiters, unsigned int nr_waiters,
> 		  unsigned long flags, ktime_t *timo);
>
>   sys_futex_wake(void *uaddr, unsigned int nr, unsigned long flags);
>
>   sys_futex_cmp_requeue(void *uaddr1, void *uaddr2, unsigned int nr_wake,
> 			unsigned int nr_requeue, unsigned long cmpval, unsigned long flags);
>
> Where flags:
>
>   - has 2 bits for size: 8,16,32,64
>   - has 2 more bits for size (requeue) ??
>   - has ... bits for clocks
>   - has private/shared
>   - has numa

What's the actual type of *uaddr?  Does it vary by size (which I assume
is in bits?)?  Are there alignment constraints?

These system calls seemed to be type-polymorphic still, which is
problematic for defining a really nice C interface.  I would really like
to have a strongly typed interface for this, with a nice struct futex
wrapper type (even if it means that we need four of them).

Will all architectures support all sizes?  If not, how do we probe which
size/flags combinations are supported?

> For NUMA I propose that when NUMA_FLAG is set, uaddr-4 will be 'int
> node_id', with the following semantics:
>
>  - on WAIT, node_id is read and when 0 <= node_id <= nr_nodes, is
>    directly used to index into per-node hash-tables. When -1, it is
>    replaced by the current node_id and an smp_mb() is issued before we
>    load and compare the @uaddr.
>
>  - on WAKE/REQUEUE, it is an immediate index.

Does this mean the first waiter determines the NUMA index, and all
future waiters use the same chain even if they are on different nodes?

I think documenting this as a node index would be a mistake.  It could
be an arbitrary hint for locating the corresponding kernel data
structures.

> Any invalid value with result in EINVAL.

Using uaddr-4 is slightly tricky with a 64-bit futex value, due to the
need to maintain alignment and avoid padding.

Thanks,
Florian




[Index of Archives]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Share Photos]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Samba]     [Device Mapper]

  Powered by Linux