On Mon, 19 Aug 2024, Christian Brauner wrote: > On Mon, Aug 19, 2024 at 03:20:34PM GMT, NeilBrown wrote: > > I wasn't really sure who to send this too, and get_maintainer.pl > > suggested 132 addresses which seemed excessive. So I've focussed on > > 'sched' maintainers. I'll probably submit individual patches to > > relevant maintainers/lists if I get positive feedback at this level. > > > > This series was motivated by > > > > Commit ed0172af5d6f ("SUNRPC: Fix a race to wake a sync task") > > > > which adds smp_mb__after_atomic(). I thought "any API that requires that > > sort of thing needs to be fixed". > > > > The main patches here are 7 and 8 which revise wake_up_bit and > > wake_up_var respectively. They result in 3 interfaces: > > wake_up_{bit,var} includes smp_mb__after_atomic() > > wake_up_{bit,var}_relaxed() doesn't have a barrier > > wake_up_{bit,var}_mb() includes smb_mb(). > > It's great that this api is brought up because it gives me a reason to > ask a stupid question I've had for a while. > > I want to change the i_state member in struct inode from an unsigned > long to a u32 because really we're wasting 4 bytes on 64 bit that we're > never going to use given how little I_* flags we actually have and I > dislike that we use that vacuous type in a bunch of our structures for > that reason. > > (Together with another 4 byte shrinkage we would get a whopping 8 bytes > back.) > > The problem is that we currently use wait_on_bit() and wake_up_bit() in > various places on i_state and all of these functions require an unsigned > long (probably because some architectures only handle atomic ops on > unsigned long). i_state contains two bits that are used for wake_up - I_NEW and I_SYNC, one virtual bit that is used for wake_up - I_DIO_WAKEUP - and 15 others. You could fit those in a short and two bools which gives you three different addresses to pass to wake_up_var(). Doing that would make it a little difficult to test for I_NEW | I_FREEING | I_WILL_FREE but you could probably make an inline that the compile with optimise effectively. Alternately you could union the "u32" with an array for 4 char to give you 4 addresses for wakeup. Both of these would be effective, but a bit hackish. If you want something clean we would need a new interface. Maybe wake_var_bit()/wait_var_bit_event(). It could store the bit in wait_bit_key.bit_nr as "-2-n" or similar. Or better still, add another field: enum { WAIT_BIT, WAIT_VAR, WAIT_VAR_BIT } wait_type to wait_bit_key. I would probably go with the union approach and re-order the bits so that the ones you want to use for wake_up are less than sizeof(u32). NeilBrown