Hi Alice,

Alice Ryhl <aliceryhl@xxxxxxxxxx> writes:

> On Wed, 3 May 2023 11:07:03 +0200, Andreas Hindborg <a.hindborg@xxxxxxxxxxx> wrote:
>> The kernel `struct spinlock` is 4 bytes on x86 when lockdep is not enabled. The
>> structure is not padded to fit a cache line. The effect of this for `SpinLock`
>> is that the lock variable and the value protected by the lock will share a cache
>> line, depending on the alignment requirements of the protected value. Aligning
>> the lock variable and the protected value to a cache line yields a 20%
>> performance increase for the Rust null block driver for sequential reads to
>> memory backed devices at 6 concurrent readers.
>>
>> Signed-off-by: Andreas Hindborg <a.hindborg@xxxxxxxxxxx>
>
> This applies the cacheline padding to all spinlocks unconditionally.
> It's not clear to me that we want to do that. Instead, I suggest using
> `SpinLock<CachePadded<T>>` in the null block driver to opt-in to the
> cache padding there, and let other drivers choose whether or not they
> want to cache pad their locks.

I was going to write that this is not going to work because the compiler
is going to reorder the fields of `Lock` and put the `data` field first,
followed by the `state` field. But I checked the layout, and it seems
that I actually get the `state` field first (with an alignment of 4), 60
bytes of padding, and then the `data` field (with alignment 64).

I am wondering why the compiler is not reordering these fields? Am I
guaranteed that the fields will not be reordered? Looking at the
definition of `Lock` there does not seem to be anything that prevents
rustc from swapping `state` and `data`.
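
For reference on the reordering question: the default Rust representation
gives no guarantee about field order, while `#[repr(C)]` lays fields out in
declaration order. Below is a minimal userspace sketch of that, not kernel
code. `DemoLock` is a hypothetical stand-in (it is not the definition of
`Lock`), a plain `u32` stands in for the 4-byte `struct spinlock`, and
`CachePadded` is reduced to the alignment attribute only.

```rust
use std::mem::{align_of, size_of};

/// Reduced stand-in for the patch's `CachePadded<T>`: forces 64-byte alignment.
#[repr(align(64))]
struct CachePadded<T> {
    value: T,
}

/// Hypothetical simplified lock, NOT the kernel's `Lock`.
/// `#[repr(C)]` pins the fields to declaration order; without it the
/// compiler is free to pick a different order.
#[repr(C)]
struct DemoLock<T> {
    state: u32, // assumption: stands in for the 4-byte `struct spinlock`
    data: T,
}

type PaddedLock = DemoLock<CachePadded<u64>>;

/// Returns the byte offsets of `state` and `data` inside a `PaddedLock`.
fn field_offsets() -> (usize, usize) {
    let lock: PaddedLock = DemoLock {
        state: 0,
        data: CachePadded { value: 0 },
    };
    let _ = lock.data.value; // read the field so it is not reported as unused
    let base = &lock as *const PaddedLock as usize;
    let state = &lock.state as *const u32 as usize - base;
    let data = &lock.data as *const CachePadded<u64> as usize - base;
    (state, data)
}

fn main() {
    let (state, data) = field_offsets();
    // With `#[repr(C)]`: state at offset 0, data at offset 64.
    println!("state at {state}, data at {data}");
    // The whole struct is 128 bytes with 64-byte alignment.
    println!(
        "size = {}, align = {}",
        size_of::<PaddedLock>(),
        align_of::<PaddedLock>()
    );
}
```

This only shows what `#[repr(C)]` guarantees for the stand-in type. It says
nothing about why rustc happens to keep `state` first in the real `Lock`.
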
>
> On Wed, 3 May 2023 11:07:03 +0200, Andreas Hindborg <a.hindborg@xxxxxxxxxxx> wrote:
>> diff --git a/rust/kernel/cache_padded.rs b/rust/kernel/cache_padded.rs
>> new file mode 100644
>> index 000000000000..758678e71f50
>> --- /dev/null
>> +++ b/rust/kernel/cache_padded.rs
>>
>> +impl<T> CachePadded<T> {
>> +    /// Pads and aligns a value to 64 bytes.
>> +    #[inline(always)]
>> +    pub(crate) const fn new(t: T) -> CachePadded<T> {
>> +        CachePadded::<T> { value: t }
>> +    }
>> +}
>
> Please make this `pub` instead of just `pub(crate)`. Other drivers might
> want to use this directly.

Alright.

>
> On Wed, 3 May 2023 11:07:03 +0200, Andreas Hindborg <a.hindborg@xxxxxxxxxxx> wrote:
>> diff --git a/rust/kernel/sync/lock/spinlock.rs b/rust/kernel/sync/lock/spinlock.rs
>> index 979b56464a4e..e39142a8148c 100644
>> --- a/rust/kernel/sync/lock/spinlock.rs
>> +++ b/rust/kernel/sync/lock/spinlock.rs
>> @@ -100,18 +103,20 @@ unsafe impl super::Backend for SpinLockBackend {
>>      ) {
>>          // SAFETY: The safety requirements ensure that `ptr` is valid for writes, and `name` and
>>          // `key` are valid for read indefinitely.
>> -        unsafe { bindings::__spin_lock_init(ptr, name, key) }
>> +        unsafe { bindings::__spin_lock_init((&mut *ptr).deref_mut(), name, key) }
>>      }
>>
>> +    #[inline(always)]
>>      unsafe fn lock(ptr: *mut Self::State) -> Self::GuardState {
>>          // SAFETY: The safety requirements of this function ensure that `ptr` points to valid
>>          // memory, and that it has been initialised before.
>> -        unsafe { bindings::spin_lock(ptr) }
>> +        unsafe { bindings::spin_lock((&mut *ptr).deref_mut()) }
>>      }
>>
>> +    #[inline(always)]
>>      unsafe fn unlock(ptr: *mut Self::State, _guard_state: &Self::GuardState) {
>>          // SAFETY: The safety requirements of this function ensure that `ptr` is valid and that the
>>          // caller is the owner of the mutex.
>> -        unsafe { bindings::spin_unlock(ptr) }
>> +        unsafe { bindings::spin_unlock((&mut *ptr).deref_mut()) }
>>      }
>>  }
>
> I would prefer to remain in pointer-land for the above operations. I
> think that this leads to code that is more obviously correct.
>
> For example:
>
> ```
> impl<T> CachePadded<T> {
>     pub const fn raw_get(ptr: *mut Self) -> *mut T {
>         core::ptr::addr_of_mut!((*ptr).value)
>     }
> }
>
> #[inline(always)]
> unsafe fn unlock(ptr: *mut Self::State, _guard_state: &Self::GuardState) {
>     unsafe { bindings::spin_unlock(CachePadded::raw_get(ptr)) }
> }
> ```

Got it 👍

BR Andreas