Re: [PATCHES][RFC] packing struct block_device flags

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Apr 29, 2024 at 07:23:15AM +0200, Christoph Hellwig wrote:
> On Sun, Apr 28, 2024 at 06:12:32AM +0100, Al Viro wrote:
> > 	We have several bool members in struct block_device.
> > It would be nice to pack that stuff, preferably without
> > blowing the cacheline boundaries.
> > 
> > 	That had been suggested a while ago, and initial
> > implementation by Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> ran into
> > objections re atomicity, along with the obvious suggestion to
> > use unsigned long and test_bit/set_bit/clear_bit for access.
> > Unfortunately, that *does* blow the cacheline boundaries.
> > 
> > 	However, it's not hard to do that without bitops;
> > we have an 8-bit assign-once partition number nearby, and
> > folding it into a 32-bit member leaves us up to 24 bits for
> > flags.  Using cmpxchg for setting/clearing flags is not
> > hard, and 32bit cmpxchg is supported everywhere.
> 
> To me this looks pretty complicated and arcane.

cmpxchg for setting/clearing a bit in u32 is hardly arcane,
especially since these bits are rarely modified.  _Reading_
is done on hot paths, but that's cheap.

For more familiar form,
static inline void bdev_set_flag(struct block_device *bdev, int flag)
{
	u32 *p = &bdev->__bd_flags;
        u32 c, old;

	c = *p;
	while ((old = cmpxchg(p, c, c | (1 << (flag + 8)))) != c)
		c = old;
}

to keep it closer to e.g. generic_atomic_or() and its ilk (asm-generic/atomic.h)
or any number of similar places.

FWIW, we could go for atomic_t there and use
	atomic_read() & 0xff
for partno, with atomic_or()/atomic_and() for set/clear and
atomic_read() & constant for test.  That might slightly optimize
set/clear on some architectures, but setting/clearing flags is
nowhere near hot enough for that to make a difference.

> How about we just
> reclaim a little space in the block device and just keep bd_partno
> and add an unsigned long base flags?
> 
> E.g. bd_meta_info is only used in slow path early boot and sysfs code
> and just set for a few partitions.
> 
> Just turn it into a hash/xrray/whatever indexed by bd_dev and we've
> already reclaimed enough space.

The problem is not the total size, though - it's blowing the first
cacheline.  With struct device shoved into that thing, a word or two
in total size is noise; keeping the fields read on the hot paths within
the first 64 bytes is not.




[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux