On Fri, Nov 10, 2023 at 4:10 PM Nhat Pham <nphamcs@xxxxxxxxx> wrote:
>
> > I notice the bool is between two integers.
> > The mem_cgroup structure has a few bools sprinkled in different locations.
> > Arranging them together might save a few padding bytes. We can also
> > consider using bit fields.
> > It is a very minor point; the condition also exists before your change.
>
> This sounds like an optimization worthy of its own patch. Two random
> thoughts however:

Sure. I consider this a very minor point as well.

> a) Can this be done at the compiler level? I believe you can reduce
> the padding required by sorting the fields of a struct by size, correct?
> That sounds like a job that a compiler should do for us...

According to the C standard, struct members must be laid out in the
order they are declared. Too much code assumes the first member has the
same address as the struct itself. Consider that we also use structs for
DMA descriptors, where the memory layout needs to match the underlying
hardware; reordering the members would be really bad there.

There are gcc extensions to do structure member randomization, but the
randomized layout is determined by the randomization seed. The compiler
does not actually have the flexibility to rearrange the member order to
reduce padding either.

> b) Re: the bitfield idea, some of the fields are CONFIG-dependent (well
> like this one). Might be a bit hairier to do it...

You can declare a bit field under a preprocessor condition as well, just
like a normal declaration. Can you clarify why it would be hairier? A
bit field does not have an address associated with it, so the compiler
can actually move the bits around. You get the compiler to do the
packing for you in this case.
> > > #endif /* _LINUX_ZSWAP_H */
> > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > > index e43b5aba8efc..9cb3ea912cbe 100644
> > > --- a/mm/memcontrol.c
> > > +++ b/mm/memcontrol.c
> > > @@ -5545,6 +5545,11 @@ mem_cgroup_css_alloc(struct cgroup_subsys_state *parent_css)
> > >  	WRITE_ONCE(memcg->soft_limit, PAGE_COUNTER_MAX);
> > >  #if defined(CONFIG_MEMCG_KMEM) && defined(CONFIG_ZSWAP)
> > >  	memcg->zswap_max = PAGE_COUNTER_MAX;
> > > +
> > > +	if (parent)
> > > +		WRITE_ONCE(memcg->zswap_writeback, READ_ONCE(parent->zswap_writeback));
> > > +	else
> > > +		WRITE_ONCE(memcg->zswap_writeback, true);
> >
> > You can combine these two WRITE_ONCE calls into one:
> >
> > bool writeback = !parent || READ_ONCE(parent->zswap_writeback);
> > WRITE_ONCE(memcg->zswap_writeback, writeback);
> >
>
> Yeah I originally did something similar, but then decided to do the if-else
> instead. Honestly no strong preference here - just felt that the if-else was
> cleaner at that moment.

One WRITE_ONCE will produce slightly better machine code, with fewer
memory store instructions. Normally the compiler is allowed to perform
common subexpression elimination and merge the writes. Here, however,
the explicit WRITE_ONCE forces the compiler to emit a store for each
call, so the if-else version has two store instructions while my
suggestion has only one. I agree it is a micro optimization.

Chris