On Sat, Nov 11, 2023 at 10:22 AM Chris Li <chrisl@xxxxxxxxxx> wrote:
>
> On Fri, Nov 10, 2023 at 4:10 PM Nhat Pham <nphamcs@xxxxxxxxx> wrote:
> > > I notice the bool is between two integers.
> > > The mem_cgroup structure has a few bools sprinkled in different
> > > locations. Arranging them together might save a few padding bytes.
> > > We can also consider using bit fields.
> > > It is a very minor point; the condition also exists before your change.
> >
> > This sounds like an optimization worthy of its own patch. Two random
> > thoughts however:
>
> Sure. I consider this a very minor point as well.
>
> >
> > a) Can this be done at the compiler level? I believe you can reduce
> > the padding required by sorting the fields of a struct by their sizes,
> > correct? That sounds like a job that a compiler should do for us...
>
> According to the C standard, struct members must be laid out in the
> order they are declared. There is too much code that assumes the first
> member has the same address as the struct itself. Consider that we also
> use structs for DMA descriptors, where the memory layout needs to match
> the underlying hardware. Reordering the members would be really bad
> there. There are gcc extensions to do structure member randomization,
> but the randomized layout is determined by the randomization seed. So
> the compiler doesn't actually have the flexibility to rearrange the
> member order to reduce padding either.
>

Ah I see. Yeah, then it might be worth tweaking manually. But yeah, we
should do this separately from this patch.

> >
> > b) Re: the bitfield idea, some of the fields are CONFIG-dependent (well,
> > like this one). Might be a bit hairier to do it...
>
> You can declare the bit field under a preprocessor condition as well,
> just like a normal declaration. Can you clarify why it is hairier?
> A bitfield does not have a pointer address associated with it, so the
> compiler can actually move the bits around. You get the compiler to do
> the packing for you in this case.

I see, hmmm.

> >
> > >
> > > >
> > > >  #endif /* _LINUX_ZSWAP_H */
> > > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > > > index e43b5aba8efc..9cb3ea912cbe 100644
> > > > --- a/mm/memcontrol.c
> > > > +++ b/mm/memcontrol.c
> > > > @@ -5545,6 +5545,11 @@ mem_cgroup_css_alloc(struct cgroup_subsys_state *parent_css)
> > > >         WRITE_ONCE(memcg->soft_limit, PAGE_COUNTER_MAX);
> > > >  #if defined(CONFIG_MEMCG_KMEM) && defined(CONFIG_ZSWAP)
> > > >         memcg->zswap_max = PAGE_COUNTER_MAX;
> > > > +
> > > > +       if (parent)
> > > > +               WRITE_ONCE(memcg->zswap_writeback, READ_ONCE(parent->zswap_writeback));
> > > > +       else
> > > > +               WRITE_ONCE(memcg->zswap_writeback, true);
> > >
> > > You can combine these two WRITE_ONCE calls into one:
> > >
> > > bool writeback = !parent || READ_ONCE(parent->zswap_writeback);
> > > WRITE_ONCE(memcg->zswap_writeback, writeback);
> > >
> >
> > Yeah, I originally did something similar, but then decided to do the
> > if-else instead. Honestly, no strong preference here - I just felt that
> > the if-else was cleaner at that moment.
>
> One WRITE_ONCE will produce slightly better machine code, with fewer
> memory store instructions. Normally the compiler is allowed to do common
> expression elimination and merge the writes. However, here we have
> explicit WRITE_ONCE, so the compiler has to emit two memory store
> instructions, because there are two WRITE_ONCE calls. My suggestion will
> only need one memory store instruction. I agree it is a micro-optimization.
>

Ohh, I did not think about this.
Seems like my original version was more than just a clever one-liner,
haha. It's a bit of a micro-optimization indeed. But if for some reason
I need to send v5 or a fixlet, I'll keep this in mind! Thanks for the
explanation, Chris!

> Chris
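
For anyone following the struct layout discussion above, here is a small,
self-contained sketch of the point (plain userspace C, not kernel code;
the struct and field names are made up for illustration): a bool
sandwiched between two longs costs padding, while grouping the bools or
turning them into single-bit bitfields lets them share space.

#include <stdbool.h>
#include <stdio.h>

/* bool sandwiched between two 8-byte members: 7 bytes of padding
 * follow each bool so the next long stays aligned. */
struct padded {
        unsigned long a;
        bool flag1;
        unsigned long b;
        bool flag2;
};      /* typically 32 bytes on x86-64 */

/* Same members with the bools grouped together: they share one
 * padding region at the end of the struct. */
struct grouped {
        unsigned long a;
        unsigned long b;
        bool flag1;
        bool flag2;
};      /* typically 24 bytes */

/* Single-bit bitfields: the compiler packs both flags into one byte,
 * and neither flag has an address the rest of the code could rely on. */
struct bits {
        unsigned long a;
        unsigned long b;
        bool flag1:1;
        bool flag2:1;
};      /* typically 24 bytes */

int main(void)
{
        printf("padded:  %zu\n", sizeof(struct padded));
        printf("grouped: %zu\n", sizeof(struct grouped));
        printf("bits:    %zu\n", sizeof(struct bits));
        return 0;
}

On a typical x86-64 build this prints 32, 24, and 24, which is the kind
of "few padding bytes" saving discussed above.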