On 1/9/22 7:38 PM, Ming Lei wrote: > On Sun, Jan 09, 2022 at 06:54:21PM -0700, Jens Axboe wrote: >> On 1/9/22 6:50 PM, Ming Lei wrote: >>> Only the last sbitmap_word can have different depth, and all the others >>> must have same depth of 1U << sb->shift, so not necessary to store it in >>> sbitmap_word, and it can be retrieved easily and efficiently by adding >>> one internal helper of __map_depth(sb, index). >>> >>> Not see performance effect when running iops test on null_blk. >>> >>> This way saves us one cacheline(usually 64 words) per each sbitmap_word. >> >> We probably want to kill the ____cacheline_aligned_in_smp from 'word' as >> well. > > But sbitmap_deferred_clear_bit() is called in put fast path, then the > cacheline becomes shared with get path, and I guess this way isn't > expected. Just from 'word', not from 'cleared'. They will still be in separate cache lines, but usually doesn't make sense to have the leading member marked as cacheline aligned, that's a whole struct property at that point. -- Jens Axboe